Posts in Security (20 found)
iDiallo Yesterday

Back button hijacking is going away

When websites are blatantly hostile, users close them and never come back. Have you ever downloaded an app, realized it was deceptive, and deleted it immediately? It's a common occurrence for me. But there is truly hostile software that we still end up using daily. We don't delete those apps because the hostility is far more subtle. It's like the boiling frog: the heat turns up so slowly that the frog enjoys a nice warm bath before it's fully cooked. Clever hostile software introduces one frustrating feature at a time.

Every time I find myself on LinkedIn, it's not out of pleasure. Maybe it's an email about an enticing job. Maybe it's an article someone shared with me. Either way, before I click the link, I have no intention of scrolling through the feed. Yet I end up on it anyway, not because I want to, but because I've been tricked. You see, LinkedIn employs a trick called back button hijacking. You click a LinkedIn URL that a friend shared, read the article, and when you're done, you click the back button expecting to return to whatever app you were on before. But instead of going back, you're still on LinkedIn. Except now, you are on the homepage, where your feed loads with enticing posts that lure you into scrolling. How did that happen? How did you end up on the homepage when you only clicked on a single link? That's back button hijacking.

Here's how it works. When you click the original LinkedIn link, you land on a page and read the article. In the background, LinkedIn secretly gets to work. Using the JavaScript history.replaceState() method, it swaps the page's URL for the homepage URL. That method doesn't add an entry to the browser's history. Then LinkedIn pushes the original URL you landed on back onto the history stack with history.pushState(). This all happens so fast that the user never notices any change in the URL or the page. As far as the browser is concerned, you opened the LinkedIn homepage and then clicked on a post to read it.
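The two calls involved can be sketched in a few lines. This is an illustration only: a tiny stand-in history object (the fakeHistory helper and the example URLs are mine) makes the logic visible outside a browser, where you would instead call window.history.replaceState and window.history.pushState with their usual (state, title, url) arguments.

```javascript
// Sketch of the back-button hijack described above.
function hijackBackButton(history, homepageUrl) {
  const articleUrl = history.current();
  history.replaceState(homepageUrl); // swap the current entry; no new entry added
  history.pushState(articleUrl);     // re-add the article URL on top of the stack
}

// Stand-in for the browser's history stack, for illustration.
function fakeHistory(initialUrl) {
  const stack = [initialUrl];
  let idx = 0;
  return {
    current: () => stack[idx],
    replaceState: (url) => { stack[idx] = url; },
    pushState: (url) => { stack.splice(idx + 1); stack.push(url); idx++; },
    back: () => stack[--idx],
  };
}

const h = fakeHistory("/pulse/some-article");
hijackBackButton(h, "/feed/");
console.log(h.current()); // "/pulse/some-article" — the page you're reading
console.log(h.back());    // "/feed/" — the back button lands on the homepage
```

The user's view never changes, but the browser now believes the homepage came first.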
So when you click the back button, you're taken back to the homepage, the feed loads, and you're presented with the most engaging posts to keep you on the platform. If you spent a few minutes reading the article, you probably won't even remember how you got to the site. So when you click back and see the feed, you won't question it. You'll assume nothing deceptive happened.

While LinkedIn only pushes you one level down in the history stack, more aggressive websites can break the back button entirely. They push a new history state every time you try to go back, effectively trapping you on their site. In those cases, your only option is to close the tab.

I've also seen developers unintentionally break the back button, often when implementing a search feature. On a search box where each keystroke returns a result, an inexperienced developer might push a new history state on every keystroke, intending to let users navigate back to previous search terms. Unfortunately, this creates an excessive number of history entries. If you typed a long search query, you'd have to click the back button for every character (including spaces) just to get back to the previous page. The correct approach is to push a history state only when the user submits or leaves the search box.

As of yesterday, Google announced a new spam policy to address this issue. Their reasoning: people report feeling manipulated and eventually become less willing to visit unfamiliar sites. As we've stated before, inserting deceptive or manipulative pages into a user's browser history has always been against our Google Search Essentials. Any website using these tactics will be demoted in search results: Pages that are engaging in back button hijacking may be subject to manual spam actions or automated demotions, which can impact the site's performance in Google Search results. To give site owners time to make any needed changes, we're publishing this policy two months in advance of enforcement on June 15, 2026.
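The search-box fix described above can be sketched like this. Again a plain array stands in for the browser's history stack (the entries array, URLs, and renderResults placeholder are mine); in a real page onSubmit would call history.pushState and onKeystroke would leave history alone.

```javascript
// Sketch of the fix: no history entry per keystroke; exactly one entry
// when the search is actually submitted.
const entries = ["/products"]; // the page the user was on

function renderResults(query) {
  // placeholder for updating the live result list in the page
  return query;
}

function onKeystroke(query) {
  renderResults(query); // live results, but history is untouched
}

function onSubmit(query) {
  entries.push("/search?q=" + encodeURIComponent(query));
}

// Typing "cats" then submitting adds exactly one entry, not four:
for (const q of ["c", "ca", "cat", "cats"]) onKeystroke(q);
onSubmit("cats");
console.log(entries); // [ '/products', '/search?q=cats' ]
```

One click of the back button now returns the user to the previous page, instead of replaying every keystroke.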
I'm not sure how much search rankings affect LinkedIn specifically, but in the grand scheme of things, this is a welcome change. I hope this practice is abolished entirely.

0 views
Neil Madden Yesterday

Mythos and its impact on security

I’m sure by now you’ve all read the news about Anthropic’s new “Mythos” model and its apparently “dangerous” capabilities in finding security vulnerabilities. I’m sure everyone reading this also has opinions about that. Well, here are a few of mine.

Firstly, it’s tempting to dismiss the announcement as pure marketing hype. Anthropic are rumoured to be approaching an IPO, so obviously a lot of hype is expected, and we’ve seen the “dangerous” card played before with GPT-2. Throughout the history of computer security, both new tools and security researchers themselves have often been branded as dangerous or irresponsible. If you have sympathy for this viewpoint, then I hate to break it to you: Mythos is not inserting vulnerabilities into software; they were there all along. (Vibe-coding notwithstanding).

That’s not to say that Mythos doesn’t represent a potentially interesting breakthrough. (Although apparently many existing small models are able to reproduce its findings, at least in part). And that’s not to say that releasing Mythos would not have some risk: potentially quite a large risk in some cases, and its ability to synthesise actual exploits is concerning. All security tools that find vulnerabilities come with a risk, but they also come with an upside: letting defenders find vulnerabilities too (ideally, first).

Anthropic quote costs of around $10,000–20,000 for each vulnerability they found. You can quibble with those costs, and I’m sure they’ll come down over time, but at the moment I think it’s fair to say that this won’t be run over every single software project out there. If it’s going to be used by bad actors, then it’ll probably still be somewhat targeted at high-impact systems. I’m sure we’ll see some new zero-day exploits of edge devices and probably an uptick in ransomware attacks, but it’s not like edge devices don’t regularly get exploited anyway. (Spoiler: many security products are shockingly poorly designed and implemented).
But on the plus side, I can see Mythos and similar models being an excellent add-on to your annual pentest engagement. At those costs, you’re not going to run it on every build pipeline, and there’s probably going to be a certain amount of expertise required to get the most from it on a limited budget. As with all new tools, eventually the findings will plateau. There’s only so many times you can run the same tool over the same source code and come up with new findings. That’s not to say that there won’t still be vulnerabilities (there almost certainly will), just that the tool will not be able to find them.

As a former AI researcher myself (before modern ML exploded), I find this aspect of the Mythos write-up quite interesting. Most security tools suffer from problems with false positives, and LLMs are of course famous for that: they are “bullshit machines”. Putting it in slightly less pejorative terms, I would call them abduction machines: they generate plausible hypotheses to explain some set of observations. (Training an LLM is induction, but what they do at runtime is closer to abduction). In the case of a chatbot, the “observations” are the token context window, and the hypotheses are the plausible next-token completions. In the case of vulnerability hunting, the observations are the source code and a prompt asking to look for a vulnerability, and the hypotheses are the generated potential vulnerabilities. Despite knowing how this works, it is still kind of magic to me that the latter emerges from the former (plausible vulnerabilities from merely predicting the most likely next tokens given the context).

Broadly speaking, the better the model, the more likely those hypotheses are to be accurate. But they are still wrong an awful lot of the time, and false positives are the death of productivity. We’ve all seen reports of open source projects being overwhelmed by “slop” AI-generated vulnerability reports.
But recently, that seems to have changed, and a larger quantity of high-quality reports are being submitted to many high-profile projects. What changed? I think the clue is front and centre in Anthropic’s write-up: use of AddressSanitizer (ASan) as an oracle to weed out false positives.

I think this is a crucial dividing line that separates successful from unsuccessful uses of AI. This is why “agentic” (grr) AI is relatively successful at software development. The models aren’t inherently much better at writing code than at any other task, but there already exists a large body of automated “bullshit correctors”: type checkers, linters, automated test suites, etc. (Many of which use techniques from earlier waves of “symbolic” AI research, just saying…) These oracles provide a clear signal about whether a hypothesis generated by the LLM is bullshit or not. (I would hypothesise that LLMs are likely to produce better code in languages with more sophisticated type systems). Hence why we see quite a lot of progress and marketing for AI systems in such use-cases, despite those markets being relatively small compared to the AI companies’ massive valuations and funding. I’m guessing investors are not going to be pleased to have stumped up billions for a slice of a dev tools company. But software development does seem somewhat unique in this regard.

Getting back to vulnerability hunting and oracles: this is the same situation that fuzzers face. A fuzzer is generally only really good at finding vulnerabilities when there is a good oracle to decide whether you’ve found one or not. Like Mythos, fuzzers are very good at finding crashes and (via oracles like ASan) memory safety issues, but they are not going to find subtle violations of user expectations. Mythos is clearly more than just a fuzzer though: it’s also looking at the source code and doing somewhat sophisticated “analysis” of potential weaknesses. But I think the problem of needing an oracle will remain.
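The oracle pattern described above can be shown concretely. This is a toy sketch, not anything from Mythos or ASan: the parse function, its invented "crash" condition, and the candidate list all stand in for a real target, a real sanitizer, and real LLM output. The point is only the shape: hypotheses are cheap, and only the ones an executable check confirms survive.

```javascript
// Toy sketch of filtering LLM hypotheses through an executable oracle.

// Invented target: crashes on any input containing "{{".
function parse(input) {
  if (input.includes("{{")) throw new Error("parser crash");
  return input.trim();
}

// Oracle: a candidate is a real finding only if it demonstrably crashes
// the target when actually executed.
function crashOracle(candidate) {
  try {
    parse(candidate);
    return false; // plausible-sounding, but bullshit
  } catch {
    return true;  // confirmed by execution
  }
}

// A mix of genuine findings and false positives, as a model might produce.
const candidates = ["hello", "{{payload", "normal text", "x{{y"];
const confirmed = candidates.filter(crashOracle);
console.log(confirmed); // [ '{{payload', 'x{{y' ]
```

Everything a real system adds (coverage feedback, sanitizers, minimisation) is refinement of this one filtering step: without it, the report queue is all four candidates; with it, only the reproducible two.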
Without an oracle, I’m sure that Mythos would still find genuine vulnerabilities, but they would be overwhelmed by slop false positives, drowning the signal in noise. For me, then, this is the most interesting open question for LLM-based vulnerability finders: which classes of bugs can we write (or train) good oracles for? Potentially quite a lot, but definitely not everything. I think humans will still have an edge in finding complex bugs for a long time to come.

Obviously Anthropic’s take is that you should use AI tools to find all the bugs first. You could dismiss that as an obvious attempt to cash in, and it even has shades of a protection racket. But I do think that Mythos and models like it are probably worth using, as an add-on to a human penetration test or similar.

But really, Mythos is just the latest tool revealing the continuing poor state of software security. Such tools continue to find a frightening array of vulnerabilities because there is a terrifying quantity of them out there, and we keep adding more. The situation is not really going to be improved by throwing more AI at the problem. If anything, the rise of vibe-coding is likely to be reinforcing this trend. As I’ve covered here before, even apparent experts write total garbage security services when assisted by an LLM. If you want secure software, you need to slow down and think carefully and deeply, not rush ever faster to churn out more and more junk software.

In the short term, we can keep doing the things we know how to do: thinking about security earlier in the design process, incorporating basic security tools and testing into build pipelines, ensuring we can patch CVEs quickly (but not too quickly), etc. Longer-term, we know how to solve some classes of vulnerabilities altogether. For example, we know that memory-safe programming languages eliminate whole swathes of potential issues, including many of the sort that Mythos is good at discovering.
We’ve known this for decades but still write lots of software in unsafe languages. Numerous reports and government proclamations are slowly shifting that, but we still have a very long way to go. Capability-based security would solve many other classes of vulnerabilities, whether in CPUs, operating systems, or supply chains. These are not easy fixes, and they would require massive investment over many years. Profit-driven companies are not going to pursue them without regulatory pressure, and are largely going in the opposite direction. Such fundamental changes would not solve everything: there will still be vulnerabilities, but fewer and less severe ones if we do it right. Strong security mechanisms can provide a multiplicative reduction in risk. But if we really do want more secure software, and a foundation for our digital society that we can actually trust, then I don’t see an alternative. Finding and fixing individual vulnerabilities will never deliver that, however good the tools get.

0 views
Daniel Mangum 5 days ago

PSA Crypto: The P is for Portability

Arm’s Platform Security Architecture (PSA) was released in 2017, but it was two years until the first beta release of the PSA Cryptography API in 2019, and another year until the 1.0 specification in 2020. Aimed at securing connected devices and originally targeting only Arm-based systems, PSA has evolved with the donation of the PSA Certified program to GlobalPlatform in 2025, allowing non-Arm devices, such as popular RISC-V microcontrollers (MCUs), to achieve certification.

0 views

Has Mythos just broken the deal that kept the internet safe?

For nearly 20 years the deal has been simple: you click a link, arbitrary code runs on your device, and a stack of sandboxes keeps that code from doing anything nasty. Browser sandboxes for untrusted JavaScript, VM sandboxes for multi-tenant cloud, ad iframes so banner creatives can't take over your phone or laptop - the modern internet is built on the assumption that those sandboxes hold. Anthropic just shipped a research preview that generates working exploits for one of them 72.4% of the time, up from under 1% a few months ago. That deal might be breaking.

From what I've read, Mythos is a very large model. Rumours have pointed to it being similar in size to the short-lived (and very underwhelming) GPT-4.5. As such, I'm with a lot of commentators in thinking that a primary reason this hasn't been rolled out further is compute. Anthropic is probably the most compute-starved major AI lab right now, and I strongly suspect they do not have the compute to roll this out more broadly even if they wanted to. From leaked pricing, it's expensive as well - at $125/MTok output (5x more than Opus, which is itself the most expensive model out there).

One thing that has really been overlooked, with all the focus on frontier-scale models, is how quickly improvements in the huge models are being matched by far smaller models. I've spent a lot of time with the Gemma 4 open-weights model, and it is incredibly impressive for a model that is ~50x smaller than the frontier models. So I have no doubt that whatever capabilities Mythos has will relatively quickly be available in smaller, and thus easier to serve, models. And even if Mythos' huge size somehow is intrinsic to its abilities (I very much doubt this, given current progress in scaling smaller models), it's only a matter of time before newer chips [1] are able to serve it en masse. It's important to look to where the puck is going. As I've written before, LLMs in my opinion pose an extremely serious cybersecurity risk.
Fundamentally, we are seeing a radical change in how easy it is to find (and thus exploit) serious flaws and bugs in software for nefarious purposes.

To back up a step, it's important to understand how modern cybersecurity is currently achieved. One of the most important concepts is that of a sandbox. Nearly every electronic device you touch day to day has one (or many) layers of these to protect the system. In short, a sandbox is a so-called 'virtualised' environment where software can execute on the system, but with limited permissions, segregated from other software, with a very strong boundary that prevents the software from 'breaking out' of the sandbox.

If you're reading this on a modern smartphone, you have at least 3 layers of sandboxing between this page and your phone's operating system. First, your browser has (at least) two levels of sandboxing. One is for the JavaScript execution environment (which runs the interactive code on websites). This is then wrapped by the browser sandbox, which limits what the site as a whole can do. Finally, iOS or Android has an app sandbox which limits what the browser as a whole can do. This defence in depth is absolutely fundamental to modern information security, especially in allowing users to browse "untrusted" websites with any level of safety. For a malicious website to gain control over your device, it needs to chain together multiple vulnerabilities, all at the same time. In reality this is extremely hard to do (and these kinds of chains fetch millions of dollars on the grey market).

Guess what? According to Anthropic, Mythos Preview successfully generates a working exploit for Firefox's JS shell in 72.4% of trials. Opus 4.6 managed this in under 1% of trials in a previous evaluation.

Worth flagging a couple of caveats. The JS shell here is Firefox's standalone SpiderMonkey shell - so this is escaping the innermost sandbox layer, not the full browser chain (the renderer process and OS app sandbox still sit on top).
And it's Anthropic's own benchmark, not an independent one. But even hedging both of those, the trajectory is what matters - we're going from "effectively zero" to "72.4% of the time" in one model generation, on a real-world target rather than a toy CTF.

This is pretty terrifying if you understand the implications. If an LLM can find exploits in sandboxes - which are some of the most well-secured pieces of software on the planet - then suddenly every website you aimlessly browse through could contain malicious code which can 'escape' the sandbox and theoretically take control of your device - and all the data on your phone could be sent to someone nasty. These attacks are so dangerous because the internet is built around sandboxes being safe. For example, each banner ad your browser loads runs in a separate sandboxed environment. This means ads can run a huge amount of (mostly) untested code, with everyone relying on the browser sandbox to protect them. If that sandbox falls, then suddenly a malicious ad campaign can take over millions of devices in hours.

Equally, sandboxes (and virtualisation) are fundamental to allowing cloud computing to operate at scale. Most server workloads these days are not running directly against physical hardware. Instead, AWS et al take the physical hardware and "slice" it up into so-called "virtual" servers, selling each slice to different customers. This allows many more applications to run on a single server - and enables some pretty nice profit margins for the companies involved. This operates on roughly the same model as your phone, with various layers to protect customers from accessing each other's data and (more importantly) from accessing the control plane of AWS. So, we have a very, very big problem if these sandboxes fail, and all fingers point towards this being the case this year.
I should tone down the disaster porn slightly - there have been many sandbox escapes before that haven't caused chaos - but I have a strong feeling that this time is going to be difficult. And to be clear, when just AWS us-east-1 goes down (which it has done many, many times) it is front-page news globally and tends to cause significant disruption to day-to-day life. That is just one of AWS's regions - if a malicious actor were able to take control of the AWS control plane, it's likely they'd be able to take all regions simultaneously, and recovery would be infinitely harder with a bad actor in charge, as opposed to the internal faults behind previous outages - which were themselves extremely difficult to recover from in a timely way.

Given all this, it's understandable that Anthropic are being cautious about releasing this into the wild. The issue, though, is that the cat is out of the bag. Even if Anthropic pulled a Miles Dyson and lowered their model code into a pit of molten lava, someone else is going to scale an RL model and release it. The incentives are far, far too high, and the prisoner's dilemma strikes again.

The current status quo seems to be that these next-generation models will be released to a select group of cybersecurity professionals and related organisations, so they can fix things as much as possible to give them a head start. Perhaps this is the best that can be done, but it seems to me to be a repeat of the famous "security through obscurity" approach, which has become a meme in itself in the information security world. It also seems far-fetched to me that the organisations who do have access are going to find even most of the critical problems in a limited time window.

And that brings me to my final point. While Anthropic are providing $100m of credit and $4m of 'direct cash donations' to open source projects, it's not all open source projects.
There are a lot of open source projects that everyone relies on without realising. While the obvious ones like the Linux kernel are getting this "access" ahead of time, there are literally millions of pieces of open source software (never mind commercial software) that are essential to the operation of a substantial minority of systems. I'm not quite sure where the plan leaves those.

Perhaps this is just another round in the cat-and-mouse cycle that reaches a mostly stable equilibrium, and at worst we get some short-term disruption. But if I step back and look at how fast the industry has moved over the past few years - I'm not so sure. And one thing I think is for certain: it looks like we do now have the fabled superhuman ability in at least one domain. I don't think it's the last.

[1] Albeit at the cost of adding yet more pressure onto the compute crunch the AI industry is experiencing ↩︎

0 views
Martin Fowler 6 days ago

Fragments: April 9

I mostly link to written material here, but I’ve recently listened to two excellent podcasts that I can recommend. Anyone who regularly reads these fragments knows that I’m a big fan of Simon Willison; his (also very fragmentary) posts have earned a regular spot in my RSS reader. But the problem with fragments, however valuable, is that they don’t provide a cohesive overview of the situation. So his podcast with Lenny Rachitsky is a welcome survey of the state of the world as seen through a discerning pair of eyeballs. He paints a good picture of how programming has changed for him since the “November inflection point”, important patterns for this work, and his concern about the security bomb nestled inside the beast.

My other great listening was on a regular podcast that I listen to, as Gergely Orosz interviewed Thuan Pham, the former CTO of Uber. As with so many of Gergely’s podcasts, they focused on Thuan Pham’s fascinating career direction, giving listeners an opportunity to learn from a successful professional. There’s also an informative insight into Uber’s use of microservices (they had 5000 of them), and the way high-growth software necessarily gets rewritten a lot (a phenomenon I dubbed Sacrificial Architecture).

❄                ❄                ❄                ❄                ❄

Axios published their post-mortem on their recent supply chain compromise. It’s quite a story: the attackers spent a couple of weeks developing contact with the lead maintainer, leading to a video call where the meeting software indicated something on the maintainer’s system was out of date. That led to the maintainer installing the update, which in fact was a Remote Access Trojan (RAT). In the maintainer’s words, they tailored this process specifically to me by doing the following:

they reached out masquerading as the founder of a company they had cloned the companys founders likeness as well as the company itself. they then invited me to a real slack workspace. this workspace was branded to the companies ci and named in a plausible manner. the slack was thought out very well, they had channels where they were sharing linked-in posts, the linked in posts i presume just went to the real companys account but it was super convincing etc. they even had what i presume were fake profiles of the team of the company but also number of other oss maintainers. they scheduled a meeting with me to connect. the meeting was on ms teams. the meeting had what seemed to be a group of people that were involved. the meeting said something on my system was out of date. i installed the missing item as i presumed it was something to do with teams, and this was the RAT. everything was extremely well co-ordinated looked legit and was done in a professional manner.

Simon Willison has a summary and further links.

❄                ❄                ❄                ❄                ❄

I recently bumped into Diátaxis, a framework for organizing technical documentation.
I only looked at it briefly, but there’s much to like. In particular I appreciated how it classified four forms of documentation:

Tutorials: to learn how to use the product
How-to guides: for users to follow to achieve particular goals with the product
Reference: to describe what the product does
Explanations: background and context to educate the user on the product’s rationale

The distinction between tutorials and how-to guides is interesting:

A tutorial serves the needs of the user who is at study. Its obligation is to provide a successful learning experience. A how-to guide serves the needs of the user who is at work. Its obligation is to help the user accomplish a task.

I also appreciated its point of pulling explanations out into separate areas. The idea is that other forms should contain only minimal explanations, linking to the explanation material for more depth. That way we keep the focus on the goal and allow the user to seek deeper explanations in their own way. The study/work distinction between explanation and reference mirrors that same distinction between tutorials and how-to guides.

❄                ❄                ❄                ❄                ❄

For eight years, Lalit Maganti wanted a set of tools for working with SQLite. But it would be hard and tedious work, “getting into the weeds of SQLite source code, a fiendishly difficult codebase to understand”. So he didn’t try it. But after the November inflection point, he decided to tackle this need. His account of this exercise is an excellent description of the benefits and perils of developing with AI agents:

Through most of January, I iterated, acting as semi-technical manager and delegating almost all the design and all the implementation to Claude. Functionally, I ended up in a reasonable place: a parser in C extracted from SQLite sources using a bunch of Python scripts, a formatter built on top, support for both the SQLite language and the PerfettoSQL extensions, all exposed in a web playground. But when I reviewed the codebase in detail in late January, the downside was obvious: the codebase was complete spaghetti.
I didn’t understand large parts of the Python source extraction pipeline, functions were scattered in random files without a clear shape, and a few files had grown to several thousand lines. It was extremely fragile; it solved the immediate problem but it was never going to cope with my larger vision, never mind integrating it into the Perfetto tools. The saving grace was that it had proved the approach was viable and generated more than 500 tests, many of which I felt I could reuse.

He threw it all away and worked more closely with the AI on the second attempt, with lots of thinking about the design, reviewing all the code, and refactoring with every step:

In the rewrite, refactoring became the core of my workflow. After every large batch of generated code, I’d step back and ask “is this ugly?” Sometimes AI could clean it up. Other times there was a large-scale abstraction that AI couldn’t see but I could; I’d give it the direction and let it execute. If you have taste, the cost of a wrong approach drops dramatically because you can restructure quickly.

He ended up with a working system, and the AI proved its value in allowing him to tackle something that he’d been leaving on the todo pile for years. But even with the rewrite, the AI had its potholes. His conclusion on the relative value of AI in different scenarios:

When I was working on something I already understood deeply, AI was excellent…. When I was working on something I could describe but didn’t yet know, AI was good but required more care…. When I was working on something where I didn’t even know what I wanted, AI was somewhere between unhelpful and harmful…

At the heart of this is that AI works at its best when there is an objectively checkable answer. If we want an implementation that can pass some tests, then AI does a good job.
But when it came to the public API:

I spent several days in early March doing nothing but API refactoring, manually fixing things any experienced engineer would have instinctively avoided but AI made a total mess of.

There’s no test or objective metric for “is this API pleasant to use” and “will this API help users solve the problems they have”, and that’s exactly why the coding agents did so badly at it.

❄                ❄                ❄                ❄                ❄

I became familiar with Ryan Avent’s writing when he wrote the Free Exchange column for The Economist. His recent post talks about how James Talarico and Zohran Mamdani have made their religion an important part of their electoral appeal, and their faith is centered on caring for others. He explains that a focus on care leads to an important perspective on economic growth:

The first thing to understand is that we should not want growth for its own sake. What is good about growth is that it expands our collective capacities: we come to know more and we are able to do more. This, in turn, allows us to alleviate suffering, to discover more things about the universe, and to spend more time being complete people.

0 views
neilzone 1 week ago

Thoughts on increasing ssh security using a hardware security key

I have been using hardware security keys (including YubiKeys and Titan keys) for FIDO2 and TOTP for a while, but not for ssh. At the moment, I harden the ssh config on my servers, lock down access by IP address, and use password-protected certificates for authentication, blocking password-based authentication. So I think that I do at least reasonably well as it is. But I was interested to see if I could introduce a further aspect of security for ssh, using a security key.

My security keys support the generation of both resident and non-resident keys. Resident keys are stored in a slot on the YubiKey, while non-resident keys are stored on the client computer, but require the YubiKey. I picked non-resident. I set a passphrase as part of the ssh-keygen process, so, when it comes to using that key, I need to enter that passphrase and insert and touch the security key. So now someone would need:

to be connected to the correct network
to have a copy of my private key
to know the passphrase for that private key
to have one of my security keys (my main security key, and my backup security key)

I can, I think, add a PIN to the YubiKey but, to date, I have not done this. Perhaps I should. Honestly, I was probably fine without this, but, well, I had the security keys, so why not.

But, while this works fine from my laptop, I can’t get it to work on my phone (GrapheneOS). At the moment, I use Termux, and from there, I can ssh in to my servers. But I can’t get Termux to use my *-sk keypair. There is a six year old issue in the Termux GitHub repo which indicates that it might, at some point, be coming, and that would be welcome. Apparently it can be done using a closed source tool, but since I’m only looking to use FOSS, that’s not on the cards for me. So that is a bit of a pain, as it is convenient to be able to log in from my phone from time to time.

0 views
Kev Quirk 1 week ago

Obfuscating My Contact Email

I stumbled across this great post by Spencer Mortensen yesterday, which tested different email obfuscation techniques against real spambots to see which ones actually work. It's a fascinating read, and I'd recommend checking it out if you're into that sort of thing. The short version is that spambots scrape your HTML looking for email addresses. If your address is sitting there in plain text, they'll hoover it up. But if you encode each character as an HTML entity, the browser still renders and uses it correctly, while most bots haven't got a clue what they're looking at. From Spencer's testing, this approach blocks around 95% of harvesters, which is good enough for me. On this site, my contact email shows up in two places:

- The Reply by email button at the bottom of every post.
- My contact page.

Both pull from the value in Pure Blog's config, so I only needed to make a couple of changes. The reply button lives in , which is obviously a PHP file. So the fix there was straightforward - I ditched the shortcode and used PHP directly to encode the address character by character into HTML entities: Each character becomes something like , which is gibberish to a bot, but perfectly readable to a human using a browser. The shortcode still gets replaced normally by Pure Blog after the PHP runs, so the subject line still works as expected. The contact page is a normal page in Pure Blog, so it's Markdown under the hood. This means I can't drop PHP into it. Instead, I used Pure Blog's hook , which runs after shortcodes have already been processed. By that point, has been replaced with the plain email address, so all I needed to do was swap it for the encoded version: This goes in , and now any page content that passes through Pure Blog's function will have the email automatically encoded. So if I decide to publish my elsewhere, it should automagically work. As well as the obfuscation, I also set up my email address as a proper alias rather than relying on a catch-all to segregate emails.
That way, if spam does somehow get through, I can nuke the alias, create a new one, and update it in Pure Blog's settings page. Is this overkill? Probably. But it was a fun little rabbit hole, and now I can feel smug about it. 🙃 Thanks for reading this post via RSS. RSS is ace, and so are you. ❤️ You can reply to this post by email, or leave a comment.
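The post's PHP snippets were stripped from this view. As a rough sketch of the same character-by-character entity encoding, here is a shell rendition (the function name `encode_email` is mine, not the author's):

```shell
# Turn each character of an address into a decimal HTML entity
# (&#NN;). Browsers render the entities back into the address;
# most scrapers looking for plain-text emails miss it.
encode_email() {
  local s=$1 out='' i ch
  for ((i = 0; i < ${#s}; i++)); do
    ch=${s:i:1}
    # "'$ch" asks printf for the character's code point
    printf -v out '%s&#%d;' "$out" "'$ch"
  done
  printf '%s\n' "$out"
}

encode_email 'a@b'   # prints &#97;&#64;&#98;
```

Dropped into an HTML attribute like `href="mailto:…"`, the encoded form still works as a normal mailto link for the reader.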

0 views
Simon Willison 1 week ago

Anthropic's Project Glasswing - restricting Claude Mythos to security researchers - sounds necessary to me

Anthropic didn't release their latest model, Claude Mythos (system card PDF), today. They have instead made it available to a very restricted set of preview partners under their newly announced Project Glasswing. The model is a general purpose model, similar to Claude Opus 4.6, but Anthropic claim that its cyber-security research abilities are strong enough that they need to give the software industry as a whole time to prepare. Mythos Preview has already found thousands of high-severity vulnerabilities, including some in every major operating system and web browser. Given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely. Project Glasswing partners will receive access to Claude Mythos Preview to find and fix vulnerabilities or weaknesses in their foundational systems—systems that represent a very large portion of the world’s shared cyberattack surface. We anticipate this work will focus on tasks like local vulnerability detection, black box testing of binaries, securing endpoints, and penetration testing of systems. There's a great deal more technical detail in Assessing Claude Mythos Preview’s cybersecurity capabilities on the Anthropic Red Team blog: In one case, Mythos Preview wrote a web browser exploit that chained together four vulnerabilities, writing a complex JIT heap spray that escaped both renderer and OS sandboxes. It autonomously obtained local privilege escalation exploits on Linux and other operating systems by exploiting subtle race conditions and KASLR-bypasses. And it autonomously wrote a remote code execution exploit on FreeBSD's NFS server that granted full root access to unauthenticated users by splitting a 20-gadget ROP chain over multiple packets. Plus this comparison with Claude 4.6 Opus: Our internal evaluations showed that Opus 4.6 generally had a near-0% success rate at autonomous exploit development.
But Mythos Preview is in a different league. For example, Opus 4.6 turned the vulnerabilities it had found in Mozilla’s Firefox 147 JavaScript engine—all patched in Firefox 148—into JavaScript shell exploits only two times out of several hundred attempts. We re-ran this experiment as a benchmark for Mythos Preview, which developed working exploits 181 times, and achieved register control on 29 more. Saying "our model is too dangerous to release" is a great way to build buzz around a new model, but in this case I expect their caution is warranted. Just a few days ago (last Friday) I started a new ai-security-research tag on this blog to acknowledge an uptick in credible security professionals sounding the alarm on how good modern LLMs have got at vulnerability research. Greg Kroah-Hartman of the Linux kernel: Months ago, we were getting what we called 'AI slop,' AI-generated security reports that were obviously wrong or low quality. It was kind of funny. It didn't really worry us. Something happened a month ago, and the world switched. Now we have real reports. All open source projects have real reports that are made with AI, but they're good, and they're real. Daniel Stenberg of curl: The challenge with AI in open source security has transitioned from an AI slop tsunami into more of a ... plain security report tsunami. Less slop but lots of reports. Many of them really good. I'm spending hours per day on this now. It's intense. And Thomas Ptacek published Vulnerability Research Is Cooked, a post inspired by his podcast conversation with Anthropic's Nicholas Carlini. Anthropic have a 5-minute talking-heads video describing the Glasswing project. Nicholas Carlini appears as one of those talking heads, where he said (highlights mine): It has the ability to chain together vulnerabilities. So what this means is you find two vulnerabilities, either of which doesn't really get you very much independently.
But this model is able to create exploits out of three, four, or sometimes five vulnerabilities that in sequence give you some kind of very sophisticated end outcome. [...] I've found more bugs in the last couple of weeks than I found in the rest of my life combined. We've used the model to scan a bunch of open source code, and the thing that we went for first was operating systems, because this is the code that underlies the entire internet infrastructure. For OpenBSD, we found a bug that's been present for 27 years, where I can send a couple of pieces of data to any OpenBSD server and crash it. On Linux, we found a number of vulnerabilities where as a user with no permissions, I can elevate myself to the administrator by just running some binary on my machine. For each of these bugs, we told the maintainers who actually run the software about them, and they went and fixed them and have deployed the patches so that anyone who runs the software is no longer vulnerable to these attacks. I found this on the OpenBSD 7.8 errata page:

025: RELIABILITY FIX: March 25, 2026
All architectures
TCP packets with invalid SACK options could crash the kernel.
A source code patch exists which remedies this problem.

I tracked that change down in the GitHub mirror of the OpenBSD CVS repo (apparently they still use CVS!) and found it using git blame: Sure enough, the surrounding code is from 27 years ago. I'm not sure which Linux vulnerability Nicholas was describing, but it may have been this NFS one recently covered by Michael Lynch. There's enough smoke here that I believe there's a fire. It's not surprising to find vulnerabilities in decades-old software, especially given that they're mostly written in C, but what's new is that coding agents run by the latest frontier LLMs are proving tirelessly capable at digging up these issues.
I actually thought to myself on Friday that this sounded like an industry-wide reckoning in the making, and that it might warrant a huge investment of time and money to get ahead of the inevitable barrage of vulnerabilities. Project Glasswing incorporates "$100M in usage credits ... as well as $4M in direct donations to open-source security organizations". Partners include AWS, Apple, Microsoft, Google, and the Linux Foundation. It would be great to see OpenAI involved as well - GPT-5.4 already has a strong reputation for finding security vulnerabilities and they have stronger models on the near horizon. The bad news for those of us who are not trusted partners is this: We do not plan to make Claude Mythos Preview generally available, but our eventual goal is to enable our users to safely deploy Mythos-class models at scale—for cybersecurity purposes, but also for the myriad other benefits that such highly capable models will bring. To do so, we need to make progress in developing cybersecurity (and other) safeguards that detect and block the model’s most dangerous outputs. We plan to launch new safeguards with an upcoming Claude Opus model, allowing us to improve and refine them with a model that does not pose the same level of risk as Mythos Preview. I can live with that. I think the security risks really are credible here, and having extra time for trusted teams to get ahead of them is a reasonable trade-off. You are only seeing the long-form articles from my blog. Subscribe to /atom/everything/ to get all of my posts, or take a look at my other subscription options .

0 views

Russia Hacked Routers to Steal Microsoft Office Tokens

Hackers linked to Russia’s military intelligence units are using known flaws in older Internet routers to mass harvest authentication tokens from Microsoft Office users, security experts warned today. The spying campaign allowed state-backed Russian hackers to quietly siphon authentication tokens from users on more than 18,000 networks without deploying any malicious software or code. Microsoft said in a blog post today it identified more than 200 organizations and 5,000 consumer devices that were caught up in a stealthy but remarkably simple spying network built by a Russia-backed threat actor known as “ Forest Blizzard .” How targeted DNS requests were redirected at the router. Image: Black Lotus Labs. Also known as APT28 and Fancy Bear, Forest Blizzard is attributed to the military intelligence units within Russia’s General Staff Main Intelligence Directorate (GRU). APT 28 famously compromised the Hillary Clinton campaign, the Democratic National Committee, and the Democratic Congressional Campaign Committee in 2016 in an attempt to interfere with the U.S. presidential election. Researchers at Black Lotus Labs , a security division of the Internet backbone provider Lumen , found that at the peak of its activity in December 2025, Forest Blizzard’s surveillance dragnet ensnared more than 18,000 Internet routers that were mostly unsupported, end-of-life routers, or else far behind on security updates. A new report from Lumen says the hackers primarily targeted government agencies—including ministries of foreign affairs, law enforcement, and third-party email providers. Black Lotus Security Engineer Ryan English said the GRU hackers did not need to install malware on the targeted routers, which were mainly older Mikrotik and TP-Link devices marketed to the Small Office/Home Office (SOHO) market. Instead, they used known vulnerabilities to modify the Domain Name System (DNS) settings of the routers to include DNS servers controlled by the hackers. 
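The attack boils down to the router handing out attacker-controlled resolvers via DHCP. One hypothetical way to spot this class of tampering (a sketch of mine, not from the report) is to compare the answers you get from the router-advertised resolver against a known public one:

```shell
# Hypothetical detection sketch: a router with hijacked DNS settings
# serves attacker-controlled answers, so compare the answer set from
# the default (router-advertised) resolver with a known public one.
resolvers_agree() {
  # $1 and $2 are whitespace-separated answer lists, e.g. the
  # output of `dig +short`; order-insensitive comparison.
  [ "$(printf '%s\n' $1 | sort)" = "$(printf '%s\n' $2 | sort)" ]
}

# Usage (not run here; needs network access):
#   resolvers_agree "$(dig +short outlook.office.com)" \
#                   "$(dig +short outlook.office.com @1.1.1.1)"
```

A mismatch is not proof of compromise (CDNs legitimately vary answers by resolver), but a persistent redirect of login domains to unfamiliar addresses is exactly the symptom described here.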
As the U.K.’s National Cyber Security Centre (NCSC) notes in a new advisory detailing how Russian cyber actors have been compromising routers, DNS is what allows individuals to reach websites by typing familiar addresses, instead of associated IP addresses. In a DNS hijacking attack, bad actors interfere with this process to covertly send users to malicious websites designed to steal login details or other sensitive information. English said the routers attacked by Forest Blizzard were reconfigured to use DNS servers that pointed to a handful of virtual private servers controlled by the attackers. Importantly, the attackers could then propagate their malicious DNS settings to all users on the local network, and from that point forward intercept any OAuth authentication tokens transmitted by those users. DNS hijacking through router compromise. Image: Microsoft. Because those tokens are typically transmitted only after the user has successfully logged in and gone through multi-factor authentication, the attackers could gain direct access to victim accounts without ever having to phish each user’s credentials and/or one-time codes. “Everyone is looking for some sophisticated malware to drop something on your mobile devices or something,” English said. “These guys didn’t use malware. 
They did this in an old-school, graybeard way that isn’t really sexy but it gets the job done.” Microsoft refers to the Forest Blizzard activity as using DNS hijacking “to support post-compromise adversary-in-the-middle (AiTM) attacks on Transport Layer Security (TLS) connections against Microsoft Outlook on the web domains.” The software giant said while targeting SOHO devices isn’t a new tactic, this is the first time Microsoft has seen Forest Blizzard using “DNS hijacking at scale to support AiTM of TLS connections after exploiting edge devices.” Black Lotus Labs engineer Danny Adamitis said it will be interesting to see how Forest Blizzard reacts to today’s flurry of attention to their espionage operation, noting that the group immediately switched up its tactics in response to a similar NCSC report (PDF) in August 2025. At the time, Forest Blizzard was using malware to control a far more targeted and smaller group of compromised routers. But Adamitis said the day after the NCSC report, the group quickly ditched the malware approach in favor of mass-altering the DNS settings on thousands of vulnerable routers. “Before the last NCSC report came out they used this capability in very limited instances,” Adamitis told KrebsOnSecurity. “After the report was released they implemented the capability in a more systemic fashion and used it to target everything that was vulnerable.” TP-Link was among the router makers facing a complete ban in the United States. But on March 23, the U.S. Federal Communications Commission (FCC) took a much broader approach, announcing it would no longer certify consumer-grade Internet routers that are produced outside of the United States. The FCC warned that foreign-made routers had become an untenable national security threat, and that poorly-secured routers present “a severe cybersecurity risk that could be leveraged to immediately and severely disrupt U.S. critical infrastructure and directly harm U.S.
persons.” Experts have countered that few new consumer-grade routers would be available for purchase under this new FCC policy (besides maybe Musk’s Starlink satellite Internet routers, which are produced in Texas). The FCC says router makers can apply for a special “conditional approval” from the Department of War or Department of Homeland Security, and that the new policy does not affect any previously-purchased consumer-grade routers.

0 views
Danny McClelland 1 week ago

How I use VeraCrypt to keep my data secure

I’ve been using VeraCrypt for encrypted vaults for a while now. I mount and dismount vaults multiple times a day, and typing out the full command each time gets old fast. There’s nothing wrong with the CLI, it’s just repetitive, and repetitive is what aliases are for. The GUI exists, but I spend most of my time in a terminal and launching a GUI app to mount a file feels like leaving the house to check if the back door is locked. So I wrote some aliases and functions. They’ve replaced the GUI for me entirely. Before getting into the aliases: VeraCrypt is the right tool for this specific job, but it’s worth being clear about what that job is. I’m encrypting discrete chunks of data stored as container files, not entire drives. If I wanted to encrypt a USB pen drive or an external hard disk, I’d use LUKS instead, which is better suited to full-device encryption on Linux. VeraCrypt’s strength is the container format: a single encrypted file that you can copy anywhere, sync to cloud storage, and open on almost any platform. I format my vaults as exFAT specifically for this: it works on Windows, macOS, Linux, and iOS via Disk Decipher. That cross-platform use case is what makes it worth the extra ceremony. This post covers what I ended up with and why. It’s worth saying upfront: this works for me, for my use case, right now. It doesn’t follow that it’s the right fit for anyone else. LUKS, Cryptomator, and plenty of other tools solve similar problems in different ways, and any of them might be a better fit depending on what you’re trying to do. I’m not attached to this setup permanently either. If something better comes along, or my requirements change, I’ll adapt.
The two simplest aliases are to list what’s currently mounted, and to create new vaults: is a full function because it needs to handle a few things: creating the mount directory, defaulting to the current directory if no path is specified, and (when only one vault is mounted in total) automatically -ing into it so I can get straight to work: The auto-cd only triggers when it’s the sole mounted vault. If I’ve already got other vaults open, it stays out of the way. Both sync clients are paused before mounting to prevent them trying to upload a vault that’s actively being written to — a reliable way to end up with a corrupted or conflicted file. I keep several vault files in the same directory, so was a natural next step: mount all and files in a given directory with a single shared password: The (N) glob qualifier in zsh means the glob returns nothing (rather than erroring) if no files match. Worth knowing if you’re adapting this for bash, where you’d handle the empty case differently. Dismounting is where I hit the most friction. The function handles both single-volume and all-at-once dismounting, and cleans up the mount directories afterwards: The alias just calls with no arguments: dismount everything, clean up the directories. The bit I added most recently is the before dismounting. If I’m working inside a vault and run , the dismount would fail silently because the directory was in use. The fix checks whether is under any of the mounted paths and steps out first. The trailing slash on both sides ( ) avoids the edge case where one vault path is a prefix of another. One more thing that makes this feel native rather than bolted on: tab completion for mounted volumes when running , and completion for / files when using or : One feature worth mentioning, even if I don’t use it daily: VeraCrypt supports hidden volumes. The idea is that you create a second encrypted volume inside the free space of an existing one.
The outer volume gets a decoy password and some plausible-looking files. The hidden volume gets a separate password and your actual sensitive data. When VeraCrypt mounts, it tries the password you entered against the standard volume header first, then checks whether it matches the hidden volume header. Because VeraCrypt fills all free space with random data during creation, an observer cannot tell whether a hidden volume exists at all. It’s indistinguishable from random noise. In practice: if you’re ever compelled to hand over your password, you hand over the outer volume’s password. Nothing in the file itself proves there’s anything else there. This is what “plausible deniability” means in this context. It’s not a feature most people will ever need, but it exists and it’s well-implemented. My vault files are stored in Dropbox rather than Proton Drive, which I realise sounds odd given that Proton Drive is the more privacy-focused option. The reason is practical: the Proton Drive iOS app fails to sync VeraCrypt vaults reliably. The developer of Disk Decipher (an iOS VeraCrypt client) recently dug into this and was incredibly helpful in tracking down the cause in the Proton Drive app logs. The hypothesis is that VeraCrypt creates revisions faster than Proton Drive’s file provider can handle. What makes it worse is that the problem surfaces immediately: just mounting a vault and dismounting it again is enough to trigger the error. That’s a single write operation. There’s no practical workaround on the iOS side. It’s an annoying trade-off. Dropbox has significantly more access to my files at the infrastructure level, but the vault files themselves are encrypted before they ever leave the machine, so what Dropbox sees is opaque either way. For now, it works. I’m keeping an eye on Proton Drive’s iOS progress. Google Drive is an obvious option I haven’t mentioned: that’s intentional.
I’m actively working on reducing my Google dependency, so it’s not something I’m considering here. Technically, on Linux, you could use rsync to swap Dropbox out for almost any provider. What keeps me on Dropbox for this specific use case is how it handles large files: it chunks them and syncs only the changed parts rather than re-uploading the whole thing. For vault files that can be several gigabytes, that matters. As you’ll have noticed in the code above, and both pause Dropbox and Proton Drive before mounting, and restarts them once the last vault is closed. The sync clients fail silently if they’re not running, so the same code works on machines where neither is installed. Since writing this, the picture has got worse. Mounir Idrassi, VeraCrypt’s developer, posted on Sourceforge confirming what’s actually happening: Microsoft terminated the account used to sign VeraCrypt’s Windows drivers and bootloader. No warning, no explanation, and their message explicitly states no appeal is possible. He tried every contact route and reached only chatbots. The signing certificate on existing VeraCrypt builds is from a 2011 CA that expires in June 2026. Once that expires, Windows will refuse to load the driver, and the driver is required for everything: container mounting, portable mode, full disk encryption. The bootloader situation is worse still, sitting outside the OS and requiring firmware trust. The post landed on Hacker News , where Jason Donenfeld, who maintains WireGuard, posted that the same thing has happened to him: account suspended without warning, currently in a 60-day appeals process. His point was direct: if a critical RCE in WireGuard were being actively exploited right now, he’d have no way to push an update. Microsoft would have his hands entirely tied. This isn’t a one-off. A LibreOffice developer was banned under similar circumstances last year. 
The pattern is open source security tool developers losing distribution rights, without warning, with an appeals process that appears largely decorative. Larger projects may eventually get restored through media pressure. Most won’t have that option. I’m on Linux, so none of this touches me directly. If you’re on Windows and relying on VeraCrypt, “watch it closely” has become genuinely urgent. All of these live in my dotfiles .
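The zsh functions themselves were stripped from this article view. As a minimal bash sketch of the flow the post describes (list, mount, step out of the vault, then dismount), with all names (`vlist`, `vmount`, `under_mount`, `vdismount`) and the `~/mnt` layout being my assumptions; the `--text`, `--mount`, `--dismount`, and `--list` flags are from VeraCrypt's own CLI:

```shell
# Hypothetical sketch of the mount/dismount helpers described above.
# Function names and the ~/mnt layout are my assumptions.

vlist() { veracrypt --text --list; }

vmount() {
  local vault=$1 mnt="$HOME/mnt/$(basename "$1")"
  mkdir -p "$mnt"
  veracrypt --text --mount "$vault" "$mnt"
}

# True if path $1 sits under mount root $2. The trailing slash on
# both sides avoids the prefix edge case (/mnt/a vs /mnt/ab).
under_mount() {
  case "$1/" in "$2"/*) return 0 ;; *) return 1 ;; esac
}

vdismount() {
  # step out first if the CWD is inside a mounted vault; otherwise
  # the dismount fails because the directory is in use
  under_mount "$PWD" "$HOME/mnt" && cd "$HOME"
  veracrypt --text --dismount
  rmdir "$HOME"/mnt/* 2>/dev/null
}
```

The post's real versions also pause and resume the sync clients around these calls; that part is omitted here.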

0 views
Filippo Valsorda 1 week ago

A Cryptography Engineer’s Perspective on Quantum Computing Timelines

My position on the urgency of rolling out quantum-resistant cryptography has changed compared to just a few months ago. You might have heard this privately from me in the past weeks, but it’s time to signal and justify this change of mind publicly. There had been rumors for a while of expected and unexpected progress towards cryptographically-relevant quantum computers, but over the last week we got two public instances of it. First, Google published a paper revising down dramatically the estimated number of logical qubits and gates required to break 256-bit elliptic curves like NIST P-256 and secp256k1, which makes the attack doable in minutes on fast-clock architectures like superconducting qubits. They weirdly 1 frame it around cryptocurrencies and mempools and salvaged goods or something, but the far more important implication is practical WebPKI MitM attacks. Shortly after, a different paper came out from Oratomic showing 256-bit elliptic curves can be broken with as few as 10,000 physical qubits if you have non-local connectivity, like neutral atoms seem to offer, thanks to better error correction. This attack would be slower, but even a single broken key per month can be catastrophic. They have this excellent graph on page 2 (Babbush et al. is the Google paper, which they presumably had preview access to): Overall, it looks like everything is moving: the hardware is getting better, the algorithms are getting cheaper, the requirements for error correction are getting lower. I’ll be honest, I don’t actually know what all the physics in those papers means. That’s not my job and not my expertise. My job includes risk assessment on behalf of the users that entrusted me with their safety. What I know is what at least some actual experts are telling us. Heather Adkins and Sophie Schmieg are telling us that “quantum frontiers may be closer than they appear” and that 2029 is their deadline.
That’s in 33 months, and no one had set such an aggressive timeline until this month. Scott Aaronson tells us that the “clearest warning that [he] can offer in public right now about the urgency of migrating to post-quantum cryptosystems” is a vague parallel with how nuclear fission research stopped happening in public between 1939 and 1940. The timelines presented at RWPQC 2026, just a few weeks ago, were much tighter than a couple years ago, and are already partially obsolete. The joke used to be that quantum computers have been 10 years out for 30 years now. Well, not true anymore, the timelines have started progressing. If you are thinking “well, this could be bad, or it could be nothing!” I need you to recognize how immediately dispositive that is. The bet is not “are you 100% sure a CRQC will exist in 2030?”, the bet is “are you 100% sure a CRQC will NOT exist in 2030?” I simply don’t see how a non-expert can look at what the experts are saying, and decide “I know better, there is in fact < 1% chance.” Remember that you are betting with your users’ lives. 2 Put another way, even if the most likely outcome was no CRQC in our lifetimes, that would be completely irrelevant, because our users don’t want just better-than-even odds 3 of being secure. Sure, papers about an abacus and a dog are funny and can make you look smart and contrarian on forums. But that’s not the job, and those arguments betray a lack of expertise . As Scott Aaronson said : Once you understand quantum fault-tolerance, asking “so when are you going to factor 35 with Shor’s algorithm?” becomes sort of like asking the Manhattan Project physicists in 1943, “so when are you going to produce at least a small nuclear explosion?” The job is not to be skeptical of things we’re not experts in, the job is to mitigate credible threats, and there are credible experts that are telling us about an imminent threat. 
In summary, it might be that in 10 years the predictions will turn out to be wrong, but at this point they might also be right soon, and that risk is now unacceptable. Concretely, what does this mean? It means we need to ship. Regrettably, we’ve got to roll out what we have. 4 That means large ML-DSA signatures shoved in places designed for small ECDSA signatures, like X.509, with the exception of Merkle Tree Certificates for the WebPKI, which is thankfully far enough along . This is not the article I wanted to write. I’ve had a pending draft for months now explaining we should ship PQ key exchange now, but take the time we still have to adapt protocols to larger signatures, because they were all designed with the assumption that signatures are cheap. That other article is now wrong, alas: we don’t have the time if we need to be finished by 2029 instead of 2035. For key exchange, the migration to ML-KEM is going well enough but: Any non-PQ key exchange should now be considered a potential active compromise, worthy of warning the user like OpenSSH does , because it’s very hard to make sure all secrets transmitted over the connection or encrypted in the file have a shorter shelf life than three years. We need to forget about non-interactive key exchanges (NIKEs) for a while; we only have KEMs (which are only unidirectionally authenticated without interactivity) in the PQ toolkit. It makes no more sense to deploy new schemes that are not post-quantum . I know, pairings were nice. I know, everything PQ is annoyingly large. I know, we had basically just figured out how to do ECDSA over P-256 safely. I know, there might not be practical PQ equivalents for threshold signatures or identity-based encryption. Trust me, I know it stings. But it is what it is. Hybrid classic + post-quantum authentication makes no sense to me anymore and will only slow us down; we should go straight to pure ML-DSA-44. 
6 Hybrid key exchange is reasonably easy, with ephemeral keys that don’t even need a type or wire format for the composite private key, and a couple years ago it made sense to take the hedge. Authentication is not like that, and even with draft-ietf-lamps-pq-composite-sigs-15 with its 18 composite key types nearing publication, we’d waste precious time collectively figuring out how to treat these composite keys and how to expose them to users. It’s also been two years since Kyber hybrids and we’ve gained significant confidence in the Module-Lattice schemes. Hybrid signatures cost time and complexity budget, 5 and the only benefit is protection if ML-DSA is classically broken before the CRQCs come , which looks like the wrong tradeoff at this point. In symmetric encryption , we don’t need to do anything, thankfully. There is a common misconception that protection from Grover requires 256-bit keys, but that is based on an exceedingly simplified understanding of the algorithm . A more accurate characterization is that with a circuit depth of 2⁶⁴ logical gates (the approximate number of gates that current classical computing architectures can perform serially in a decade) running Grover on a 128-bit key space would require a circuit size of 2¹⁰⁶. There’s been no progress on this that I am aware of, and indeed there are old proofs that Grover is optimal and its quantum speedup doesn’t parallelize . Unnecessary 256-bit key requirements are harmful when bundled with the actually urgent PQ requirements, because they muddle the interoperability targets and they risk slowing down the rollout of asymmetric PQ cryptography. In my corner of the world, we’ll have to start thinking about what it means for half the cryptography packages in the Go standard library to be suddenly insecure, and how to balance the risk of downgrade attacks and backwards compatibility. 
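For concreteness, the Grover arithmetic above can be written out. My assumption is that the numerator being paraphrased is NIST's gate-count estimate of roughly 2¹⁷⁰ for quantum key search on AES-128, which under a depth limit D gives the required circuit size:

```latex
\text{total gates} \approx \frac{2^{170}}{D},
\qquad D = 2^{64} \;\Rightarrow\;
\text{circuit size} \approx \frac{2^{170}}{2^{64}} = 2^{106}.
```

Since Grover's quadratic speedup does not parallelize, the depth limit cannot be bought back with more machines, which is why a 128-bit key space stays out of reach.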
It’s the first time in our careers we’ve faced anything like this: SHA-1 to SHA-256 was not nearly this disruptive, 7 and even that took forever with the occasional unexpected downgrade attack. Trusted Execution Environments (TEEs) like Intel SGX and AMD SEV-SNP and in general hardware attestation are just f***d. All their keys and roots are not PQ and I heard of no progress in rolling out PQ ones, which at hardware speeds means we are forced to accept they might not make it, and can’t be relied upon. I had to reassess a whole project because of this, and I will probably downgrade them to barely “defense in depth” in my toolkit. Ecosystems with cryptographic identities (like atproto and, yes, cryptocurrencies) need to start migrating very soon, because if the CRQCs come before they are done , they will have to make extremely hard decisions, picking between letting users be compromised and bricking them. File encryption is especially vulnerable to store-now-decrypt-later attacks, so we’ll probably have to start warning and then erroring out on non-PQ age recipient types soon. It’s unfortunately only been a few months since we even added PQ recipients, in version 1.3.0 . 8 Finally, this week I started teaching a PhD course in cryptography at the University of Bologna, and I’m going to mention RSA, ECDSA, and ECDH only as legacy algorithms, because that’s how those students will encounter them in their careers. I know, it feels weird. But it is what it is. For more willing-or-not PQ migration, follow me on Bluesky at @filippo.abyssdomain.expert or on Mastodon at @[email protected] . Traveling back from an excellent AtmosphereConf 2026 , I saw my first aurora, from the north-facing window of a Boeing 747. My work is made possible by Geomys , an organization of professional Go maintainers, which is funded by Ava Labs , Teleport , Tailscale , and Sentry . 
Through our retainer contracts they ensure the sustainability and reliability of our open source maintenance work and get a direct line to my expertise and that of the other Geomys maintainers. (Learn more in the Geomys announcement.) Here are a few words from some of them! Teleport — For the past five years, attacks and compromises have been shifting from traditional malware and security breaches to identifying and compromising valid user accounts and credentials with social engineering, credential theft, or phishing. Teleport Identity is designed to eliminate weak access patterns through access monitoring, minimize attack surface with access requests, and purge unused permissions via mandatory access reviews. Ava Labs — We at Ava Labs, maintainer of AvalancheGo (the most widely used client for interacting with the Avalanche Network), believe the sustainable maintenance and development of open source cryptographic protocols is critical to the broad adoption of blockchain technology. We are proud to support this necessary and impactful work through our ongoing sponsorship of Filippo and his team. The whole paper is a bit goofy: it has a zero-knowledge proof for a quantum circuit that will certainly be rederived and improved upon before the actual hardware to run it on will exist. They seem to believe this is about responsible disclosure, so I assume this is just physicists not being experts in our field in the same way we are not experts in theirs.  ↩ “You” is doing a lot of work in this sentence, but the audience for this post is a bit unusual for me: I’m addressing my colleagues and the decision-makers that gate action on deployment of post-quantum cryptography.  ↩ I had a reviewer object to an attacker probability of success of 1/536,870,912 (0.0000002%, 2⁻²⁹) after 2⁶⁴ work, correctly so, because in cryptography we usually target 2⁻³².  ↩ Why trust the new stuff, though? There are two parts to it: the math and the implementation.
The math is also not my job, so I again defer to experts like Sophie Schmieg, who tells us that she is very confident in lattices, and the NSA, who approved ML-KEM and ML-DSA at the Top Secret level for all national security purposes. It is also older than elliptic curve cryptography was when it first got deployed. (“Doesn’t the NSA lie to break our encryption?” No, the NSA has never intentionally jeopardized US national security with a non-NOBUS backdoor, and there is no way for ML-KEM and ML-DSA to hide a NOBUS backdoor.) On the implementation side, I am actually very qualified to have an opinion, having made cryptography implementation and testing my niche. ML-KEM and ML-DSA are a lot easier to implement securely than their classical alternatives, and with the better testing infrastructure we have now I expect to see exceedingly few bugs in their implementations.  ↩ One small exception is that if you already have the ability to convey multiple signatures from multiple public keys in your protocol, it can make sense to do “poor man’s hybrid signatures” by just requiring 2-of-2 signatures from one classical public key and one pure PQ key. Some of the tlog ecosystem might pick this route, but that’s only because the cost is significantly lowered by the existing support for nested n-of-m signing groups.  ↩ Why ML-DSA-44 when we usually use ML-KEM-768 instead of ML-KEM-512? Because ML-KEM-512 is Level 1, while ML-DSA-44 is Level 2, so it already has a bit of margin against minor cryptanalytic improvements.  ↩ Because SHA-256 is a better plug-in replacement for SHA-1, because SHA-1 was a much smaller surface than all of RSA and ECC, and because SHA-1 was not that broken: it still retained preimage resistance and could still be used in HMAC and HKDF.
↩ The delay was in large part due to my unfortunate decision of blocking on the availability of HPKE hybrid recipients, which blocked on the CFRG, which took almost two years to select a stable label string for X-Wing (January 2024) with ML-KEM (August 2024), despite making precisely no changes to the designs. The IETF should have an internal post-mortem on this, but I doubt we’ll see one.  ↩ Any non-PQ key exchange should now be considered a potential active compromise, worthy of warning the user like OpenSSH does, because it’s very hard to make sure all secrets transmitted over the connection or encrypted in the file have a shorter shelf life than three years. We need to forget about non-interactive key exchanges (NIKEs) for a while; we only have KEMs (which are only unidirectionally authenticated without interactivity) in the PQ toolkit.  ↩

devansh 1 week ago

On LLMs and Vulnerability Research

I have been meaning to write this for six months. The landscape kept shifting. It has now shifted enough to say something definitive. I work at the intersection of LLMs and vulnerability triage. I see, every day, how this landscape is changing. These views are personal and do not represent my employer. Take them with appropriate salt. Two things happened in quick succession. Frontier models got dramatically better (Opus 4.6, GPT 5.4). Agentic toolkits (Claude Code, Codex, OpenCode) gave those models hands. The combination produces solid vulnerability research. "LLMs are next-token predictors." This framing was always reductive. It is now actively misleading. The gap between what these models theoretically do (predict the next word) and what they actually do (reason about concurrent thread execution in kernel code to identify use-after-free conditions) has grown too wide for the old frame to hold. Three mechanisms explain why. Implicit structural understanding. Tokenizers know nothing about code. Byte Pair Encoding treats code tokens as frequent byte sequences, not syntactic constructs. But the transformer layers above tell a different story. Through training on massive code corpora, attention heads specialise: some track variable identity and provenance, others develop bias toward control flow tokens. The model converges on internal representations that capture semantic properties of code, something functionally equivalent to an abstract syntax tree, built implicitly, never formally. Neural taint analysis. The most security-relevant emergent capability. The model learns associations between sources of untrusted input (user-controlled data, network input, file reads) and dangerous sinks (system calls, SQL queries, memory operations). When it identifies a path from source to sink without adequate sanitisation, it flags a vulnerability. This is not formal taint analysis. There is no dataflow graph. It is a statistical approximation.
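As a minimal sketch of the source-to-sink pattern just described (all names are hypothetical, not from the post):

```python
import sqlite3

def build_query(user_input: str) -> str:
    # SOURCE: user_input is attacker-controlled (an HTTP parameter, say)
    # SINK: the tainted value is interpolated straight into SQL text,
    # the short intra-procedural path a model reliably flags
    return f"SELECT name FROM users WHERE id = '{user_input}'"

def run_safe(user_input: str):
    # Parameterisation breaks the source-to-sink association:
    # the untrusted value never becomes part of the SQL syntax
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id TEXT, name TEXT)")
    conn.execute("INSERT INTO users VALUES ('1', 'alice')")
    return conn.execute(
        "SELECT name FROM users WHERE id = ?", (user_input,)
    ).fetchall()
```

A statistical learner flags the f-string path because tainted input reaches a query sink unsanitised; the parameterised variant is exactly the kind of mitigation that removes the learned source-to-sink signal.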
But it works well for intra-procedural bugs where the source-to-sink path is short, and degrades as distance increases across functions, files, and abstraction layers. Test-time reasoning. The most consequential advance. Standard inference is a single forward pass: reactive, fast, fundamentally limited. Reasoning models (o-series, extended thinking, DeepSeek R1) break this constraint by generating internal reasoning tokens, a scratchpad where the model works through a problem step by step before answering. The model traces execution paths, tracks variable values, evaluates branch conditions. Symbolic execution in natural language. Less precise than formal tools but capable of handling what they choke on: complex pointer arithmetic, dynamic dispatch, deeply nested callbacks. It self-verifies, generating a hypothesis ("the lock isn't held across this path"), then testing it ("wait, is there a lock acquisition I missed?"). It backtracks when reasoning hits dead ends. DeepSeek R1 showed these behaviours emerge from pure reinforcement learning with correctness-based rewards. Nobody taught the model to check its own work. It discovered that verification produces better answers. The model is not generating the most probable next token. It is spending variable compute to solve a specific problem. Three advances compound on each other. Mixture of Experts. Every frontier model now uses MoE. A model might contain 400 billion parameters but activate only 17 billion per token. Vastly more encoded knowledge about code patterns, API behaviours, and vulnerability classes without proportional inference cost. Million-token context. In 2023, analysing a codebase required chunking code into a vector database, retrieving fragments via similarity search, and feeding them to the model. RAG is inherently lossy: code split at arbitrary boundaries, cross-file relationships destroyed, critical context discarded. 
For vulnerability analysis, where understanding cross-module data flow is the entire point, this information loss is devastating. At one million tokens, you fit an entire mid-size codebase in a single prompt. The model traces user input from an HTTP handler through three middleware layers into a database query builder and spots a sanitisation gap on line 4,200 exploitable via the endpoint on line 890. No chunking. No retrieval. No information loss. Reinforcement-learned reasoning. Earlier models trained purely on next-token prediction. Modern frontier models add an RL phase: generate reasoning chains, reward correctness of the final answer rather than plausibility of text. Over millions of iterations, this shapes reasoning to produce correct analyses rather than plausible-sounding ones. The strategies transfer across domains. A model that learned to verify mathematical reasoning applies the same verification to code. A persistent belief: truly "novel" vulnerability classes exist, bugs so unprecedented that only human genius could discover them. Comforting. Also wrong. Decompose the bugs held up as examples. HTTP request smuggling: the insight that a proxy and backend might disagree about where one request ends and another begins feels like a creative leap. But the actual bug is the intersection of known primitives: ambiguous protocol specification, inconsistent parsing between components, a security-critical assumption about message boundaries. None novel individually. The "novelty" was in combining them. Prototype pollution RCEs in JavaScript frameworks. Exotic until you realise it is dynamic property assignment in a prototype-based language, unsanitised input reaching object modification, and a rendering pipeline evaluating modified objects in a privileged context. Injection, type confusion, privilege boundary crossing. Taxonomy staples for decades. The pattern holds universally. 
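The request-smuggling primitive discussed above can be made concrete with a toy (deliberately simplified, not a real proxy): two parsers apply different rules to the same bytes and disagree about where the message ends.

```python
# One ambiguous HTTP message carrying both framing headers.
RAW = (b"POST / HTTP/1.1\r\n"
       b"Content-Length: 4\r\n"
       b"Transfer-Encoding: chunked\r\n"
       b"\r\n"
       b"0\r\n\r\n"
       b"GET /admin HTTP/1.1\r\n\r\n")

def split_by_content_length(msg: bytes):
    # A front-end that honours Content-Length
    headers, _, rest = msg.partition(b"\r\n\r\n")
    n = next(int(line.split(b":")[1].decode())
             for line in headers.split(b"\r\n")
             if line.lower().startswith(b"content-length"))
    return rest[:n], rest[n:]            # (body, leftover bytes)

def split_by_chunked(msg: bytes):
    # A back-end that honours Transfer-Encoding: chunked
    _, _, rest = msg.partition(b"\r\n\r\n")
    body = b""
    while True:
        size_line, _, rest = rest.partition(b"\r\n")
        size = int(size_line, 16)
        if size == 0:                    # terminal chunk
            return body, rest.partition(b"\r\n")[2]
        body += rest[:size]
        rest = rest[size + 2:]           # skip chunk data + CRLF

front = split_by_content_length(RAW)
back = split_by_chunked(RAW)
```

The two components disagree: the chunked parser sees an empty body and treats the trailing `GET /admin` bytes as the start of a new request. None of the ingredients is novel; the bug lives in the composition.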
"Novel" vulnerabilities decompose into compositions of known primitives: spec ambiguities, type confusions, missing boundary checks, TOCTOU gaps, trust boundary violations. The novelty is in the composition, not the components. This is precisely what frontier LLMs are increasingly good at. A model that understands protocol ambiguity, inconsistent component behaviour, and security boundary assumptions has all the ingredients to hypothesise a request-smuggling-class vulnerability when pointed at a reverse proxy codebase. It does not need to have seen that exact bug class. It needs to recognise that the conditions for parser disagreement exist and that parser disagreement at a trust boundary has security implications. Compositional reasoning over known primitives. Exactly what test-time reasoning enables. LLMs will not discover the next Spectre tomorrow. Microarchitectural side channels in CPU pipelines are largely absent from code-level training data. But the space of "LLM-inaccessible" vulnerabilities is smaller than the security community assumes, and it shrinks with every model generation. Most of what we call novel vulnerability research is creative recombination within a known search space. That is what these models do best. Effective AI vulnerability research = good scaffolding + adequate tokens. Scaffolding (harness design, prompt engineering, problem framing) is wildly underestimated. Claude Code and Codex are general-purpose coding environments, not optimised for vulnerability research. A purpose-built harness provides threat models, defines trust boundaries, highlights historical vulnerability patterns in the specific technology stack, and constrains search to security-relevant code paths. The operator designing that context determines whether the model spends its reasoning budget wisely or wastes it on dead ends. Two researchers, same model, same codebase, dramatically different results. Token quality beats token quantity. 
A thousand reasoning tokens on the right code path with the right threat model outperform a million tokens sprayed across a repo with "find vulnerabilities." The search space is effectively infinite. You cannot brute-force it. You narrow it with human intelligence encoded as context, directing machine intelligence toward where bugs actually live. "LLMs are non-deterministic, so you can't trust their findings." Sounds devastating. Almost entirely irrelevant. It confuses the properties of the tool with the properties of the target. The bugs are deterministic. They are in the code. A buffer overflow on line 847 is still there whether the model notices it on attempt one or attempt five. Non-determinism in the search process does not make the search less valid. It makes it more thorough under repetition. Each run samples a different trajectory through the hypothesis space. The union of multiple runs covers more search space than any single run. Conceptually identical to fuzzing. Nobody says "fuzzers are non-deterministic so we can't trust them." You run the fuzzer longer, cover more input space, find more bugs. Same principle. Non-determinism under repetition becomes coverage. In 2023 and 2024, the state of the art was architecture. Multi-agent systems, RAG pipelines, tool integration with SMT solvers and fuzzers and static analysis engines. The best orchestration won. That era is ending. A frontier model ingests a million tokens of code in a single prompt. Your RAG pipeline is not an advantage when the model without RAG sees the whole codebase while your pipeline shows fragments selected by retrieval that does not know what is security-relevant. A reasoning model spends thousands of tokens tracing execution paths and verifying hypotheses. Your external solver integration is not a differentiator when the model approximates what the solver does with contextual understanding the solver lacks. Agentic toolkits handle orchestration better than your custom tooling. 
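The fuzzing analogy for non-determinism made above can be sketched as a toy simulation (purely illustrative; the numbers are made up):

```python
import random

# The bugs are deterministic: a fixed set hidden in a large hypothesis space.
random.seed(0)
SPACE = range(10_000)
BUGS = frozenset(random.sample(SPACE, 50))   # bugs are "in the code"

def one_run(budget: int = 2_000) -> set:
    # One non-deterministic search: a random trajectory through the space,
    # like a single fuzzing campaign over random inputs.
    checked = set(random.sample(SPACE, budget))
    return BUGS & checked                    # findings of this run

first = one_run()
union = set(first)
for _ in range(4):
    union |= one_run()                       # union of runs covers more space
```

Each run samples a different trajectory, and the union of findings only grows with repetition, which is why non-determinism under repetition behaves like coverage rather than unreliability.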
The implication the security industry has not fully processed: vulnerability research is being democratised. When finding a memory safety bug in a C library required a Project Zero-calibre researcher with years of experience, the supply was measured in hundreds worldwide. When it requires a well-prompted API call, the supply is effectively unlimited. What replaces architecture as the competitive advantage? Two things. Domain expertise encoded as context. Not "find bugs in this code" but "this is a TLS implementation; here are three classes of timing side-channel that have affected similar implementations; analyse whether the constant-time guarantees hold across these specific code paths." The human provides the insight. The model does the grunt work. Access to compute. Test-time reasoning scales with inference compute. More tokens means deeper analysis, more self-verification, more backtracking. Teams that let a model spend ten minutes on a complex code path will find bugs that teams limited to five-second responses will miss. The end state: vulnerability discovery for known bug classes becomes a commodity, available to anyone with API access and a credit card. The researchers who thrive will focus where the model cannot: novel vulnerability classes, application-level logic flaws, architectural security review, adversarial creativity. This is not a prediction. It is already happening. The pace is set by model capability, which doubles on a timeline measured in months.


Germany Doxes “UNKN,” Head of RU Ransomware Gangs REvil, GandCrab

An elusive hacker who went by the handle “UNKN” and ran the early Russian ransomware groups GandCrab and REvil now has a name and a face. Authorities in Germany say 31-year-old Russian Daniil Maksimovich Shchukin headed both cybercrime gangs and helped carry out at least 130 acts of computer sabotage and extortion against victims across the country between 2019 and 2021. Shchukin was named as UNKN (a.k.a. UNKNOWN) in an advisory published by the German Federal Criminal Police (the “Bundeskriminalamt” or BKA for short). The BKA said Shchukin and another Russian — 43-year-old Anatoly Sergeevitsch Kravchuk — extorted nearly 2 million euros across two dozen cyberattacks that caused more than 35 million euros in total economic damage. Daniil Maksimovich SHCHUKIN, a.k.a. UNKN, and Anatoly Sergeevitsch Kravchuk, alleged leaders of the GandCrab and REvil ransomware groups. Germany’s BKA said Shchukin acted as the head of one of the largest worldwide operating ransomware groups, GandCrab and REvil, which pioneered the practice of double extortion — charging victims once for a key needed to unlock hacked systems, and a separate payment in exchange for a promise not to publish stolen data. Shchukin’s name appeared in a Feb. 2023 filing (PDF) from the U.S. Justice Department seeking the seizure of various cryptocurrency accounts associated with proceeds from the REvil ransomware gang’s activities. The government said the digital wallet tied to Shchukin contained more than $317,000 in ill-gotten cryptocurrency. The GandCrab ransomware affiliate program first surfaced in January 2018, and paid enterprising hackers huge shares of the profits just for hacking into user accounts at major corporations. The GandCrab team would then try to expand that access, often siphoning vast amounts of sensitive and internal documents in the process.
The malware’s curators shipped five major revisions to the GandCrab code, each corresponding with sneaky new features and bug fixes aimed at thwarting the efforts of computer security firms to stymie the spread of the malware. On May 31, 2019, the GandCrab team announced the group was shutting down after extorting more than $2 billion from victims. “We are a living proof that you can do evil and get off scot-free,” GandCrab’s farewell address famously quipped. “We have proved that one can make a lifetime of money in one year. We have proved that you can become number one by general admission, not in your own conceit.” The REvil ransomware affiliate program materialized around the same time as GandCrab’s demise, fronted by a user named UNKNOWN who announced on a Russian cybercrime forum that he’d deposited $1 million in the forum’s escrow to show he meant business. By this time, many cybersecurity experts had concluded REvil was little more than a reorganization of GandCrab. UNKNOWN also gave an interview to Dmitry Smilyanets, a former malicious hacker hired by Recorded Future, wherein UNKNOWN described a rags-to-riches tale unencumbered by ethics and morals. “As a child, I scrounged through the trash heaps and smoked cigarette butts,” UNKNOWN told Recorded Future. “I walked 10 km one way to the school. I wore the same clothes for six months. In my youth, in a communal apartment, I didn’t eat for two or even three days. Now I am a millionaire.” As described in The Ransomware Hunting Team by Renee Dudley and Daniel Golden, UNKNOWN and REvil reinvested significant earnings into improving their success and mirroring practices of legitimate businesses. The authors wrote: “Just as a real-world manufacturer might hire other companies to handle logistics or web design, ransomware developers increasingly outsourced tasks beyond their purview, focusing instead on improving the quality of their ransomware.
The higher quality ransomware—which, in many cases, the Hunting Team could not break—resulted in more and higher pay-outs from victims. The monumental payments enabled gangs to reinvest in their enterprises. They hired more specialists, and their success accelerated.” “Criminals raced to join the booming ransomware economy. Underworld ancillary service providers sprouted or pivoted from other criminal work to meet developers’ demand for customized support. Partnering with gangs like GandCrab, ‘cryptor’ providers ensured ransomware could not be detected by standard anti-malware scanners. ‘Initial access brokerages’ specialized in stealing credentials and finding vulnerabilities in target networks, selling that access to ransomware operators and affiliates. Bitcoin “tumblers” offered discounts to gangs that used them as a preferred vendor for laundering ransom payments. Some contractors were open to working with any gang, while others entered exclusive partnerships.” REvil would evolve into a feared “big-game-hunting” machine capable of extracting hefty extortion payments from victims, largely going after organizations with more than $100 million in annual revenues and fat new cyber insurance policies that were known to pay out. Over the July 4, 2021 weekend in the United States, REvil hacked into and extorted Kaseya, a company that handled IT operations for more than 1,500 businesses, nonprofits and government agencies. The FBI would later announce they’d infiltrated the ransomware group’s servers prior to the Kaseya hack but couldn’t tip their hand at the time. REvil never recovered from that core compromise, or from the FBI’s release of a free decryption key for REvil victims who couldn’t or didn’t pay. Shchukin is from Krasnodar, Russia and is thought to reside there, the BKA said. “Based on the investigations so far, it is assumed that the wanted person is abroad, presumably in Russia,” the BKA advised.
“Travel behaviour cannot be ruled out.” There is little that connects Shchukin to UNKNOWN’s various accounts on the Russian crime forums. But a review of the Russian crime forums indexed by the cyber intelligence firm Intel 471 shows there is plenty connecting Shchukin to a hacker identity called “Ger0in” who operated large botnets and sold “installs” — allowing other cybercriminals to rapidly deploy malware of their choice to thousands of PCs in one go. However, Ger0in was only active between 2010 and 2011, well before UNKNOWN’s appearance as the REvil front man. A review of the mugshots released by the BKA at the image comparison site Pimeyes found a match on this birthday celebration from 2023, which features a young man named Daniel wearing the same fancy watch as in the BKA photos. Images from Daniil Shchukin’s birthday party celebration in Krasnodar in 2023. Update, April 6, 12:06 p.m. ET: A reader forwarded this English-dubbed audio recording from a ccc.de (37C3) conference talk in Germany from 2023 that previously outed Shchukin as the REvil leader (Shchukin is mentioned at around 24:25).

Kev Quirk 1 week ago

Update on the eBay Scam

Last week I wrote about how I thought I was being scammed by someone on eBay. In the post I said the following: I've asked eBay to step in and help resolve the situation, so we will see what happens. But there's a lot of buyer protection on eBay (and rightly so) but there's very little in the way of seller protection, even though I'm not a business. So I have a feeling they will find in favour of the buyer and I'll be out a few quid. Well, a few days after publishing that post, I received an automated email from eBay, saying: I then logged into eBay to check the conversation I'd had with this user via the eBay messenger. At the bottom of the message thread, there was a notice that said: So it seems that eBay, for whatever reason, deemed the user's account to be problematic enough to warrant a suspension/termination. Honestly, I don't know. I haven't had the payment for the watch taken from my account, and eBay haven't requested that I refund the payment. So I assume that I get to keep the Ollee watch, and the money the potential scammer originally paid. We messaged back and forth on WhatsApp, and they haven't messaged me there since - if I were in their position and a legit buyer, I'd be seething and would have definitely messaged on WhatsApp. So something tells me this isn't their first rodeo, and the potential loss is just collateral damage. Does this mean that for once the scammers have lost? We'll see. At this point I think the issue is closed from an eBay perspective, so I'm planning to re-list the Ollee Watch for a much discounted price in the next few days. If eBay subsequently request the money be returned to the scammy user, I'll just have to take the hit on that. If you're based in the UK and interested in this watch, please get in touch using the reply button below. Albeit now worth way less since it doesn't have the original Casio module, or any of the Ollee packaging.  ↩ Thanks for reading this post via RSS. RSS is ace, and so are you.
❤️ You can reply to this post by email, or leave a comment.

Stratechery 1 week ago

2026.14: Apple, Acceleration, and AI

Welcome back to This Week in Stratechery! As a reminder, each week, every Friday, we’re sending out this overview of content in the Stratechery bundle; highlighted links are free for everyone. Additionally, you have complete control over what we send to you. If you don’t want to receive This Week in Stratechery emails (there is no podcast), please uncheck the box in your delivery settings. On that note, here were a few of our favorites this week. This week’s Stratechery video is on Agents Over Bubbles. Formula 1 Spins Off Track.  Last fall Eddy Cue and Tim Cook agreed to pay a reported $750 million over the next five years to become the U.S. broadcaster of Formula 1. That deal kicked in this year, and for this week’s Sharp Text I wrote about the disastrous first month of the new era. In short: through no fault of Apple’s, Formula 1 is suddenly an acrimonious mess. Fans are mocking every race, the newly redesigned engines are a problem, and the greatest driver in the sport is threatening to retire at 28 years old. Where will it all go? I have no idea, but the Miami Grand Prix is one month away, and it’s now time for everyone to search for solutions.  — Andrew Sharp Apple’s First — and Next — 50 Years. When it comes to discussing Apple’s last 50 years, and their prospects for the next fifty years, there are two obvious choices: John Gruber and Horace Dediu. I have the pleasure of talking to John twice a week on Dithering — we discussed Apple’s anniversary on both Tuesday and Friday — but I was particularly excited to spend 90 minutes with Dediu on this week’s Stratechery Interview. This wasn’t just a podcast about Apple, but about how tech has changed over the last fifty years, and why AI makes even the most reliable narrators of history increasingly uncertain about the future.  — Ben Thompson Security and AI. Glancing at headlines in the aftermath of the Axios hack, I was briefly under the impression that a buzzy D.C.
news organization had just suffered a breach of its email list. Unfortunately the real story is a bit more ominous. So if you, like me, had never heard of Axios or a “supply chain attack” before Monday, start with Ben’s Daily Update on Wednesday, which made the mechanics of the hack more legible. We went deeper on Sharp Tech to explain why the Axios hack matters, and what sort of tension this week’s news portends — including why AI will make security issues worse in the short-term, but may be the solution in the long run.  — A S Apple’s 50 Years of Integration — Apple has survived 50 years by being the only company integrating hardware and software; if the company loses because of AI it will be because the point of integration changes. Axios Supply Chain Attack, Claude Code Code Leaked, AI and Security — AI is going to be bad for security in the short-term, but much better than humans in the long-term. An Interview with Asymco’s Horace Dediu About Apple at 50 — An interview with Asymco’s Horace Dediu about his career in tech, Apple’s first 50 years, and the prospects for the next 50, particularly in the face of AI. A Snap of Oversteer — Formula 1 began a brand new era with a very bad month. Can the sport get back on track? Apple at 50 Will AI Disrupt Apple? The Supercritical CO2 Turbine: Waterless Wonder The U.S., China and Iran; A PRC-Pakistan Peace Plan; KMT Chair Set to Visit China; Huawei, Manus and ZXMOTO Don’t Blame Cayden Boozer, Stretch Run Notes on the Sixers, Celtics, and SGA, Geriatric Millennials Approach Extinction Five Questions on Apple at 50 Years Old, The Axios Hack and AI Security, Q&A on Starlink, AI IPOs, AirPods

Simon Willison 1 week ago

The Axios supply chain attack used individually targeted social engineering

The Axios team have published a full postmortem on the supply chain attack which resulted in a malware dependency going out in a release the other day, and it involved a sophisticated social engineering campaign targeting one of their maintainers directly. Here's Jason Saayman's description of how that worked:

so the attack vector mimics what google has documented here: https://cloud.google.com/blog/topics/threat-intelligence/unc1069-targets-cryptocurrency-ai-social-engineering they tailored this process specifically to me by doing the following: they reached out masquerading as the founder of a company they had cloned the companys founders likeness as well as the company itself. they then invited me to a real slack workspace. this workspace was branded to the companies ci and named in a plausible manner. the slack was thought out very well, they had channels where they were sharing linked-in posts, the linked in posts i presume just went to the real companys account but it was super convincing etc. they even had what i presume were fake profiles of the team of the company but also number of other oss maintainers. they scheduled a meeting with me to connect. the meeting was on ms teams. the meeting had what seemed to be a group of people that were involved. the meeting said something on my system was out of date. i installed the missing item as i presumed it was something to do with teams, and this was the RAT. everything was extremely well co-ordinated looked legit and was done in a professional manner.

A RAT is a Remote Access Trojan - this was the software which stole the developer's credentials, which could then be used to publish the malicious package. That's a very effective scam. I join a lot of meetings where I find myself needing to install Webex or Microsoft Teams or similar at the last moment, and the time constraint means I always click "yes" to things as quickly as possible to make sure I don't join late. Every maintainer of open source software used by enough people to be worth targeting in this way needs to be familiar with this attack strategy.

iDiallo 1 week ago

Zipbombs are not as effective as they used to be

Last year, I wrote about my server setup and how I use zipbombs to mitigate attacks from rogue bots. It was an effective method that helped my blog survive for 10 years. I usually hesitate to write these types of articles, especially since it means revealing the inner workings of my own servers. This blog runs on a basic DigitalOcean droplet, a modest setup that can handle the usual traffic spikes without breaking a sweat. But lately, things have started to change. My zipbomb strategy doesn't seem to be as effective as it used to be.

TLDR; What I learned... and won't tell you

Here is the code I shared last year: I deliberately didn't reveal what a function like that does in the background. But that wasn't really the secret sauce bots needed to know to avoid my trap. In fact, I mentioned it casually:

One more thing, a zip bomb is not foolproof. It can be easily detected and circumvented. You could partially read the content after all. But for unsophisticated bots that are blindly crawling the web disrupting servers, this is a good enough tool for protecting your server.

One way to test whether my zipbomb was working was to place an abusive IP address in my blacklist and serve it a bomb. Those bots would typically access hundreds of URLs per second. But the moment they hit my trap, all requests from that IP would cease immediately. They don't wave a white flag or signal that they'll stop the abuse. They simply disappear on my end, and I imagine they crash on theirs. For a lean server like mine, serving 10 MB per request at a rate of a couple per second is manageable. But serving 10 MB per request at a rate of hundreds per second takes a serious toll. Serving large static files had already been a pain through Apache2, which is why I moved static files to a separate nginx server to reduce the load. Now, bots that ingest my bombs, detect them, and continue requesting without ever crashing, have turned my defense into a double-edged sword.
Whenever there's an attack, my server becomes unresponsive, requests are dropped, and my monthly bandwidth gets eaten up. Worst of all, I'm left with a database full of spam: thousands of fake emails in my newsletter and an overwhelmed comment section. After combing through the logs, I found a pattern and fixed the issue. AI-driven bots, or simply bots that do more than scrape or spam, are far more sophisticated than their dumber counterparts. When a request fails, they keep trying. And in doing so, I serve multiple zipbombs and end up effectively DDoS-ing my own server.

Looking at my web server settings: I run 2 instances of Apache, each with a minimum of 25 workers and a maximum of 75. Each worker consumes around 2 MB for a regular request, so I can technically handle 150 concurrent requests before the next one is queued. That's 300 MB of memory on my 1 GB RAM server, which should be plenty. The problem is that Apache is not efficient at serving large files, especially when they pass through a PHP instance. Instead of consuming just 2 MB per worker, serving a 10 MB zipbomb pushes usage to around 1.5 GB of RAM to handle those requests. In the worst case, this sends the server into a panic and triggers an automatic restart, meaning that during a bot swarm, my server becomes completely unresponsive.

And yet, here I am complaining, while you're reading this without experiencing any hiccups. So what did I do? For one, I turned off the zipbomb defense entirely. As for spam, I've found another way to deal with it. I still get the occasional hit when individuals try to game my system manually, but for my broader defense mechanism, I'm keeping my mouth shut. I've learned my lesson. I've spent countless evenings reading through spam and bot patterns to arrive at a solution. I wish I could share it, but I don't want to go back to the drawing board. Until the world collectively arrives at a reliable way to handle LLM-driven bots, my secret stays with me.
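The post's original snippet isn't reproduced in this copy, but the general technique it relies on, a tiny gzip-compressed response that balloons when a naive client decompresses it, can be sketched like this (a minimal illustration in Python, not the author's actual code; the function name and sizes are made up):

```python
import gzip
import io

def make_gzip_bomb(decompressed_mb: int) -> bytes:
    """Compress a long run of null bytes; gzip shrinks it roughly 1000x."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
        chunk = b"\0" * (1024 * 1024)  # 1 MB of zeros
        for _ in range(decompressed_mb):
            gz.write(chunk)
    return buf.getvalue()

# A payload of a few kilobytes on the wire that inflates to 10 MB
# inside any client that transparently decompresses the response.
bomb = make_gzip_bomb(10)
```

Served with a `Content-Encoding: gzip` header, those few kilobytes expand to the full 10 MB (or far more, with a larger multiplier) in the bot's memory, which is the crash the post describes for unsophisticated crawlers, and exactly what the smarter bots now detect and shrug off.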

The Jolly Teapot 1 week ago

Browsing the web with JavaScript turned off

Some time ago, I tried to use my web browser with JavaScript turned off by default. The experiment didn’t last long, and my attempt at a privacy-protecting, pain-free web experience failed. Too many websites rely on JavaScript, which made this type of web browsing rather uncomfortable. I’ve kept a Safari extension like StopTheScript around, on top of a content blocker like Wipr, just in case I needed to really “trim the fat” of the occasional problematic webpage. *1 Recently, I’ve given this setup a new chance to shine, and even described it in a post. The results are in: the experiment failed yet again. But I’m not done. Even if this exact setup isn’t the one I currently rely on, JavaScript-blocking is nevertheless still at the heart of my web browsing hygiene on the Mac today. For context, this need for fine-tuning comes from the fact that my dear old MacBook Air from early 2020, rocking an Intel chip, is starting to show its age. Sure, it already felt like a 10-year-old computer the moment the M1 MacBook Air was released, merely six months after I bought it, but let’s just say that a lot of webpages make this laptop choke. My goal of making this computer last one more year can only be reached if I manage not to throw the laptop through the window every time I want to open more than three tabs. On my Mac, JavaScript is now blocked by default on all pages via StopTheScript. Leaving JavaScript on, meaning giving websites a chance, sort of defeated the purpose of my setup (performance and privacy). Having JS turned off effectively blocks 99% of ads and trackers (I think, don’t quote me on that) and makes browsing the web a very enjoyable experience. The fan barely activates, and everything is as snappy and junk-free as expected. For websites that require JavaScript — meaning frequently visited sites like YouTube, or sites where I need to be logged in, like LanguageTool — I turn off StopTheScript permanently via the Websites > Extensions menu in the Safari Settings.
I try to keep this list to a bare minimum, even if this means I have to accept a few annoyances like not having access to embedded video players or comments on some websites. For instance, I visit the Guardian multiple times daily, yet I won’t add it to the exception list, even if I’m a subscriber and therefore not exposed to the numerous “please subscribe” modals. I can no longer hide some categories on the home page, nor watch embedded videos: a small price to pay for a quick and responsive experience, and a minimal list of exceptions. For the few times when I actually need to watch a video on the Guardian, comment on a blog post, or for the occasional site that needs JavaScript simply to appear on my screen (more on that later), what I do is quickly open the URL in a new private window. There, StopTheScript is disabled by default (so that JavaScript is enabled: sorry, I know this is confusing). Having to reopen a page in a different browser window is an annoying process, yes. Even after a few weeks it still feels like a chore, but it seems to be the quickest way on the Mac to get a site to work without having to mess around with permissions and exceptions, which can be even more annoying on Safari. Again, a small price to pay to make this setup work. * 2 Another perk of that private browsing method is that the ephemeral session doesn’t save cookies and the main tracking IDs disappear when I close the window. I think. The problem I had at first was that these sessions tended to display the webpages as intended by the website owners: loaded with JavaScript, ads, modals, banners, trackers, &c. Most of the time, it is a terrible mess. Really, no one should ever experience the general web without any sort of blocker. To solve this weakness of my setup, I switched from Quad9 to Mullvad DNS to block a good chunk of ads and trackers (using the “All” profile ). 
Now, the private window lets through only the functional part of the JavaScript, plus a few cookie banners and Google login prompt annoyances, but at least I am not welcomed by privacy-invading and CPU-consuming ads and trackers every time my JS-free attempt fails. I know I could use a regular content blocker instead of a DNS resolver, but keeping it active all the time when JS is turned off feels a bit redundant and too much of an extension overlap. More importantly, I don’t want to be tempted to manage yet another exception list on top of the StopTheScript one (been there, done that, didn’t work). Also, with Safari I don’t think it’s possible to activate an extension in Private Mode only. John Gruber, in a follow-up reaction to The 49MB Web Page article from Shubham Bose, which highlights the disproportionate weight of webpages relative to their content, wrote: One of the most controversial opinions I’ve long espoused, and believe today more than ever, is that it was a terrible mistake for web browsers to support JavaScript. Not that they should have picked a different language, but that they supported scripting at all. That decision turned web pages — which were originally intended as documents — into embedded computer programs. There would be no 49 MB web pages without scripting. There would be no surveillance tracking industrial complex. The text on a page is visible. The images and video embedded on a page are visible. You see them. JavaScript is invisible. That makes it seem OK to do things that are not OK at all. Amen to that. But if JavaScript is indeed mostly used for this “invisible” stuff, why are some websites built to use it for the most basic stuff? Video streaming services, online stores, social media platforms, I get it: JavaScript makes sense. But text-based sites? Blogs? Why? The other day I wanted to read this article, and only the website header showed up in my browser. Even Reader Mode didn’t make the article appear.
When I opened the link in a private window, where StopTheScript is disabled, lo and behold, the article finally appeared. For some obscure reason, on that website (and others) JavaScript is needed to load text on a freaking web page. Even if you want your website to have a special behaviour regarding loading speeds, design subtleties, or whatever you use JavaScript for, please, use a <noscript> tag, either to display the article in its most basic form, or at least to show a message saying “JavaScript needed for no apparent reason at all. Sorry.” *3

1. This is what I do on my phone, as managing Safari extensions on iOS is a painful process. Quiche Browser is a neat solution and a great way for me to have the “turn off JavaScript” menu handy, but without a way to sync bookmarks, history or open tabs with the Mac, I still prefer to stick to Safari, at least for now.
2. I still wish StopTheScript had a one-touch feature to quickly reload a page with JavaScript turned on until the next refresh or for an hour or so, but it doesn’t.
3. This is what I do for this site’s search engine, where PageFind requires JavaScript to operate. Speaking of search engines, DuckDuckGo works fine in HTML-only mode (the only main search engine to offer this, I believe).
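The fallback the post asks for is exactly what HTML’s <noscript> element provides — a minimal sketch (the markup and wording are illustrative, not taken from any particular site):

```html
<article id="post">
  <!-- On a script-dependent site, the article text is injected here
       by JavaScript, so script-blocking readers see nothing. -->
</article>
<noscript>
  <!-- Shown only when scripting is off: either server-render the
       article here, or at least admit why it is missing. -->
  <p>JavaScript needed for no apparent reason at all. Sorry.</p>
</noscript>
```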

André Arko 1 week ago

Towards an Amicable Resolution with Ruby Central

Last week, three members of Ruby Central’s board published a new statement about RubyGems and Bundler, and this week they published an incident report on the events of last year. The first statement reports that Ruby Central has now completed a third audit of RubyGems.org’s infrastructure: the first by the sole remaining RubyGems.org maintainer, the second by Cloud Security Partners, and the third by Hogan Lovells. In all three cases, Ruby Central found no evidence of compromised end user data, accounts, gems, or infrastructure availability. I hope this can conclusively put to rest the idea that I have any remaining access to the RubyGems.org production systems, or that I caused any harm to the RubyGems.org service at any time. I also appreciate that Ruby Central is taking its share of responsibility, recognizing that its lack of communication with the former maintainers (including me) created confusion and frustration that contributed, in part, to how we ended up where we are today. Ruby Central board members Freedom, Brandon, and Ran state that their intent is now to work towards an amicable resolution. I salute their new commitment, and would like to do my part to help the RubyGems community move past these unfortunate events, with a resolution that puts the dispute fully behind us and allows all of us to move forward. For my part, despite my claims against Ruby Central, and the threats they have directed against me, I am willing to completely settle all of my disputes with them, and pledge to take no legal action against Ruby Central regarding any of their actions prior to today. In exchange, I am requesting two things. First, I am asking Ruby Central to drop their legal threats, including releasing their claims against me and reimbursing my legal costs. Those costs arise from Ruby Central’s actions, including litigation threats, other escalations, and most recently contacting law enforcement.
In addition to forcing me to retain counsel, these actions caused considerable stress and disruption. I am willing to provide invoices to ensure the reimbursement precisely matches only my actual costs. Second, I am asking Ruby Central to lay our disagreement to rest with a public statement acknowledging that I did no harm to the RubyGems.org service. If Ruby Central fully drops their legal claims, and states I did not harm the RubyGems.org service, I would consider our disagreement amicably settled.
