
what i read this week - week 22 2026

I thought that after my post on summary distrust, I could share a list of what I read each week. I technically prefer to process and digest what I read into blog posts, but not everything makes it into one, and this is a way to document and keep them, and maybe give others some food for thought. This is not necessarily stuff I fully agree with; I'm just sharing what ended up in my feed reader or was linked in stuff I read, and it doesn't include all the personal blog posts I read. I had a lot to catch up on because I read a lot less the previous 2 weeks.

AI detection was built for faces - article about how badly AI detection works for war and climate propaganda videos, as the detection mechanisms often rely on biometric human features, and cannot accurately detect fake fire, smoke effects, etc.
US Law Enforcement Warns of ‘Anti-Tech Extremism’ - the US gov is aware of the sentiment around AI and is willing to target and suppress it, and they have little paid accomplice firms too who keep surveilling you on social media and in real life meetings if you organize to oppose data centers or voice criticism about them.
Iran Israel AI war propaganda - The AI propaganda we see with armed conflicts right now is a dire warning for the future of online video information. Goes more in-depth about detection methods.
Can Tracking Private Jets Predict an Imminent Apocalypse? - article about a site that assumes the rich elites will find out about an apocalypse first and try to flee, therefore serving as a warning system to the rest of us.
Why GCC Nations Must Move Beyond Content Moderation to Regulate Harm by Design - GCC means Governments in the Gulf Cooperation Council. Article is about how certain countries have already heavily regulated (and, arguably, censored in their favor) social media platform content, so now they should do the same for platform design. Eh...
Big Tech Will Not Save Us From the Climate Crisis - Big Tech is moving away from their climate targets and carbon credit bullshit because they wanna do more AI and data centers. The rest of what they are doing is unproven or not working.
Definition of Overburdened Communities in New Jersey - data centers and other similar detrimental undertakings often target overburdened communities, and this is what it means.
A Town Hall Too Late - article documenting how citizens near an almost finished data center actually get informed and treated (not well). They only received information well after the thing started to get built. It is being developed by DataOne for the Nebius Group to support AI infrastructure as part of a $17 billion deal with Microsoft.
Meta loses High Court challenge - summary of the case and possible fine.
Responsible Innovation Harms Modeling on Microsoft's Learning Platform.
EU AI Omnibus Deal Changes - more analysis on the proposed AI Act changes, nudifier ban and more, prominent actors, Merz ruining everything for us as usual, etc.
The AI Act is not ready for agents - article for a paper that's also listed below; risks of agents, and a need for more guidance from the AI Office.
AI’s real threat is worker control and surveillance - about the divide between workers who use AI and those who are managed by it. Higher paid jobs can be supplemented and accelerated by it, while less fortunate, lower-paid workers (warehouse, gig work) suffer under AI micromanagement, which causes scheduling issues, errors and more, and are more intensely surveilled than ever by AI "bossware".
Entzauberung der Digitalen Souveränität - German; deconstructing the term "digital sovereignty" and ideas around it. Mostly about this talk.
AI Forensics gegen Big Tech - German; Interview with AI Forensics founder Marc Faddoul about his work and the fear of retribution, especially the fear about getting targeted by Elon Musk.
Human Rights Due Diligence - info on what downstream HRDD is.
Microsoft took a step towards human rights - very charitable and exaggerated read of Microsoft parting ways with their Israel chief and their ties to the Israeli Ministry of Defense, plus suspending some of their services.
The World Is Already Resisting AI - Article on the AI Resist List, a collaboratively built, publicly accessible database documenting acts of resistance to the AI industry from across the world.
AI Data Centers: Big Tech's Impact on Electric Bills, Water, and More - looking at different papers and studies around the water and electricity use of big data centers, where they are located, and what local problems they are worsening. Meta’s Hyperion project in Louisiana will need three times as much electricity as the entire city of New Orleans, and is bigger than its main airport. They also gag local officials with NDAs so they can't properly inform the residents.
What you need to know about data centers - information on what Earthjustice attorneys are doing to push for stronger environmental protections targeting data centers.
The Web Is Being Made Accessible for AI, Not People - llms.txt convention, MCP etc.; companies are more ready to make their services accessible to AI agents than to disabled people. This shouldn't be seen as another curb cut phenomenon.
Bitte im Omnibus sitzen bleiben, liebe PIMS - German article about the Art. 88 reworks for Personal Information Management Systems that are supposed to enable an easier handling of cookie consent and tracking.
Social Media Verbot weder wissenschaftlich fundiert noch effektiv - German; about how there is no scientific proof that social media bans will help, and some stats about how many people support social media bans, and for what age group.
Big Tech und Staat - German article on how the state seems to increasingly serve private interests, especially Big Tech.
Bundesregierung will KI Einsatz der Polizei - German article about use of AI software for law enforcement, its risks, and what rights are threatened.
Polizeigesetznovelle Schleswig-Holstein - German article discussing Schleswig-Holstein's attempt to change its police law, including real-time facial recognition, behavioral surveillance, online face search and more, covering strangers on the street and even mere victims or witnesses of crimes.
Das Internet verrottet - German; about link rot and archiving things properly.
Why “Made in Europe” Won’t Fix AI’s Deeper Problems - fitting to my blog post.
Big Tech as Executor of the dead - was also a topic at the conference.
Praxisfolgen Russmedia Urteil - consequences for social media platforms following the Russmedia court decision C-492/23; Notice-And-Sweep.
AI Act: deal on simplification measures, ban on “nudifier” apps - concluding what deal was reached between co-legislators; names the new deadlines for AI compliance.
Ratepayer Protection Pledge by the White House - promises and propaganda
Microslop's Community-First AI Infrastructure Pledge - promises and propaganda vol. 2
Anthropic's Promises - promises and propaganda vol. 3
Offener Brief der Industrie - Open letter to German politicians by German industry criticizing parts of the digital omnibus; it was silly to read, and I think it is disrespectful to imply that technologies can be discriminated against; that's a different usage and connotation than just using it as "being discriminated from" (aka being differentiated from others). None of the arguments are convincing.
Draft guidelines for the implementation of transparency obligations for certain AI systems under Art. 50 AI Act - this is out for commenting until the 3rd of June, by the way.
Consent Fatigue entgegenwirken - German policy brief by the TUM think tank about countering consent fatigue.
Data Center Fight Guide
Einstellungen zum geplanten Einsatz von Palantir-Software II - German phone survey about Palantir use by Verian & campact from Sep 2025.
Grok Unleashed - Analyzing Grok nudify uses and extremist propaganda, by AI Forensics.
Distinguishing Authentic from AI-Generated Explosions using Spatiotemporal Dynamics - more about how to authenticate conflict-zone explosion footage. AI footage tends to produce much bigger, rounder mushroom plumes that expand quicker. Don't ask me about the math, I don't understand any of that, but I found the rest I could understand very interesting.
Embedding Human Rights in Technical Standards - About WITNESS' experience in the Coalition for Content Provenance and Authenticity (C2PA), which is in favor of open technical standards to embed verifiable provenance metadata into digital media files. Helpful explainer here.
Better Images of AI - a guide for creators and users on how to use accurate images when talking about AI and what to avoid, as it shapes the narrative. Specifically, they call to avoid the color blue, descending code, human brains, science fiction elements, white robots, anthropomorphism and references to the Creation of Adam. That is because these misrepresent capabilities, risks and fears, and who is or can work in or with AI (often, only white men are shown).
The AI Climate Hoax: Behind the Curtain of How Big Tech Greenwashes Impacts - talks about how different kinds of AI and its uses as well as carbon credits and overstating the climate benefits of AI can be used to hide the environmental impact of the big, hyped up GenAI.
Big Tech’s ‘False Solutions’ to the Climate Crisis - similar thing here. Debunking nuclear power, carbon capture, and artificial intelligence as helping climate change. There are endnotes at each chapter, so don't miss what's after.
Tackling Arbitrary Digital Surveillance in the Americas - uses Cajar vs.
Colombia for some examples to showcase what needs to change, and the importance of the three-step-analysis. Basically all of this is standard here in the EU, but still needs to be implemented there.
TRIED AI Detection Benchmark - paper from WITNESS about their framework that evaluates AI detection tools through a sociotechnical lens (with a focus on adaptability, transparency, accessibility, contextual relevance, and fairness). Wasn't a complete fan, because a chunk of it (for example about resource investments) is rather vague, theoretical and hardly connected with a direct or objective way to measure in practice. The rest is mostly fair, but also rather obvious, and some of it is basically impossible to combine in practice - like only using datasets that comply with data protection and intellectual property laws and are "ethical" with no sensitive data, while the models are supposed to reliably detect an AI generated video of a minority language or niche culture, or have enough datasets (= lots) to accurately detect cultural and local contexts. I can't quite pinpoint what exactly bothers me about it otherwise. I did like the examples of real use cases where things failed.
Zugänglichkeit von De-Personalisierungsoptionen und Meldeverfahren auf sehr großen Online-Plattformen
Decisions I had to read to translate for noyb: 2025-0.875.804 and W171 2305420-1

In total, that is roughly 340 pages, if we count an article as two pages on average. Most of it was read on Sunday and Monday (holiday), as I had a lot of free time then.

Published 30 May, 2026


Let’s talk about encrypted reasoning

This is a quick post I wanted to write about a hobby project I spent a weekend on. It has little to do with real cryptography, and mostly doesn’t expose a particularly exciting vulnerability. But it did teach me a lot about frontier LLM APIs and coding agents. It also got me certified as an OpenAI “cyber researcher” which is something that doesn’t happen every day. In any case, please keep your expectations low. Who knows, perhaps someone else will find something exciting to do with this.

Last week I decided it’d be fun to set up an OpenClaw agent. I still don’t know why I did this. I have no use for another AI in my life, and I realized this fact almost immediately after I got through the (surprisingly difficult!) configuration process. But configuring the agent to talk to Claude exposed me to something way more interesting: I got a cool error. The kind of error that cryptographers can’t resist:

This intrigued me. What in the world was a signature doing in an LLM’s “thinking” block? Why would thinking blocks be signed in the first place? And if the thinking blocks are signed, then that means tampering with thinking blocks must have security implications.

And there went my weekend. After twenty hours and about 5 million Codex tokens, I wasn’t much smarter. But I had learned a few things.

First, the basics. You probably know that most LLM providers expose an API so you can write apps that talk to the model. For Claude, this is called the Messages API, while OpenAI calls it Responses. These APIs handle the ordinary tasks you’d expect an application to need from an LLM. They (1) allow you to set an application-level “instructions” (or ‘developer’) prompt for your application. They let you (2) provide ordinary textual prompts, and get back responses from the LLM. They also (3) provide bookkeeping, for example, listing the number of tokens you’ve used.
For reasoning LLMs, they also do something I did not previously know about, and this is central to the error message above. They also send you the contents of the model’s hidden “reasoning” or “thinking” fields. Note that this data is not the stuff you see on ChatGPT when you ask it a question: those strings are merely summaries. The model’s actual reasoning (called “chain-of-thought”, CoT) is normally kept private and held back by the server. However, the APIs work differently: for various reasons (which we’ll get into below), an encrypted copy of the raw CoT reasoning data is actually sent down to the application.

If you’re like me, you should now have three questions: how, why, and so what?

The how is the easiest to answer: for both providers, “thinking”/“reasoning” blocks are sent down to the client as JSON. Each contains a blob of Base64-encoded stuff. The API documentation informs us that this data contains opaque reasoning, and that you’re not meant to look at it; you’re just supposed to ship it back to the server on the next turn.

Let’s break that rule. The content of the blocks varies slightly between providers, but the core of each is a random-looking string that appears to be an authenticated ciphertext. You don’t need to be Sherlock Holmes to deduce this. First, it grows and shrinks depending on how hard the model thinks. And second, tampering with any of the ciphertext-looking data produces a recognizable API error when you send it back in.

Thanks to AI, I can make nice diagrams. Here’s what OpenAI’s reasoning blocks look like:

And here’s Anthropic’s wildly overcomplicated equivalent:

The why part of this is more involved. Why ship this data to the client? Doesn’t the provider already have your reasoning data? The answer is sort of. Although the server has access to reasoning state while producing a response, API conversations are not always implemented as persistent sessions.
In stateless, zero-retention, tool-loop, or client-managed conversation modes, the client application is expected to carry the transcript forward. Encrypted reasoning lets the provider return hidden model state to the client in a form the client can’t read or modify, but can later replay so the provider can verify/decrypt it and continue a reasoning process.

This brings us to the $10 question. We have opaque, encrypted blobs. Should we care about them?

Initially the answer seems to be no: this data is unreadable, and tampering with any bit of it produces an angry rejection message from the server. So on the one hand, it seems like this data is really unavailable to us. On the other hand: model reasoning is a big deal! These strings are the literal internal monologue of the model. They might influence the way the model processes later data we send it. More practically: when someone goes to this much trouble to cryptographically protect something, my experience is that they usually have a good reason.

And I think the providers do have a good reason. A hint comes from this OpenAI post from 2024, which introduced the first “o1” reasoning model:

In other words: it’s possible that these blobs contain sensitive information that the model otherwise wouldn’t share with us. That makes them really tempting to mess with.

Unfortunately, the cryptography mostly seems to protect them. Although we can look at the blocks, none of the fields they contain seem readable or malleable. Believe me, I tried. But that doesn’t mean we should quit, it just means we need to try other things. There are still two directions worth checking:

Thanks to the magic of coding agents, I was able to test every permutation of these concerns. I won’t claim to you the results are dramatic; nobody is going to win huge bug bounties on them (I tried). But the general answer for both cases seems to be: yes, these possibilities are both real.
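To make the "opaque blob" observations above a little more concrete, here is a minimal sketch of the only two things a client can really do with such a blob: measure it, and flip a bit in it. The blob below is just random bytes standing in for provider output; `flip_one_bit` and `describe` are hypothetical helper names, and nothing here talks to a real API or reflects either provider's actual format.

```python
import base64
import os


def describe(blob_b64: str) -> dict:
    """Report what a client can observe about an opaque blob: its size."""
    raw = base64.b64decode(blob_b64)
    return {"base64_chars": len(blob_b64), "raw_bytes": len(raw)}


def flip_one_bit(blob_b64: str, byte_index: int = 0) -> str:
    """Return a copy of a Base64 blob with exactly one bit flipped."""
    raw = bytearray(base64.b64decode(blob_b64))
    raw[byte_index] ^= 0x01  # flip the lowest bit of the chosen byte
    return base64.b64encode(bytes(raw)).decode("ascii")


# Stand-in for a reasoning blob: random bytes, NOT real API output.
fake_blob = base64.b64encode(os.urandom(96)).decode("ascii")

# A one-bit change like this is the kind of tampering that an
# authenticated ciphertext is designed to detect and reject.
tampered = flip_one_bit(fake_blob, byte_index=10)

print(describe(fake_blob))
print("changed:", tampered != fake_blob)
```

If the blob really is an authenticated ciphertext, sending `tampered` back in place of the original is what should trigger the rejection described above, while the size reported by `describe` stays visible to the client no matter what.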
As I mentioned above, any attempt to directly tamper with reasoning/thinking blocks always produces an error from the API endpoint. However, this only applies to tampering. A few experiments reveal that we can replay unmodified older reasoning blocks, with no visible error at all.

Not only can we replay within sessions, this same idea also seems to work across different sessions. It even applies to sessions running in different accounts. That is: when we obtain reasoning blobs from a session running under one OpenAI or Anthropic account, we can replay them against a session in a different account altogether. For OpenAI specifically, we can even replay blobs across different models. (The Claudes got fussy about this.)

At a cryptographic level, this tells us something very simple: the providers are probably using a single global key to encrypt and authenticate all reasoning data sent to the client. This might matter if you’re using the providers’ zero-data retention mode, since it means that everyone’s reasoning data is escrowed under one (not frequently changing) key, rather than protected per-account.

The use of a global key also raises a possible new threat model. If you’re an application that uses an API to expose a “chat” interface to malicious parties, you need to be careful that they can’t inject JSON into your chat stream. If they can, a bad guy might inject their own JSON-formatted reasoning blobs into the conversation. This could cause the model to behave in unpredictable ways. So sanitize your chat inputs!

Of course, just because the LLM providers accept replayed blocks doesn’t mean much. It strongly indicates that decryption was successful, but not that the model actually saw or cogitated over the decrypted data. To use GPT 5.5’s favored language, the replayed blobs may be accepted but not semantically active.

To answer this question, I ran a lot of experiments using Codex.
(So many that at one point Codex literally forced me to stop and visit an OpenAI cyber trusted access website where I had to enter pictures of my driver’s license in order to keep going.) What I learned for my trouble is that the nature of block processing between models is wildly variable. Most of the time, replays of encrypted blocks just get quietly absorbed by the model. But every now and then, the model will output something to demonstrate that it is obviously reading what those blocks contain. For example, here’s GPT 5.5:

So this proves that encrypted blocks are, indeed, semantically active. But it doesn’t actually prove that we can do much with them. And believe me, I tried.

This was mostly a disappointing project. I tried to convince the model to think about really, really sensitive secrets, while also trying to convince another session that it wanted to dump the same data as cooperatively as possible. What I came away with was some evidence that the data was being placed into the encrypted blocks if I asked the model to think about it. But if I also instructed the model to not output the data to the user, it mostly held to that instruction, even when I replayed the blocks to new sessions.

I remain convinced that all kinds of sensitive data can be written in there if you ask the model to think about it, and that there’s a secret incantation that I could try to get the models to produce it. But I’m not able to prove it. Part of the reason I’m writing this post is to scrape it off my plate so someone else can try.

I won’t try to convince you that this is a world-beating security result. In fact, all I’m really showing you is that “stuff I can make the model say in plaintext might also get encrypted.” But if that data can include platform secrets, that might get more interesting. More on that later.

So while replaying reasoning blocks doesn’t seem to give us what we want, this is not the only way to extract secrets.
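The cross-account replay behavior described above is what you would expect if the authentication tag covers only the blob and not the context it was issued in. Here is a small sketch of that difference, using an HMAC from the Python standard library. This is purely illustrative: `PROVIDER_KEY`, `tag_unbound`, `tag_bound` and the account/session identifiers are all made up, the sketch only authenticates (it does not encrypt), and I have no idea how either provider actually builds its blobs.

```python
import hashlib
import hmac
import os

PROVIDER_KEY = os.urandom(32)  # hypothetical server-side secret


def _mac(data: bytes) -> bytes:
    """HMAC-SHA256 under the (hypothetical) provider key."""
    return hmac.new(PROVIDER_KEY, data, hashlib.sha256).digest()


def tag_unbound(blob: bytes) -> bytes:
    """Tag over the blob alone: it verifies in any account or session."""
    return _mac(blob)


def tag_bound(blob: bytes, account_id: str, session_id: str) -> bytes:
    """Tag over the blob plus the account and session it was issued to."""
    # Length-prefix each field so different (account, session) pairs
    # can never serialize to the same byte string.
    context = b"".join(
        len(part).to_bytes(4, "big") + part
        for part in (account_id.encode(), session_id.encode())
    )
    return _mac(context + blob)


def verify_unbound(blob: bytes, tag: bytes) -> bool:
    return hmac.compare_digest(tag_unbound(blob), tag)


def verify_bound(blob: bytes, tag: bytes, account_id: str, session_id: str) -> bool:
    return hmac.compare_digest(tag_bound(blob, account_id, session_id), tag)


blob = b"opaque reasoning state"
loose = tag_unbound(blob)
tight = tag_bound(blob, "account-A", "session-1")

replay_ok_unbound = verify_unbound(blob, loose)  # no context to check
replay_ok_same = verify_bound(blob, tight, "account-A", "session-1")
replay_ok_other = verify_bound(blob, tight, "account-B", "session-9")

print(replay_ok_unbound, replay_ok_same, replay_ok_other)
```

With the unbound tag, a blob lifted from one account verifies anywhere. With the bound tag, the same blob fails verification as soon as the account or session differs. A real design would more likely use an authenticated encryption mode with the context as associated data, but the effect on replay is the same idea.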
A second question is whether we can use metadata related to the reasoning blocks to actually learn things that the model isn’t supposed to tell us.

While we can’t directly read reasoning blocks, we can learn something about them: we can see how long they are. We can also observe related signals like “how many tokens did the model write”. OpenAI even gives us a special field called . If we’re a user consuming chat data without direct access to the API, we might even be able to measure the raw time it takes the model to respond. An obvious question is: given these signals, can we use them as a kind of side channel to extract secret data?

Here’s an example. Imagine that a model’s application prompt (“instructions”) contains a secret, along with strict instructions that it must never tell the user this secret directly. This secret could be a single 0/1 bit, or a byte, or a longer string. We can verify that the model respects these instructions, and won’t output the data visibly, no matter how nicely we ask it. (Note: I’m not a jailbreak expert; maybe this guy will have better luck!)

Now consider the following experiment:

In all cases, the visible output will be the same: the model is not violating instructions. But note that within reasoning blocks the model is allowed to think about the secret bit, since those blocks are hidden. Since computation A is much simpler than computation B, one value of the bit will produce a lot less reasoning than the other. This will appear in various places: the size of the encrypted thinking blocks, the token counts, and even in wall-clock response times.

The trick now is simply to calibrate the system and classify these responses based on whether reasoning blobs were “short” or “long”, which tells us whether the bit was 0 or 1. I put together an absurd test where the model has to compute a long checksum when the bit is 1.
The results look something like this: Of course, an attacker who has access to a chat interface might not have access to the encrypted blob. So they might have to get this data through some other mechanism. You can get a very similar signal just by measuring how long it takes the model to return a response. So the summary here is not so much “encrypted blobs can leak useful information” although sometimes they do. It’s that reasoning itself can be leaky, even when we beg the model not to leak. Simply doing it, in a way that reasons over secret data, can potentially leak useful information to a clever attacker. Once I found this side channel I got really excited. Sure, it’s slow: but maybe we could use it to slowly chisel out the models’ top secret instruction prompts, like the one that says “don’t talk about Goblins.” This would be painful but simple: just ask true/false questions about the first letter, then the second letter, and so on. At this point I had to stop using Codex and Claude Code because they both just plain refused to help me extract confidential information, even after checking my ID and taking a lock of my hair. I was forced to switch to OpenCode using Kimi 2.6, which had no ethical qualms about laying down a trail of destruction for my security research. Unfortunately, most of the destruction was my own. I won’t go into the nightmare of model hallucinations that followed. I’ll just say that I learned a few things: So TL;DR, while I was able to extract application-specific secrets that did exist, I wasn’t able to extract model prompts that don’t. Moreover, I didn’t feel quite ambitious enough to begin pounding on ChatGPT or Claude’s public web interface (where they certainly do.) So for the moment I’m just going to call this a maybe. I think model providers should think hard about this reasoning data, and they should make sure it doesn’t leak things they don’t want it to.
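The calibrate-then-classify step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the token counts are invented, the function names (`calibrate_threshold`, `classify_bit`, `majority_vote`) are hypothetical, and nothing here talks to a real model API. It just shows how a midpoint threshold between "short" and "long" reasoning sizes turns a size measurement into a guess for one bit.

```python
from statistics import mean


def calibrate_threshold(short_samples, long_samples):
    """Return the midpoint between the mean sizes of the two calibration sets."""
    return (mean(short_samples) + mean(long_samples)) / 2


def classify_bit(observed_size, threshold):
    """Guess 1 when the observed reasoning size is above the threshold, else 0."""
    return 1 if observed_size > threshold else 0


def majority_vote(observations, threshold):
    """Classify repeated measurements of the same bit and take the majority."""
    votes = [classify_bit(size, threshold) for size in observations]
    return 1 if sum(votes) * 2 > len(votes) else 0


# Invented calibration data: reasoning sizes measured while the bit is known
# (for example on a test deployment the experimenter controls).
known_bit0 = [120, 140, 135, 150]
known_bit1 = [2100, 1950, 2300, 2050]
threshold = calibrate_threshold(known_bit0, known_bit1)

# Invented measurements for an unknown bit, repeated to average out noise.
guess = majority_vote([1980, 2210, 160], threshold)
```

The same classifier works whether the measured quantity is blob length, a token count, or wall-clock latency; only the calibration samples change. Repeating the measurement and voting is one simple way to cope with the noisy outliers that timing data tends to have.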
I reported both results to OpenAI and Anthropic via their bug bounty programs. OpenAI said my report was unreproducible. I sent them my scripts, but too late. Anthropic quite reasonably told me they don’t see any security implications in side channels or replays, but they might alter their developer documentation to warn application developers to be more careful. I think that’s a fine decision (except for the part about trusting application developers), even if I want to believe there could be more here. Either way: I took those responses as permission to write this post. I still don’t think model providers should write this stuff off entirely. As far as what model providers can do, there’s the easy stuff and the hard stuff. First: both providers should proactively improve their key management . If you think reasoning state is worth encrypting, then properly encrypt it. It should not be replayable across sessions or accounts. While I can’t tell you exactly what bad things might happen, I think you’re better off patching holes before you see the water coming through them. The side channel results aren’t fixed by patches to the encryption protocol. They’re more fundamental to the way models work: if I can convince a model to do secret-dependent reasoning, then there is almost certain to be leakage. If someone figures out how to exploit this for some meaningful purpose, the best I can offer is that models will need to apply policy gates before they even reason about things. Unfortunately, this seems like it might have some real downsides, because “apply policy gate” itself often requires reasoning. This stuff makes me grateful I’m just a cryptographer and I don’t have to think about this sort of problem. Replays . Can we replay encrypted blobs back in the wrong order or even in the wrong session (worse: a whole different account ), and will the model accept them as valid reasoning that it made? Side channels . 
While we can’t see what’s in the encrypted blobs, we can learn some metadata about them. For example: we can see how long they are. These side channels don’t need to involve the cryptography itself: we might also learn how many tokens the model spent making them, or time how long it took to produce them. A malicious user asks the model to reason about the secret bit (or one specific bit of a longer secret.) If the bit is 0, perform simple computation A. If it’s 1, perform extremely complex computation B. While the two computations are both very different, we can ensure that their visible output reveals nothing about the secret. So the model is not revealing its instructions if it follows this request. Neither GPT 5x nor Claude actually has a system prompt when you’re using API mode. But they’re both happy to tell you they have one! Moreover, they will happily invent plausible ones if you really push them to. Kimi 2.6 is also happy to tell you you’re a genius who just invented the Internet each time this happens. Inevitably your experimental results will turn out to have been totally bogus, but at least Kimi will be very disappointed on your behalf. With all that said, Kimi is shockingly good at coding and experiment design, especially given the very attractive pricing. If I was an Anthropic or OpenAI investor, I’d be scared.
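The recommendation that reasoning state should not be replayable across sessions or accounts can be illustrated with a small sketch. It assumes a provider-side master key and uses hypothetical names (`derive_session_key`, `seal`, `open_sealed`); it is not a description of how OpenAI or Anthropic actually handle encrypted reasoning. A real design would more likely use an AEAD cipher with the account and session as associated data. Since the Python standard library has no AEAD, this sketch swaps in an HMAC tag over an already-encrypted blob.

```python
import hashlib
import hmac


def derive_session_key(master_key: bytes, account_id: str, session_id: str) -> bytes:
    """Derive a per-session MAC key so a tag made in one session fails in another."""
    # Assumption: the identifiers never contain a NUL byte, so the join is unambiguous.
    context = f"{account_id}\x00{session_id}".encode()
    return hmac.new(master_key, context, hashlib.sha256).digest()


def seal(master_key: bytes, account_id: str, session_id: str, blob: bytes) -> bytes:
    """Append a session-bound 32-byte tag to an (already encrypted) reasoning blob."""
    key = derive_session_key(master_key, account_id, session_id)
    return blob + hmac.new(key, blob, hashlib.sha256).digest()


def open_sealed(master_key: bytes, account_id: str, session_id: str, sealed: bytes) -> bytes:
    """Return the blob if the tag matches this account and session, else raise."""
    blob, tag = sealed[:-32], sealed[-32:]
    key = derive_session_key(master_key, account_id, session_id)
    expected = hmac.new(key, blob, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("blob was not issued for this account/session")
    return blob
```

A blob sealed for one account and session is rejected when it is presented under any other pair. This sketch does nothing about reordering within a session, and it has no effect on the length and timing side channels, which do not depend on the cryptography.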

Martin Fowler 3 days ago

The VibeSec Reckoning

Vibe coding has significantly accelerated software prototyping but AI agents frequently recommend insecure configurations, creating security problems. Gautam Koul, Lucian Moss, Neil Drew-Lopez, and Daberechi Ruth Edeokoh share their experience while building applications for Thoughtworks's global marketing. They learned that to combat this we need to write a security context file to guide the AI, be cautious with AI permission requests, create a daily security intelligence feed, and provide builders with a secure-by-default harness and templates.

Nicky Reinert 4 days ago

Digital Dilemma: Why Google Accounts Should Be Treated as Critical Infrastructure

Google and Microsoft accounts have become part of digital public life. Why account suspensions are more than just an email problem.

daniel.haxx.se 5 days ago

The pressure

I’m doing Open Source primarily because I love it: the social aspects, the for-the-good angle and the challenge of engineering this to work for everyone. I also do it because it is my full-time job, and putting food on the table and providing for my family is not unimportant. It may come as a shock, but I am not in this game for the money or the extravagant life style. I have been working full-time on curl since 2019. For me, this typically means doing 50 hour work weeks, as I spend all days on it and then I top them off with a few more hours every late night, all days of the week. I spend all this time on curl because it is a work of love and it is both my job and my spare time hobby and no one counts my hours anyway. (And no, I do not recommend that anyone else do the same. I’m not suggesting this for others.) I consider my primary work-related mission in life to be to make curl the best transfer library and tool possible and make it qualify as a top project in Open Source, quality, performance and not least, security. I believe we generally meet these lofty goals. I founded the curl project and I am still a lead developer in the project almost thirty years later. While I always clearly state that curl is not a one-man shop and that curl would absolutely not be what it is without my awesome curl team mates, a large part of the world still thinks of curl as my project and sometimes more or less equates curl with my person. I cannot help but take curl issues personally. When someone critiques curl, it is by extension a complaint about decisions and choices I stand by and behind, and in many cases I made the calls. curl is personal to me. curl has formed my life forever. I have two kids. They were both born many years after I started working on curl and they are both adults and independent individuals now. I love them dearly. Life passes by but curl remains. We’ve had slow times and busy times. The decades pass. Later this year the curl project celebrates thirty years.
We typically repeat that the number of curl installations in the world is perhaps thirty billion. Over the last years I have done numerous blog posts on the state of security reports submitted to curl. They have gradually switched over from complaints on stupid LLMs , to stupid AI slop reports , closing the bug bounty over to the current high quality chaos which for us started maybe at some point in March 2026. We have seen many spectacular security failures through the years, in Internet products, in software infrastructure and in Open Source. Every time we read about those events, we get reminded about how curl is everywhere and how we really really really do not want anything such to happen to us or our users. And we take another lap around the project, tighten every bolt a little more, add a few more checks, tests and guidelines to ideally make the curl ship ever so slightly less likely to ever leak or sink. Recently, after I pointed out that Mythos only found a single low severity problem in curl in its first scan, countless people have repeated the claim that curl is one of the most scrutinized, most reviewed, most fuzzed and most verified source codes you can imagine. Perhaps that’s true, but I just want to mention this: that’s not by mistake. That’s not an accident or a happy circumstance. That’s the result of relentless work and attention to details through decades. Software engineering done right . Iterative improvements over time that simply never ends is an effective method. This does not however mean that we don’t have bugs or that we don’t have security problems left, because we do. We have hundreds of thousands of lines of source code that is doing highly parallel networking for many protocols on all imaginable operating systems and CPU architectures – in C. So we fix the problems, patch them up and ship new releases. Over and over. 
Thirty billion installations world-wide means that everyone reading this blog post has curl installed multiple times in stuff they own. In phones, tablets, cars, TVs, printers, game consoles, kitchen equipment and more. Not to mention all the online digital services we use and those devices communicate with. I cannot stress the importance of curl security enough and I would guess that most of you agree with me. I am jealous of those projects that shipped a horrible bug at some point in the past that made the world burn for a while. They got attention and some of them then got funding and the financial muscle to staff up and hire multiple full-time engineers. I sometimes think we would be better off if we also had one of those. A thirty-year-old project could make you think you’ve seen most things already, but we have not been in this situation before. The rate of incoming security reports is 4-5 times higher than it was in 2024 and double the speed of 2025 – meaning that on average we now get more than one report per day. The quality is way higher than ever before. The reports are typically very detailed and long. In order to manage this incoming flood of submissions, we need to make sure to handle them as soon as possible as we know there are more coming. If we don’t take care of them roughly at the same speed they arrive, the backlog just grows, and having a list of potential security problems that you don’t have control over takes a mental toll. I spend almost all my days right now working through the list of reported security issues that we have on HackerOne: verify the claim, assess the importance, write a patch, figure out when the bug was introduced, understand the vulnerability, write a detailed advisory explaining the problem to the world and communicate all this with the security researcher and the rest of the curl security team. For the first time in my life, my wife voiced concerns about my work hours and my imbalanced work/life situation.
I work more than I’ve done before, but the flood keeps coming. People around me, I guess reading between the lines, have asked how I and we cope with this deluge and want to make sure we don’t burn out in the process. I am concerned for my team mates. I might soon have to reduce my work hours to allow myself more breathing time. This is a never-before seen or experienced pressure on the curl project and its security team members. It is an avalanche of high priority work that trumps all other things in the project. The pressure is primarily mental, because we certainly could ignore the reports if we wanted, but we feel a responsibility, we have a conscience and we are proud of our work. We feel obliged to fix security problems in the software we have helped ship to every device on the globe. This is personal to us. With about half the release cycle left until the pending release ships, we already have twelve confirmed vulnerabilities, meaning twelve pending CVE announcements. That’s a new project record and it also means we will reach thirty published CVEs in 2026 even before half the calendar year has passed. The projected total number of curl CVEs published through the whole year is therefore at least double this number! What help would we like? Short term it is a little late. We already have work up to our ears. I wish more companies that use and depend upon curl or libcurl in commercial software and services would chip in to fund us. We could then pay more developers to share the workload. That would be great. Feel free to contact me to discuss how you can contribute to this. Get your employer to pay for a support contract! Fortunately we have customers who already do this, so some of us can work on curl full time. I am a pragmatist (and a bit of a cynic) and I have danced this dance for a long time already.
I have no illusions that anything significant is going to change in this area even if we are in an unparalleled situation and in a tighter spot than ever before. I totally expect us to ride out this storm by ourselves. Like we are used to. We will survive. We will endure. It might just be a bit of a shaky period in the project and in the world at large as we try to maneuver our way through this. There’s a tsunami coming over us and all we can do is swim, there are no life boats for us. The curl project is not owned by a company. We are not part of any umbrella organization. This makes us a little under-powered at times, but it also gives us maximum freedom and flexibility. We act solely in the interest of making curl as good as possible for the world and curl users. Fixing bugs and problems is good. Every reported problem implies a fixed issue. curl becomes a better product. What is also a good trend: almost no one finds terrible vulnerabilities. All vulnerabilities found the last few years in curl have all been deemed severity LOW or MEDIUM. I’m not saying there won’t be any more HIGH ever, but at least they are rare. The most recent severity high curl CVE was published in October 2023. Right now we are under a little pressure. Forgive us if we are a little slow to respond sometimes. Image by Brian Merrill from Pixabay

iDiallo 5 days ago

Amber Alert with Spam URL?

Well that was weird. I just received an Amber Alert and the link led to a spammy looking website. The link leads to a 3gp file converter which is highly unusual. But the more I look at it, I have the impression it's a mistake. Most likely, they have exceeded the maximum number of characters for the Emergency Service alert. Here is the message: AN AMBER ALERT HAS BEEN ACTIVATED BY THE CALIFORNIA HIGHWAY PATROL. DALEZA FREGOSO WAS LAST SEEN ON MAY 24, 2026 AT 0400 HOURS IN LOS ANGELES COUNTY. THE SUSPECT VEHICLE IS A WHITE 2019 WHITE LAND ROVER DISCOVERY CA 9DAW715. CLICK ON THE LINK FOR ADDITIONAL INFORMATION. https://bit.ly/A0 It seems like the total character count is 288. I'm not sure if the title should be included but if I add and the double space after, then we have 320 characters. Is this the character limit for emergency services? When I clicked on it, it took me to the bitly preview page: And clicking on the button, I'm taken here: Suspicious Link I was starting to wonder if this was even a real Amber alert, and if somehow this was a spam message that was sent through. But unfortunately, it is a real amber alert, as I was able to find the matching alert on missingkids.com . However, I don't see a way to request a correction. I understand that bitly was often used to shorten links, but there should be a way for a service like amber alert to test those links before they are sent. At least on my android, once I click on the link, the alert is dismissed never to be seen again. When the link is incorrect, now we have this problem where we can never get the information back. In this case, I was only able to get the link because I received the alert on both my phones. Also I've learned that Amber alerts have a character limit of 360 characters . So I'm still not sure what went wrong with this one. Update: 39 minutes later (8:25pm) a second message was sent with a correction. Corrected Link Most likely a copy and paste error.
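As a rough way to check the character counts discussed above, the alert body can be measured against a limit in a few lines of Python. The 360-character limit is the figure the post mentions. The constant and function names are made up for illustration, and the text is the message as quoted, joined with single spaces and without the title. Under that assumption the body comes out at the 288 characters the post counts.

```python
ALERT_LIMIT = 360  # character limit the post mentions for Amber alerts

# The alert body as quoted in the post, single-spaced, without the title.
ALERT_TEXT = (
    "AN AMBER ALERT HAS BEEN ACTIVATED BY THE CALIFORNIA HIGHWAY PATROL. "
    "DALEZA FREGOSO WAS LAST SEEN ON MAY 24, 2026 AT 0400 HOURS IN LOS ANGELES COUNTY. "
    "THE SUSPECT VEHICLE IS A WHITE 2019 WHITE LAND ROVER DISCOVERY CA 9DAW715. "
    "CLICK ON THE LINK FOR ADDITIONAL INFORMATION. "
    "https://bit.ly/A0"
)


def remaining_characters(message: str, limit: int = ALERT_LIMIT) -> int:
    """Return how many characters are left before the limit (negative if over)."""
    return limit - len(message)


print(len(ALERT_TEXT), remaining_characters(ALERT_TEXT))
```

With 72 characters to spare under a 360 limit, truncation alone would not explain the short link, which fits the later conclusion that it was a copy and paste error.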

ava's blog 5 days ago

concerning law enforcement exemptions in the draft AI act transparency guidelines

I've finished reading the Draft Guidelines for transparency requirement under the AI Act that are out for comment until the 3rd of June, and a variety of exemptions for law enforcement and similar actors greatly concern me. I haven't seen media pick this up in any meaningful way, but this should be highlighted and discussed. A short explanation upfront: Transparency requirements under Art. 50 AI Act refer to providers (and some deployers) 1 of AI systems intended to interact directly with natural persons needing to make sure that the users are informed about interacting with an AI system, and outputs being marked in a machine-readable format and detectable as artificially generated or manipulated. That covers, for example, AI-enabled voice assistants, chatbots in various settings (even on social media), (humanoid) robots and AI companions, AI avatars, coding agents and agentic AI systems. Depending on the provision, transparency can be done via direct disclosure to users (such as banners, pop-ups, notices, voice announcements or chatbot messages), or machine-readable marking and detectability mechanisms for AI-generated content, sometimes complemented by visible labels or watermarks. Simply stating it in Terms of Service, documentation or else, or having a non-visible watermark, is not enough to inform users. This needs to happen at the very first interaction as latest point. Obviousness-exceptions apply. Throughout the document, law enforcement and related actors get several exemptions, starting with 3.2.2 Exception for AI systems authorised by law for law enforcement purposes , points 43-46, page 15, emphasis mine: Providers of interactive AI systems are exempted from the transparency obligation under Article 50(1) AI Act, if they are authorised by law to detect, prevent, investigate or prosecute criminal offences, subject to appropriate safeguards for the rights and freedoms of third parties. [...] 
To fall within this exception, the purpose of the AI system must be to detect, prevent, investigate or prosecute criminal offences (e.g. AI-undercover agent ). The exception is not restricted to the use of such AI systems only by law enforcement authorities as defined in Article 3(48) AI Act, but may also cover interactive (or generative) AI systems used by other EU or national public authorities or even private actors, such as security companies or financial institutions , so long as their use is authorised by law to detect, prevent, investigate or prosecute criminal offences and subject to appropriate safeguards to protect the rights and freedoms of third parties. Or point 87 , 4.3. Exceptions to the obligations under Article 50(2) AI Act, page 23, about labeling and detection: Finally, if a generative AI system is authorised by law to generate or manipulate synthetic content to detect, prevent, investigate or prosecute criminal offences, it will be exempted from the marking and detection requirements under Article 50(2) AI Act. Or point 103 , 5.2. Out of scope, page 26, for emotion recognition and biometric systems: The obligation does not apply to emotion recognition systems and biometric categorisation systems that are permitted by law to detect, prevent or investigate criminal offences subject to appropriate safeguards for the rights and freedoms of third parties and in accordance with Union law. Or point 117 , 6.1.4. Exception for law enforcement, page 31: If the use of a deep fake is authorised by law to detect, prevent, investigate or prosecute criminal offences, deployers are fully exempted from the transparency obligation under Article 50(4) AI Act. The way it looks right now, AI systems used by law enforcement (and related actors, like security companies!) 
to detect, prevent or investigate crime will be exempt from several core Article 50 transparency obligations, meaning any labeling, marking or disclosure that you are interacting with AI or that you are seeing deepfake content when it is used against you. As it stands, this enables the use of AI chatbots posing as real people against investigation targets without having to tell them, and permits the use of synthetic or deepfake-style content towards targets without having to label it as such. The only exception: The bot is available to the general public and offers functionalities for people to report crimes (meaning: a police chatbot recording your complaint, virtual assistants for witness statement collection, or an AI fraud reporting hotline, for example). Obviously, officers posing as ordinary citizens, lying during proceedings and the entire concept of V-men, etc. is nothing new. However, I am deeply uncomfortable with a future in which LE and specific private actors just get a pass to deceive people with extremely convincing automated tech making this process easier and scalable, and them having a path to create fake audiovisual material under the guise of "preventing crime", which is a rather vague and difficult to limit reason. Too much can be justified as being done for crime prevention, and it mostly hits people who are innocent or not convicted of a crime (yet), and also affects their friends and family members. With the opening clause about law authorizations, member states could create authorizations allowing banks, fraud-monitoring firms, telecom providers, or platform operators to deploy undisclosed AI interactions or unlabeled synthetic content in quasi-law-enforcement settings just under the guise of detecting, preventing or investigating crime. The line between criminal investigation, compliance monitoring and fraud prevention is being blurred in a way that heavily puts us at a disadvantage.
While the guidelines say that the authorization law must specify purposes, circumstances, and safeguards and respect the rights of third parties, there is not yet a definition of any minimum substantive safeguards, nor do they require independent judicial authorization every time. Most often, rights of third parties means things like copyright. The mentioned exemptions, in my view, aid the creation of an environment of distrust online that the transparency requirement otherwise seeks to prevent. They circumvent safeguards against deception, impersonation and manipulation for the most powerful coercive actors we have! We require transparency because of risks to democratic processes and societal trust, but the exemptions remove those safeguards exactly where we are least able to contest or verify what is happening. It will become harder for defendants, journalists, oversight bodies, and other investigators to determine whether evidence, communications, or media were AI-generated or manipulated when LE AI meddled in it while unmarked and undisclosed. If a conversational AI used in an investigation hallucinates, misleads, escalates emotional pressure, or incorrectly infers intent, then that will negatively and unfairly affect the outcome of the investigation. At minimum, people should not unknowingly interact with highly persuasive synthetic systems capable of impersonation and emotional manipulation by (quasi-)policing actors. They deserve not having to constantly ask themselves whether something or someone they are interacting with is real, and possibly has LE manipulation behind it. The scale of deception the tech enables is intense, down to covert persuasion, emotional manipulation, or inducement, and we shouldn't just let cops (and wannabe-cops) have that unchallenged, with barely any oversight or limits. I understand that for certain targets, transparency would ruin an investigation (child exploitation investigations, counterterrorism infiltration, etc.)
but I expect this could increase risks of entrapment and manipulative practices, and an increase of chilling effects online as people adjust their behavior accordingly. This should not be adopted like that without a lot of work addressing these issues and limiting the exceptions to specific cases. Published 25 May, 2026 Providers are natural or legal persons, public authorities, agencies or other bodies that develop AI systems, or have them developed, and place them on the Union market (ex: OpenAI). Deployers are natural or legal persons, public authorities, agencies or other bodies using AI systems under their authority (ex.: universities that supply AI models to their students). ↩


Netherlands Seizes 800 Servers, Arrests 2 for Aiding Cyberattacks

Authorities in the Netherlands have arrested the co-owners of two related Internet hosting companies for operating IT infrastructure used by Russia to carry out cyberattacks, influence operations and disinformation campaigns inside the European Union. The two men were the focus of a 2025 KrebsOnSecurity story about how their hosting companies had assumed control over the technical infrastructure of Stark Industries Solutions , an Internet service provider sanctioned last year by the EU as a frequent staging ground for cyber mischief from Russia’s intelligence agencies. An investigator with the Tax Intelligence and Investigation Service (FIOD), the Dutch financial crimes agency, during the raid. Image: FIOD. The Dutch daily news outlet de Volkskrant reports that the Dutch financial crime agency FIOD on May 18 arrested a 57-year-old from Amsterdam and a 39-year-old from The Hague, charging them with violating sanctions law by directly or indirectly making economic resources available to EU-sanctioned entities. The Dutch investigation focuses on Stark Industries, a sprawling hosting provider that materialized just two weeks before Russia invaded Ukraine. As detailed in this May 2024 deep-dive , Stark quickly became the source of massive distributed denial-of-service (DDoS) attacks against European targets, and emerged as a top supplier of proxy and anonymity services that showed up time and again in cyberattacks linked to Russia-backed hacking groups. That report identified two Moldovan brothers — Ivan and Yuri Neculiti and their company PQHosting — who were providing one of Stark’s two main conduits to the larger Internet. In May 2025, the EU sanctioned PQHosting and the Neculiti brothers for aiding Russia’s hybrid warfare efforts. But as KrebsOnSecurity observed in September 2025 , those sanctions failed to target Stark’s remaining connection to the Internet — an Internet service provider based in the Netherlands called MIRhosting . 
MIRhosting is operated by Andrey Nesterenko, a 39-year-old Russian native who runs the business out of the Netherlands. News that PQHosting and the Neculiti brothers were about to be sanctioned by the EU leaked in the media nearly two weeks before the sanctions were announced last year. During that time, the Stark network assets were transferred from PQHosting to a new entity called the[.]hosting, under the control of the Dutch entity WorkTitans BV. And as our September 2025 report showed, WorkTitans was controlled by Nesterenko and a 57-year-old from Amsterdam named Youssef Zinad. On top of that, WorkTitans was getting connectivity to the larger Internet solely through MIRhosting, where Zinad had worked previously. On May 18, Dutch financial crime investigators arrested Nesterenko and Zinad, and searched three businesses in Enschede and Almere and two data centers in Dronten and Schiphol-Rijk. A statement from the Dutch authorities said they also seized laptops, telephones and more than 800 servers. A message to the-hosting customers immediately after 800 of its servers were seized by Dutch authorities. The message says that unfortunately data stored on the server has been lost and cannot be recovered. De Volkskrant said it reviewed data showing WorkTitans and MIRhosting were the most-used networks in pro-Russian attacks on Danish government bodies between November 13 and 19, 2025, the week of Denmark’s municipal elections. The publication wrote that prior to Nesterenko’s arrest, the MIRhosting founder denied that he knew his servers had been misused by pro-Russian cybercriminals. “He said he had ended all services with the Neculiti brothers when the EU sanctions came into force in May 2025,” and that he “reserved all rights to take action against ‘harmful and incorrect publications,’” de Volkskrant wrote.
MIRhosting released a statement saying it has initiated an internal investigation into the alleged facts concerning the elections in Denmark, and that it has temporarily paused services to WorkTitans as a precautionary measure while the matter is being reviewed further. “Based on our preliminary findings, there are no indications that the services over which we exercise control were actually used to influence the Danish elections,” the statement reads. “No anomalies or spikes were observed in our network traffic during the period mentioned in the publication; had large-scale DDoS attacks occurred, such activity would have been evident. Furthermore, prior to the media publication, we had not received any complaints, abuse reports, or official requests regarding suspicious activities or misuse of our network. Meanwhile, our regular operational activities continue, and our service to our other clients remains fully intact.” Born in Nizhny Novgorod, Russia, Mr. Nesterenko grew up as a piano prodigy who performed publicly at a young age. In 2004, Nesterenko founded MIRhosting’s parent Innovation IT Solutions Corp. , which has the notable distinction of being the company responsible for hosting stopgeorgia[.]ru, a hacktivist website for organizing cyberattacks against Georgia that appeared at the same time Russian forces invaded the former Soviet nation in 2008. That conflict was thought to be the first war ever fought in which a notable cyberattack and an actual military engagement happened simultaneously. Responding to questions shared via email, Nesterenko said MIRhosting does not support cybercrime, sanctions evasion, or illegal activity, and that the allegations and arrest by Dutch authorities have been extremely harmful to him and his company. “The transition to the.hosting was not intended to evade sanctions,” Nesterenko wrote. “The hardware and customer portfolio had already been transferred to WorkTitans before the sanctions appeared. 
Closing or damaging a legitimate Dutch infrastructure company will not stop cybercrime, but it will harm many people who have done nothing wrong.” Far less is public about the 57-year-old Zinad, who reportedly has been keeping a low profile since our story last year. De Volkskrant reported that Zinad blocked access to his LinkedIn account, had gone months without responding to emails, WhatsApp messages and phone calls, and told a colleague that illness was forcing him to lead a somewhat more reclusive life. Mr. Zinad’s now-defunct LinkedIn profile. It was full of posts for MIRhosting’s services. Mr. Nesterenko claims Zinad was never an employee of MIRhosting. “He helped me and MIRhosting with certain business tasks under a normal business-to-business arrangement between companies,” Nesterenko explained. However, in previous emails to KrebsOnSecurity, Nesterenko carbon copied Mr. Zinad (who had a @mirhosting.com email), explaining that he was part of the company’s legal team. Also, the Dutch website stagemarkt[.]nl lists Youssef Zinad as an official contact for MIRhosting’s offices in Almere. Mr. Zinad has never responded to requests for comment. Nor did de Volkskrant have any luck tracking him down. The publication said it repeatedly asked Mr. Zinad (referred to here as simply “Z”), but he reportedly avoided every form of contact. “‘I am unavailable but will respond to your message as soon as possible,’ reads an automated reply on WhatsApp on 2 October 2025,” de Volkskrant reported. “It is the only response de Volkskrant would receive in months. He did not pick up his phone and did not call back. When an acquaintance asked him via LinkedIn to contact the reporter, he blocked access to his LinkedIn page. At an address in Almere where Z.’s personal limited company is registered, no one was present in April. The corner house’s blinds were drawn, and a pile of rubbish bags lay outside next to a container, as if someone had recently left. 
A neighbour said he knew the man but did not know where he was staying. Z. was later arrested at a residence in Amsterdam.”


How my minimal, memory-safe Go rsync steers clear of vulnerabilities

Back in January 2025, several security researchers published a total of 6 security vulnerabilities in rsync, some of which allow arbitrary code execution and file leaks, so naturally I was wondering whether/how my gokrazy/rsync implementation was affected. Did implementing my own (compatible, but minimal) rsync in Go, a modern and memory-safe programming language, really rule out entire classes of security vulnerabilities? This deep dive article has been in the making since January 2025, but was delayed because we uncovered more unpublished vulnerabilities in the process! The “Security Vulnerabilities” section now covers all 12 vulnerabilities from the January 2025 batch and the May 2026 batch. If you are running (upstream, samba) rsync in production, upgrade to version 3.4.3 or newer. If you are running gokrazy/rsync in production, upgrade to version v0.3.3 or newer. Feel free to skip over the nitty-gritty security issue details and jump directly to: For context, I blogged about rsync, how I use it, and how it works back in June 2022. See also all posts tagged “rsync”. The original motivation for writing my own rsync (back then only a server, today all directions are supported) was to provide the software packages of distri, my Linux distribution research project for fast package management, which I wanted to host on router7, my small home Linux+Go internet router, which in turn is built on gokrazy, my Go appliance platform. I am still running multiple gokrazy/rsync servers for this original purpose, and also many others! Having rsync available as a primitive (that you can link into your Go programs!) is really nice. This article covers the following security vulnerabilities: The first batch of the vulnerabilities above was announced on the oss-security mailing list, but note that the original report has more detail compared to the oss-security summaries! The later vulnerabilities were announced via GitHub Security Advisories on the rsync project.
When the checksums are read by the daemon, two different checksums are read: Most importantly, note that field is filled with bytes. always has a size of 16: rsync.h is an attacker-controlled value and can have a value up to bytes, as the next snippet shows: The problem here is that can be larger than 16 bytes, depending on the digest support the binary was compiled with: md-defines.h support is common and sets the value to 64. As a result, an attacker can write up to 48 bytes past the buffer limit. Upstream fix: The upstream fix for CVE-2024-12084 changes the field to a dynamically allocated field, which is allocated with length, and fixes the bounds check to check against the (checksum length for this transfer’s algorithm). Can Go help prevent this? Yes: Missing or incorrect bounds checks will not result in a heap buffer overflow in Go! Instead, attempting to write out of bounds will result in a panic because the Go runtime performs bounds checks. How does gokrazy/rsync fare? gokrazy/rsync also had insufficient validation! Our issue was different, though: It wasn’t size confusion, we just were not doing any validation of the sum header at all. Oops! We can confirm that the Go runtime’s bounds check triggers on an attempt to write out of bounds by changing the code like so and running the tests: As expected, the Go runtime panics with the following message: Of course, crashing the entire server is not the best failure mode, so I added the missing bounds checking to turn the panic into an error. Because of the same lack of validation as in the previous CVE-2024-12084 vulnerability, an attacker could select a checksum algorithm with short checksums (e.g. with 8-byte checksums), but then claim they were sending longer checksums (e.g. 9 bytes), making the victim leak one byte of uninitialized stack content in the response.
Leaking one byte of stack content may seem benign, but as the Google Security report puts it: The first pair of vulnerabilities are a Heap Buffer Overflow and an Info Leak. When combined, they allow a client to execute arbitrary code on the machine a Rsync server is running on. The client only requires anonymous read-access to the server. The daemon matches checksums of chunks the client sent to the server against the local file contents in . Part of the function prologue is to allocate a buffer on the stack of bytes: The daemon then iterates over the checksums the client sent and generates a digest for each of the chunks and compares them to the remote digest: Notably, the number of bytes that are compared again are bytes. In this case, the comparison does not go out of bounds since can be a maximum of . However, the local buffer, not to be confused with the attacker-controlled , is a buffer on the stack that is not cleared and thus contains uninitialized stack contents. A malicious client can send a (known) checksum for a given chunk of a file, which leads to the daemon writing 8 bytes to the stack buffer . The attacker can then set to 9 bytes. The result of such a setup would be that the first 8 bytes match and an attacker-controlled 9th byte is compared with an unknown value of uninitialized stack data. An attacker can divide a file into 255 chunks and as a result leak one byte per file download. An attacker can incrementally repeat the process, either in the same connection or by resetting the connection. As a result, they can leak bytes of uninitialized stack data, which can contain pointers to Heap objects, Stack cookies, local variables and pointers to global variables and return pointers. With those pointers they can defeat ASLR. Upstream fix: There are two relevant upstream fixes: Can Go help prevent this? Yes: By design, Go initializes all variables to the zero value. Go programmers do not need to remember to explicitly initialize variables. 
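To make the two Go properties discussed so far concrete (explicit length validation instead of a panic, and zero-initialized buffers), here is a minimal sketch. It is not code from gokrazy/rsync or upstream rsync: `matchChunk`, `forgeChecksum`, `digestLen` and `shortLen` are hypothetical names, and MD5 truncated to 8 bytes merely stands in for a negotiated short checksum.

```go
package main

import (
	"bytes"
	"crypto/md5"
	"errors"
	"fmt"
)

const (
	digestLen = md5.Size // 16 bytes: size of the fixed digest buffer
	shortLen  = 8        // hypothetical negotiated checksum length
)

var errBadLength = errors.New("claimed checksum length out of range")

// matchChunk compares the first claimedLen bytes of the remote checksum
// against the local digest of chunk. claimedLen is attacker-controlled.
func matchChunk(chunk, remote []byte, claimedLen int) (bool, error) {
	// Explicit validation: return an error instead of letting the
	// slice expressions below panic on an out-of-range length.
	if claimedLen < 0 || claimedLen > digestLen || claimedLen > len(remote) {
		return false, errBadLength
	}
	var local [digestLen]byte // Go zero-initializes this array
	sum := md5.Sum(chunk)
	copy(local[:shortLen], sum[:shortLen]) // only 8 bytes are written
	return bytes.Equal(local[:claimedLen], remote[:claimedLen]), nil
}

// forgeChecksum builds what a malicious peer would send: the 8 known
// checksum bytes followed by a guess for the 9th byte.
func forgeChecksum(chunk []byte, guess byte) []byte {
	sum := md5.Sum(chunk)
	return append(append([]byte{}, sum[:shortLen]...), guess)
}

func main() {
	chunk := []byte("hello")
	for _, guess := range []byte{0x00, 0x41} {
		ok, err := matchChunk(chunk, forgeChecksum(chunk, guess), shortLen+1)
		fmt.Printf("guess %#02x: match=%v err=%v\n", guess, ok, err)
	}
	_, err := matchChunk(chunk, make([]byte, 32), 17)
	fmt.Println("length 17:", err)
}
```

In this sketch the 9th byte is always compared against zero, so a match only tells the peer that it guessed zero, not anything about the process memory. An out-of-range length is rejected with an error rather than crashing the program.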
How does gokrazy/rsync fare? gokrazy/rsync is not affected by this vulnerability: Variables are always initialized in Go. Additionally, selecting checksums other than MD4 was only introduced in protocol version 30 (gokrazy/rsync implements protocol version 27). Description: (quoting the Google Security report ) When the syncing of symbolic links is enabled, either through the or ( ) flags, a malicious server can make the client write arbitrary files outside of the destination directory. A malicious server can send the client a file list such as: Symbolic links, by default, can be absolute or contain characters such as . In practice, the client validates the file list and when it sees the entry, it will look for a directory called , otherwise it will error out. If the server sends as [both, a directory and a symbolic link], [the client] will only keep the directory entry, thus the attack requires some more details to work. In mode, which the server can enable for the client, the server sends the client multiple file lists. The deduplication of the entries happens on a per-file-list basis. As a result, a malicious server can send a client multiple file lists, where: As a result, the directory is created first and is considered a valid entry in the file list. Then, the attacker changes the type of to a symbolic link. When the server then instructs the client to create the file, it will follow the symbolic link and thus files can be created outside of the destination directory. Can Go help prevent this? No. This vulnerability is caused by a logic error: when multiple file lists are used, the merged file list needs to be re-verified. But see Defense in depth: Go’s Upstream fix: The upstream fix for CVE-2024-12087 adds the missing validation. How does gokrazy/rsync fare? gokrazy/rsync is not affected by this vulnerability: gokrazy/rsync does not implement the incremental recursion mode ( ). The trade-off here is implementation complexity vs. 
resource usage: the incremental recursion mode allows working with the file set in a “windowed” way, as opposed to having to scan the entire file set before any transfer can begin. See also my How does rsync work? blog post. Description: (quoting the Google Security report ) The CLI flag makes the client validate any symbolic links it receives from the server. The desired behavior is that symbolic links target can only be 1) relative to the destination directory and 2) never point outside of the destination directory. The function is responsible for validating these symbolic links. The function calculates the traversal depth of a symbolic link target, relative to its position within the destination directory. As an example, the following symbolic link is considered unsafe: As it points outside the destination directory. On the other hand, the following symbolic link is considered safe as it still points within the destination directory: This function can be bypassed as it does not consider if the destination of a symbolic link contains other symbolic links in the path. For example, take the following two symbolic links: In this case, foo would actually point outside the destination directory. However, the function assumes that is a directory and that the symbolic link is safe. Upstream fix: The upstream fix for CVE-2024-12088 makes stricter by not allowing anywhere within the path, except at the very beginning. Can Go help prevent this? No. This vulnerability is caused by a logic error: the validation function was incorrect. We could have implemented that same bug. But see Defense in depth: Go’s How does gokrazy/rsync fare? gokrazy/rsync is not vulnerable: The feature is not yet implemented in gokrazy/rsync. The rsync receiver (in client mode) did not sanitize file names provided by the rsync sender, or otherwise prevent opening files outside the destination tree. 
A malicious sender could instruct a receiver to compare checksums of arbitrary files outside the destination tree. By observing the receiver’s reaction to a provided one-byte checksum, a malicious sender can leak arbitrary files. When a client connects to a malicious server the server is able to leak the contents of an arbitrary file on the client’s machine. In the client will read type as well as the from the server if the server sets the appropriate flags. The flag will not be set for the client. The caller ( ) then uses the server provided values to determine a file to compare the incoming data with. In the contents of the file specified by are copied into the destination file. This can be achieved by the server sending a negative token. The server sends a checksum to compare. If they don’t match, a 0 is returned. When the return value is 0 the receiver will then send a to the generator. The generator will then write a message to the server. The server can use this as a signal to determine if the checksum they sent was correct. By starting off with a of 1 a malicious server is able to determine the contents of the target file byte by byte. Upstream fix: The upstream fix for CVE-2024-12086 prevents opening files outside the destination tree by verifying the sender-provided path. Can Go help prevent this? Yes, Go offers an API to prevent this, see Defense in depth: Go’s . How does gokrazy/rsync fare? gokrazy/rsync is not vulnerable: the fuzzy matching feature was introduced with rsync protocol version 29, but gokrazy/rsync implements protocol version 27. Description: (quoting the Red Hat Security Advisory ) A flaw was found in rsync. This vulnerability arises from a race condition during rsync’s handling of symbolic links. Rsync’s default behavior when encountering symbolic links is to skip them. If an attacker replaced a regular file with a symbolic link at the right time, it was possible to bypass the default behavior and traverse symbolic links. 
Depending on the privileges of the rsync process, an attacker could leak sensitive information, potentially leading to privilege escalation. Upstream fix: The upstream fix for CVE-2024-12747 changes calls in the rsync sender to use the option. The paths are not expected to be symlinks at that point in the algorithm (symlinks would be handled with ). Can Go help prevent this? Yes, Go offers an API to prevent this, see Defense in depth: Go’s . How does gokrazy/rsync fare? gokrazy/rsync was vulnerable before commit , which introduces the same mitigation that upstream rsync uses. To reproduce the issue, use the following steps: Check out gokrazy/rsync v0.2.7: Patch the code as follows to undo the fix and execute the attack: Running the test now shows that the server traversed the symlink: A surprising discovery When I shared a draft of this article with Damien Neil, member of the Go Security Team and the author of the traversal-resistant API , he pointed out: I believe the gokrazy fix for CVE-2024-12747 is insufficient. You’re calling with , but only prevents symlink traversal in the last path component. This is probably still vulnerable to replacing an earlier path component so can be redirected by symlinking to . We reported this to the rsync security contact address in April 2025. In December 2025 I learned that someone else had also independently discovered and reported this issue. Ultimately, this resulted in CVE-2026-29518, published on 2026-05-20. Description: (quoting the rsync 3.4.3 NEWS entry ) TOCTOU symlink race condition allowing local privilege escalation in daemon mode without chroot. An rsync daemon configured with is exposed to a time-of-check / time-of-use race on parent path components. A local attacker with write access to a module can replace a parent directory component with a symlink between the receiver’s check and its open(), redirecting reads (basis-file disclosure) and writes (file overwrite) outside the module. 
Under elevated daemon privilege this allows privilege escalation. Default is not exposed. Reach: local attacker on the daemon host, write access to a module path, daemon configured with . Upstream fix: The upstream fix for CVE-2026-29518 uses , which is similar to Go’s API. Can Go help prevent this? Yes, Go offers an API to prevent this, see Defense in depth: Go’s . How does gokrazy/rsync fare? gokrazy/rsync was vulnerable until I switched the sender and the receiver to the traversal-resistant API . Description: (quoting the GitHub Security Advisory ) Description: The receiver’s compressed-token decoder accumulated a 32-bit signed counter without overflow checking. A malicious sender can trigger an overflow that, with careful manipulation, leaks process memory contents to the attacker – environment variables, passwords, heap and library pointers – significantly weakening ASLR and facilitating further exploitation. Reach: authenticated daemon connection with compression enabled (the default for protocols >= 30 when both peers advertise it). Disabling compression on the daemon (“refuse options = compress” in rsyncd.conf) is the available workaround. Upstream fix: The upstream fix for CVE-2026-43618 introduces the missing checks. How does gokrazy/rsync fare? gokrazy/rsync is not vulnerable because it does not implement compression. See gokrazy/rsync issue #35 for details on why compression support sounds simple, but is non-trivial. Description: (quoting the GitHub Security Advisory ) The 2025 fix that added a guard in was not applied to the visually-identical block in . A malicious rsync server can drive any connecting client into a deterministic by setting in the compatibility flags, sending a flist whose first sorted entry is not a leading “.” directory (which causes to set ), then sending a transfer record with and a non- iflag word. The receiver reads and dereferences the result. 
On glibc x86-64 the dereferenced pointer is mmap chunk metadata that lands at an unmapped address, hence a clean ; non-glibc allocators have not been audited. Reach: any rsync client doing a normal pull from an attacker-controlled URL. Works for both rsync:// URLs and remote-shell pulls. is the protocol-30+ default; no special options are required on the victim. Workaround: on the client. Upstream fix: The upstream fix for CVE-2026-43620 adds the guard to as well. How does gokrazy/rsync fare? Just like for CVE-2024-12087 , gokrazy/rsync is not affected by this vulnerability: gokrazy/rsync does not implement the incremental recursion mode ( ). Description: (quoting the GitHub Security Advisory ) Description: Earlier fixes for symlink races on the receiver’s open() call (CVE-2026-29518) missed the same race class on every other path-based system call: chmod, lchown, utimes, rename, unlink, mkdir, symlink, mknod, link, rmdir, lstat. On rsync daemons with “use chroot = no” a local attacker with filesystem access on the daemon host can swap a symlink into a parent directory component between the receiver’s check and one of these syscalls, redirecting it outside the exported module. The fix routes each affected path-based syscall through a parent dirfd opened under RESOLVE_BENEATH-equivalent kernel-enforced confinement (openat2 on Linux 5.6+, O_RESOLVE_BENEATH on FreeBSD 13+ and macOS 15+, per-component O_NOFOLLOW walk elsewhere). Default “use chroot = yes” is not exposed. Reach: local attacker on the daemon host, write access to a module path, daemon configured with use chroot = no. Upstream fix: The upstream fix for CVE-2026-43619 uses the family of syscalls, just like Go’s . Can Go help prevent this? Yes, Go offers an API to prevent this, see Defense in depth: Go’s . How does gokrazy/rsync fare? gokrazy/rsync is not affected, because it uses Go’s API throughout. 
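As an illustration of what traversal-resistant file access looks like in Go, here is a minimal sketch built on `os.Root` (available from Go 1.24). It is not taken from gokrazy/rsync; `readUntrusted` and `runRootDemo` are hypothetical names, and the directory layout is made up for the demo.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// readUntrusted reads name (an untrusted, network-supplied file name)
// relative to dir. os.Root refuses to resolve paths that leave dir,
// whether through ".." components or through symlinks.
func readUntrusted(dir, name string) ([]byte, error) {
	root, err := os.OpenRoot(dir)
	if err != nil {
		return nil, err
	}
	defer root.Close()
	f, err := root.Open(name)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	return io.ReadAll(f)
}

// runRootDemo builds a module directory with a secret file next to it
// and reports whether both escape attempts were rejected.
func runRootDemo() (content string, blocked bool, err error) {
	tmp, err := os.MkdirTemp("", "rootdemo")
	if err != nil {
		return "", false, err
	}
	defer os.RemoveAll(tmp)
	module := filepath.Join(tmp, "module")
	if err := os.Mkdir(module, 0o755); err != nil {
		return "", false, err
	}
	if err := os.WriteFile(filepath.Join(tmp, "secret.txt"), []byte("secret"), 0o600); err != nil {
		return "", false, err
	}
	if err := os.WriteFile(filepath.Join(module, "hello.txt"), []byte("hello"), 0o644); err != nil {
		return "", false, err
	}
	// A symlink inside the module that points at the parent directory.
	if err := os.Symlink("..", filepath.Join(module, "link")); err != nil {
		return "", false, err
	}
	b, err := readUntrusted(module, "hello.txt")
	if err != nil {
		return "", false, err
	}
	_, dotDotErr := readUntrusted(module, "../secret.txt")
	_, symlinkErr := readUntrusted(module, "link/secret.txt")
	return string(b), dotDotErr != nil && symlinkErr != nil, nil
}

func main() {
	content, blocked, err := runRootDemo()
	fmt.Printf("content=%q escapes blocked=%v err=%v\n", content, blocked, err)
}
```

The point of the design is that the check and the open are one operation relative to the root directory handle, so there is no window between validating a path and using it.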
Description: (quoting the GitHub Security Advisory ) On an rsync daemon configured with the global rsyncd.conf setting, the reverse-DNS lookup of the connecting client was performed after the daemon had chrooted into . If did not contain the files glibc needs for resolution ( , , , NSS service modules), the lookup failed and the connecting hostname was set to “UNKNOWN”. Hostname-based deny rules (“hosts deny = *.evil.example”) therefore could not match, and an attacker controlling their PTR record could connect from a hostname the administrator had intended to deny. IP-based ACLs are unaffected. The per-module setting is unrelated to this issue. Reach: rsync daemon configured with AND hostname-based ACLs AND does not include the libc resolver fixtures. Upstream fix: The upstream fix for CVE-2026-43617 moves the DNS lookup to an earlier point in the protocol. How does gokrazy/rsync fare? gokrazy/rsync is not vulnerable because we only implement IP-based allow/deny lists, not hostname-based allow/deny lists. Description: (quoting the GitHub Security Advisory ) The rsync client’s HTTP proxy support contains an off-by-one out-of-bounds stack write in ( ). After issuing the request, rsync reads the proxy’s first response line one byte at a time into a 1024-byte stack buffer with the bound , so the loop only ever writes . If the proxy (or a man-in-the-middle in front of it) returns 1023+ bytes on the first response line without a terminator, the loop exits with — a slot the loop never wrote, so holds stale stack bytes left there by the earlier that formatted the outgoing request. The post-loop code then does: The lands one byte past the end of the on-stack , corrupting whatever lives in the adjacent stack slot. AddressSanitizer reports at in the frame. Upstream fix: The upstream fix for CVE-2026-45232 validates the attacker-supplied data. How does gokrazy/rsync fare? gokrazy/rsync does not implement such proxy support, so it is not vulnerable. 
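For comparison, here is a hedged sketch of how reading a bounded response line might look in Go. gokrazy/rsync has no proxy support, so this is purely illustrative: `readResponseLine`, `readFromString` and `errLineTooLong` are hypothetical names. The buffer is a slice that grows with `append`, so there is no separate index that can point one slot past the written data.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

var errLineTooLong = errors.New("response line exceeds limit")

// readResponseLine reads one line byte by byte, accepting at most limit
// bytes before the terminating newline. A missing terminator within the
// limit is an error; it never results in an out-of-bounds write.
func readResponseLine(r io.ByteReader, limit int) (string, error) {
	if limit <= 0 {
		return "", errLineTooLong
	}
	buf := make([]byte, 0, limit)
	for len(buf) < limit {
		b, err := r.ReadByte()
		if err != nil {
			return "", err
		}
		if b == '\n' {
			return strings.TrimSuffix(string(buf), "\r"), nil
		}
		buf = append(buf, b)
	}
	return "", errLineTooLong
}

// readFromString is a small helper for trying the function on a string.
func readFromString(input string, limit int) (string, error) {
	return readResponseLine(strings.NewReader(input), limit)
}

func main() {
	line, err := readFromString("HTTP/1.0 200 OK\r\n", 1024)
	fmt.Printf("line=%q err=%v\n", line, err)
	_, err = readFromString(strings.Repeat("A", 2000), 1024)
	fmt.Println("unterminated:", err)
}
```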
Let’s summarize how Go fares: Aside from being written in Go, another key difference between gokrazy/rsync and the official upstream rsync is that the gokrazy implementation is minimal : Let’s have a look at whether gokrazy/rsync was affected by each CVE at the time of publishing: To be clear: all known vulnerabilities are fixed in gokrazy/rsync! The table above documents what the state was at the time when each CVE was published. In other words: When the January 2025 vulnerabilities were published, gokrazy/rsync panicked (CVE-2024-12084) and was vulnerable to a TOCTOU race (CVE-2024-12747). In the process of fixing the TOCTOU issue, we discovered CVE-2026-29518, which was fixed in gokrazy/rsync before the CVE was published. CVE-2026-43619 was discovered even later, but was also already fixed in gokrazy/rsync with the same fix: using Go’s everywhere. As I was reading the vulnerability reports, I noticed that the reports were slightly misleading by their choice of words: most reports just spoke of “server” and “client”. However, in an rsync transfer, both sides, the rsync client and the rsync server can assume either role: sender (upload files) or receiver (download files)! Some setups come with further restrictions that make certain attacks harder or impossible to pull off. For example, when running in daemon mode, file system access can be restricted to the pre-configured module paths (but not in command mode!). Here is a diagram to give you an overview of the 4 different setups and role/protocol layering: In the context of our vulnerability reports, I would say that the Arbitrary File Leak vulnerability (CVE-2024-12086)’s original title “Server leaks arbitrary client files” can easily be misunderstood. Instead, I would say: The rsync receiver will leak arbitrary files to a malicious sender . I have verified that a malicious client sender can make an unpatched remote rsync open files outside the destination tree (e.g. 
the system password database) when running in command mode, for example over SSH. (But, when running in daemon mode, the server enables additional path sanitization, which prevents this attack.) Similarly, the Symlink Path Traversal vulnerability (CVE-2024-12087) speaks about a “malicious server”, but again, it should be “malicious sender”, which can be either the client or the server. The OpenBSD project is known for its security focus, so how does openrsync compare? openrsync is not affected by the Heap Buffer Overflow (CVE-2024-12084) and Stack Info Leak (CVE-2024-12085) vulnerabilities because it validates the checksum length and only supports one checksum size/algorithm (MD4). openrsync is not affected by CVE-2024-12086, CVE-2024-12087 and CVE-2024-12088 because it does not implement the relevant features (like gokrazy/rsync). Even if it was vulnerable, openrsync’s defense-in-depth measures like using OpenBSD’s and to restrict file system access would have prevented successful exploitation — at least when running on OpenBSD. openrsync is not affected by CVE-2024-12747 because it used from the very moment they implemented symlink support . But, because is not a sufficient fix for this issue, openrsync is affected by CVE-2026-29518! The above covers the January 2025 batch of vulnerabilities; the May 2026 batch is similar in that most features just are not implemented. Overall, I say: Well done, Kristaps and contributors! By diligently implementing validation, restricting the attack surface and employing defense-in-depth measures, openrsync manages to not be affected by almost all of the reported vulnerabilities. Which APIs and environments can we use on Linux for defense-in-depth measures? I’ll go through the ones supports, ordered by traditional to modern. Within a few weeks after starting the project, I added support for dropping privileges and using mount/pid namespaces on Linux to restrict the file system objects that my rsync server could work with. 
This approach works very well to mitigate path traversal attacks, but requires privileges, meaning we need to run as or in a Linux user namespace (if enabled on your distribution / system). That limitation makes mount namespaces well-suited for server setups, but usually unavailable for interactive one-off transfers that are typically running under a human’s user account. In the same commit that introduced Linux mount/pid namespace support, I also included a systemd service file that restricted file system access to home directories and encouraged folks in the README to further restrict file system access, depending on what their use-case allows. These file system restrictions, if set up correctly, mitigate the File Leak (CVE-2024-12086) and Path Traversal (CVE-2024-12087) vulnerabilities. The Symlink Race Condition (CVE-2024-12747) relies on privilege escalation through the rsync process, but thanks to the DynamicUser feature, our process has fewer privileges than other users. Similarly to mount namespaces, these measures are great for server setups, but too cumbersome to set up for interactive one-off usages. I stumbled upon Justine’s blog post Porting OpenBSD pledge() to Linux (2022) and was reminded that Linux offers the Landlock API for unprivileged, per-process access control, similar to OpenBSD’s system call, which openrsync uses. The basic idea is that once your program knows the directory it works with, it makes a call like and no longer has access to other file system locations. I had previously heard of Landlock at a Go Meetup, so I knew there was Go support for Landlock. Back in 2022, I enabled Landlock support in the gokrazy kernel images. So I gave it a shot in March 2025 and implemented Landlock support to restrict file system access . It took me a few hours, which seems a little longer than one might expect at first. 
Making Landlock work (and/or skipping it) in our test environment ran into a couple of road blocks: Our tests had defined many functions that get run in the same process, but when repeatedly adding rulesets, we would exceed the limit of 16 (!) policy layers per process. Once I had it set up just right, it is a beautiful solution. Now we can restrict rsync transfers to their sources (read-only) or destination directories (read-write), even for unprivileged invocations of ! 🎉 The downside to Landlock is that Landlock operates at the process level. This means that Landlock policies must include the files that your program needs, e.g. needs to be able to read for user id lookup, so if the attacker is after the file, Landlock does not help. In February 2025, the Go 1.24 release introduced the API, which is resistant against path traversal, see The Go Blog: Traversal-resistant file APIs (by Damien Neil, March 2025). This API allows more fine-grained control (per file system operation) compared to Landlock. Go 1.25 (released in August 2025) added more methods to , making it a convenient choice for most file system usage. I have converted all of ’s file system usage to use , which is a great fit: users configure input/output directories, but the filenames received over the network are untrusted. That’s exactly what was designed for! When I first looked into using , I thought that some system calls could inherently not be made with this API, like for example to create device node files. Damien explained: It won’t support mknod, though. However, you should be able to use it to enable a safe mknod: If you’re curious how that looks in practice, check out ’s usage in , line 15-29 . Another stumbling block was when I realized that unlike with , Linux only implements , but no (as of Linux 7.0)! Luckily, Lennart Poettering pointed out that there’s a trick to skip path resolution without : you can probably bind to in the meantime… And indeed, this works! 
Path resolution is skipped because we only specify a basename (last component of a path) after the known-safe , not a path (see line 49-56 ). With these two tips, v0.3.1 and newer are fully using , meaning all file system access is traversal-safe! 🥳 Lacking validation causes vulnerabilities It is interesting to note that aside from the TOCTOU vulnerabilities (CVE-2024-12747, CVE-2026-29518 and CVE-2026-43619), all other vulnerabilities were caused by missing or incorrect input validation. In three cases, there was just no validation to begin with. In another case (CVE-2024-12088), the subject matter of file system path resolution is tricky enough that the existing validation did not cover all edge cases. As the Go verdict section explains in more detail, the most valuable structural fixes are to provide bounds checking (= always-on validation) and safe-by-default APIs like Go’s . Too much complexity A few of the vulnerabilities came from evolution of the rsync protocol: The code used to correctly perform sufficient validation, but then new features were added. For example, when checksum algorithm negotiation was added (protocol version 30), the validation was not correctly updated. When incremental recursion was added (also protocol version 30), the validation that made sense for individual file lists was not updated for the new processing approach of merging incremental file lists. Avoiding complexity avoids vulnerabilities! Both gokrazy/rsync and also openrsync were not vulnerable to 8 out of the 12 security vulnerabilities simply because they do not implement the feature with the vulnerability. Of course, these features were added to rsync because they were valuable to someone at some point, and of course I am not saying that we should just… not develop software any further, ever. But, I consider it ideal to use an implementation whose complexity is appropriate for and proportional to the complexity of the use-case . 
In other words: for simple use-cases, reach for a simple implementation. Only reach for the fully-featured implementation where needed.

The verdict on whether using Go has helped.
The verdict on whether a minimal re-implementation like gokrazy/rsync helps.
My comparison with OpenBSD’s (written in C).
Defense in depth mechanisms one can use on Linux.
The conclusion.

CVE-2024-12084 to 12088 (original report)
CVE-2024-12747 (discovered separately by Aleksei Gorban “loqpa”)
CVE-2026-29518 (discovered by Damien Neil and myself! and independently by Nullx3D)
CVE-2026-43617 to 43620
CVE-2026-45232

rsync performed insufficient validation: It read the (attacker-controlled) checksum length from the network and compared the length against .
However, rsync’s data structures always declared a 16 byte buffer:
is always 16 (bytes), which is sufficient to hold an MD4 or MD5 checksum.
used to be 16 (bytes), but can be larger when rsync is compiled with SHA256 or SHA512 checksum support.
Hence, the bounds check was ineffective! An attacker could write out of bounds.
This issue was introduced with commit in September 2022, which added SHA256/SHA512 checksum support.

A 32-bit Adler-CRC32 Checksum
A digest of the file chunk. The digest algorithm is determined at the beginning of the protocol negotiation.
The corresponding code can be seen below: sender.c :

The “Some checksum buffer fixes” commit prevents this attack because the attacker-controlled can no longer be larger than the transfer’s checksum length.
The “prevent information leak off the stack” commit initializes the memory to zero, thereby making any stack leak through impossible.

Check out gokrazy/rsync v0.2.7:
Patch the code as follows to undo the fix and execute the attack:

The Go runtime’s bounds checks turn more serious security issues into a panic. A panic is still a denial-of-service risk, but that’s much preferable.
Go initializes memory to zero, making info leaks like CVE-2024-12085 impossible.
Go’s API prevents most of the remaining vulnerabilities.
Only one out of twelve vulnerabilities (CVE-2026-43617) is a proper bug in the application logic that using Go could not have prevented.

gokrazy/rsync is unaffected by many vulnerabilities because it does not implement the feature in question, for example .
Like all other wire protocol-compatible rsync implementations, gokrazy/rsync targets protocol version 27, because later protocol versions introduce significant complexity.
In some cases, features that would be good to implement come with significant blockers, e.g. compression is tricky, see gokrazy/rsync issue #35 for details.

os.Root.OpenFile the parent directory of the target,
File.Fd to get the file descriptor for that directory,
https://pkg.go.dev/golang.org/x/sys/unix#Mknodat to create the file.
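The three mknod steps listed above (os.Root.OpenFile on the parent directory, File.Fd, Mknodat) can be sketched roughly as follows. This is an illustration under assumptions, not the gokrazy/rsync code: `mknodBeneath` and `runMknodDemo` are hypothetical names, the standard library's `syscall.Mknodat` is swapped in for `golang.org/x/sys/unix.Mknodat` so that the example has no external dependency, and the demo creates a FIFO instead of a device node because that needs no privileges. It is Linux-only and needs Go 1.24 or newer.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
	"syscall"
)

// mknodBeneath creates a node called base inside parent. parent is
// resolved beneath root by os.Root; base must be a single path
// component, so the kernel performs no further path resolution.
func mknodBeneath(root *os.Root, parent, base string, mode uint32, dev int) error {
	if base == "" || base == "." || base == ".." || strings.ContainsRune(base, '/') {
		return fmt.Errorf("%q is not a plain file name", base)
	}
	// Step 1: open the parent directory through the traversal-resistant API.
	dir, err := root.OpenFile(parent, os.O_RDONLY, 0)
	if err != nil {
		return err
	}
	defer dir.Close()
	// Steps 2 and 3: create the node relative to that directory's descriptor.
	return syscall.Mknodat(int(dir.Fd()), base, mode, dev)
}

// runMknodDemo creates a FIFO beneath a temporary root and also tries
// to create one outside of it.
func runMknodDemo() (isFifo, escapeBlocked bool, err error) {
	tmp, err := os.MkdirTemp("", "mknoddemo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(tmp)
	if err := os.Mkdir(filepath.Join(tmp, "dev"), 0o755); err != nil {
		return false, false, err
	}
	root, err := os.OpenRoot(tmp)
	if err != nil {
		return false, false, err
	}
	defer root.Close()
	if err := mknodBeneath(root, "dev", "pipe", syscall.S_IFIFO|0o600, 0); err != nil {
		return false, false, err
	}
	fi, err := os.Lstat(filepath.Join(tmp, "dev", "pipe"))
	if err != nil {
		return false, false, err
	}
	escapeErr := mknodBeneath(root, "..", "pipe", syscall.S_IFIFO|0o600, 0)
	return fi.Mode()&os.ModeNamedPipe != 0, escapeErr != nil, nil
}

func main() {
	isFifo, blocked, err := runMknodDemo()
	fmt.Printf("fifo created=%v escape blocked=%v err=%v\n", isFifo, blocked, err)
}
```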

0 views
ava's blog 1 weeks ago

beware of EU-washing

Among all this talk of European sovereignty and switching to European alternatives in a move toward better privacy and less support of Big Tech, I wish for more emphasis on not just blindly copying US products and slapping an EU label on them. I see news like Germany’s Federal Office for the Protection of the Constitution backing away from using Palantir and using a software solution from France instead. I’m supposed to feel happy reading this, and admittedly I have not yet dug into ArgonOS deeply - but all I can think of as a first reaction is “I don’t want an EU version of Palantir.” I don’t want ‘GDPR-compliant’ facial recognition and behavioral surveillance in our cities. I don’t want more privacy-friendly warfare (???). I don’t want more tech-enabled discrimination from next door. I don’t want a supposedly European alternative that’s still based on AWS and Microslop. We need to be critical and take a stand against EU-washing, in which unethical business concepts or structures get painted in a more ethical light using the (increasingly less warranted) good reputation of the EU on human rights. We aren’t better for being from a different area, or just because a different company name is slapped on; it’s because we are supposed to have strong consumer protections and rights, resist the promise of easy money through unlimited data mining, and stand up against fascism. I don’t want us to compete with evil; I don’t want us to stoop to that level at all. Go hard on these copycats. Taking concepts from Fascism Land isn’t worthy of praise and they don’t deserve you as a customer or fan. Make them prove it first and ask them the hard questions. Boycott their shit if it is the same garbage, go to protests, write to representatives, be vocal online, support NGOs that work against this. No one gets a pass for being European. I won’t lower my standards and values. Published 24 May, 2026

0 views
ava's blog 1 weeks ago

computers, privacy and data protection conference 2026

I attended the Computers, Privacy and Data Protection Conference (CPDP) in Brussels for the first time. The conference has lots of different rooms, mostly in the same building, where multiple panels, workshops and other things are happening at the same time in specific slots, so you gotta choose what you participate in (was difficult at times!). Next to that, you have some fun rooms, some quiet working spaces and spaces to just hang out and talk. Based on the programme, the focus this year was definitely on age verification/youth 'protection', human AI relationships, consumer rights and marginalized groups. Lots of different groups and people were present: people from the EU Commission and Parliament, AlgorithmWatch, Bits of Freedom, noyb and Max Schrems, IGLYO, EDRi, Equilabs, Equinox Initiative for Racial Justice, INTITEC, the EDPS and Wojciech Wiewiórowski, Privacy International, the International Committee of the Red Cross, the Office of the United Nations High Commissioner for Human Rights, the European Consumer Organization (BEUC), Future of Privacy Forum, AIRegulation.com, data protection authorities of different countries (CNIL, BfDI, etc.), ALTI, European Disability Forum, d.pia.lab, AI Now Institute, OECD, the IAPP, and all kinds of universities, plus companies like Mozilla, Mastodon, Signal, Wikimedia, Microslop, Uber, TikTok, Google and more. I was there for the opening remarks, then went on to visit: My takeaways/new things learned: Microsoft co-wrote parts of the EU's Energy Efficiency Directive, which allows data centers to keep their energy use confidential under the guise of business secrecy. The draft literally had paragraphs of Microsoft's proposal copied in unchanged. The Dutch government used racial/ethnic profiling via algorithms in the assessment of childcare benefit applications, which led to false allegations of fraud against thousands of families, particularly affecting those from ethnic minorities.
I heard about this before, but learned more about it that day. To contest it all and defend democracy, we all need to train our AI literacy skills , support and have good tech journalism that questions and exposes it all (404media is, imo, a good example of what they meant), crafting and changing the social media narrative around AI and Big Tech, listening to affected people, demanding transparency via standards and audits etc. We cannot forget that officials know ; many of the effects we criticize are not accidents or side effects, they are the entire point. Like when tech predominately negatively targets marginalized communities, this is a bonus to people in power, and nothing to be fixed. Workers can resist by reminding their leaders of the liabilities and legal risks, strategic issues, money issues etc. that AI brings; demand specific definition of the needs that AI will fulfill at the workplace, instead of letting AI become the purpose instead of the tool. Age verification is racist and migrantphobic : Many people have issues with their ID, or have none, or are undocumented, and age verification in their country requires them to have contact with officials, police, etc. Age verification is transphobic : Relying on ID means many trans people are forced to reveal their deadname or are forced to come out, as it reveals they are trans if the ID is not or cannot be updated. The platforms are harmful, but we have so many ways and ideas against that that doesn't take away important spaces and support groups or bar entire groups of people. Age verification makes it possible for platforms to avoid working on their problems and becoming better, enables avoiding legislation and regulation, and enables control and surveillance by them; meanwhile, the truth is that you don't suddenly turn 16-18 and know how to handle porn, gore, harassment and all other negative parts of social media. 
The negative sides to social media that are named as the reason for age verification and banning of social media for specific age groups also affect adults negatively. We need to put more effort into education on how to handle these things. Yes, we can protect children's privacy by banning them from platforms, but this also affects their other (digital and offline) rights, and privacy rights don't trump all. Children and teens should learn and be encouraged to control their own spaces and moderation via FOSS: Matrix, Mastodon, etc. where they can also seclude themselves from adults and aren't reliant on Big Tech. Age verification and banning would take this away from them and also make it harder for FOSS projects. If children only ever enter the political discourse as victims, the only response can be rescue; that is why we have to make sure they enter as participants. Protection is not (just) space away from the risk, but confronting the systems that cause harm and eliminating them. 16-18% of US citizens report having engaged romantically with a bot, 45% of them said it made them feel more understood, 36% said it gave them stronger emotional support than their human partner. Problem: The current version of the AI Act doesn't cover romantic and sexual use, and there is no guidance on safeguards for emotionally responsive AI systems that protect against the risk of suicide, crimes, distress when the service slows down or shuts down or the model changes, discrimination as you get more if you pay etc.; drafts mention some of it now in Art. 50. With all the talk around becoming emotionally dependent on AI, nudging into harmful behaviors, etc. we cannot forget that you are also vulnerable on other services and in human romantic relationships, where the same routinely happens (weak argument, but to be fair, I also often forget this).
We also cannot forget that it is not always a replacement - it often just supplements social life, and there are also surprisingly many people who just don't want or need romantic or sexual relations with a human ; they want bots specifically , and only bots. Disclosure agreements (meaning: labels everywhere that this is just a bot and not real) are most often useless, because people know and intentionally seek it out (exception for Insta/Snap DMs etc.) The latter about Human-AI intimacy was extra interesting because it had someone on the panel who directly works with people who use bots for romance and sex, and her experience has been mostly positive and that it helps her clients. Afterwards, I sadly was too overwhelmed, exhausted and in pain to continue and went back to the apartment to rest. Unfortunately, all the stress around the apartment and the generally more exhausting day triggered my digestive tract badly (Crohn's disease), but within the first few hours, all toilets in the venue were out of service due to an issue outside the venue or the organizer's control, and the alternative toilets were much further away. I didn't wanna have to deal with that with upset intestines. I missed the ' Designing Fairness ' Workshop, and the ' Consumer Rights at the age of acceleration' panel. Didn't meet anyone that day. Look at this ridiculous Gemini Photobooth they had that I saw no one use in the entire 3 days. This day, I managed to attend everything on my list, thankfully, as I felt a bit better. I attended: My takeaways/new things learned: The digital omnibus is mostly there to enable AI made in Europe to aid sovereignty and be competitive with US and China; AI here needs a framework to access data without much regulatory risk - that is what the EU Commission person said. 
Enforcing the law and making it sharper is actually leveling the playing field and furthering innovation, because there is a massive power concentration in a handful of companies that can do what they want, barely pay fines, have the fines suspended because of the US government bargaining with the EU, or see them as a cost of doing business. Competition is impacted this way, as small companies are hit harder than the big ones. If the omnibus goes through with changing definitions of personal data etc., it will take years for case law, literature, standards etc. to catch up, and it wastes money in companies that need to re-do everything to comply; so it doesn't simplify anything and makes praxis harder. You may set ChatGPT/Claude/Gemini etc. to not send feedback or training data in your settings, but when you react thumbs down/up to their request of whether the output was good or not, or choose between two different versions, the entire chat log until then gets sent for training and potential human review. So, these popup feedbacks override your settings. I need to read more papers by Theodore Christakis. Here is one of them. US and UK discovery and disclosure laws/principles go directly against EU data minimization principles; as long as data is relevant to a case it should be accessible, which is why in their cases, they can just have access to millions of people's data if necessary, and in a divorce case, they have the right to ask for AI chatlogs. There is no AI protection or privilege: If you use AI for legal stuff, you have no expectation of confidentiality like you would with a lawyer, so it is not safe from discovery. There is tension between tracking for harmful behavior/threats vs. data privacy rights; what if someone threatens to kill themselves, kill others, etc.? Should the company look for it, track it, report it, alert anyone, suspend the account, send help resources? Still unclear.
There is also tension between people wanting the bonus features/ease of use coming from personalization and free services, while also not wanting to be tracked or charged. Advertisers see themselves as enablers of a good thing, as people want fitting ads, good algorithms, good suggestions, and free access; so if their business model is challenged or fails, people will have worse access and worse user experiences in their view. They also fear that if their business model is hindered, things will move into a more extreme, embedded, hard to avoid direction that you don't control or decide (Black Mirror ad type of stuff). I previously wrote about Consenter on the blog, and one panel had people from it there showing screenshots; it changed my mind on it a lot and made me understand the new features and goal better. I will probably write an update on it some time. We have different other options all covering something different about tracking, cookies, consent, or going about things differently, old and new: ADPC, GPC, ConStand, Global Privacy Control, DoNotTrack etc.; important for new stuff is granular consent, sent to the website, user given explanations etc. Uninformed decisions and bad practices lead to unfair competition; bad actors erode the overall trust level, so users become resigned, experience fatigue and say yes at the same rates to "good" and "bad" services. Will read soon: Our data after us by the CNIL, and future release: Model rules on succession and access to digital remains by Eigenmann and Harbinja. Digital remains can be split into assets (copyright, crypto, business tools, money), personal (messages, photos, identities, AI replicas), and third party data. GDPR only addresses living people; dead people's digital remains are subject to member state laws. There might be a need for something harmonized and European, though.
For good digital hygiene , we should remember death and make it as easy as possible or sensible for the people we leave behind to get the access they need to manage our stuff how we want them to. Leave instructions, set emergency/legacy access when available (Google, Facebook, Instagram and Apple have it), include digital assets in your will, decide how your data is allowed to be used after death, especially around AI replicas. Hospice, nurses, families etc. should learn to ask affected parties about these things. Thanks to the focus on agentic AI, there is massive need for inference compute, which is super expensive. Almost all of it is in the control of, or can only be afforded by, the hyperscalers. At the same time, anything that seeks to enable or disable things for AI agents on the web can also affect accessibility programs like screen readers. It is in the best interest of the Big Tech companies to keep things individual, because it distracts from the collective issues and changes they'd have to do; it is easier to blame the person for agreeing to tracking than make sweeping changes to how much can be tracked. Individual consent doesn't consider the fact that data doesn't just affect you, but reveals things about your family, friends, partners, coworkers and more, as data is deeply interconnected. If your friend agrees to share his data and it also includes you, that is your data, still going to the service you'd have disagreed to. We as users have no collective bargaining tools yet; even big worker unions aren't negotiating with Microsoft about the terms of their employer using Microsoft Teams, when they actually should. We should also build up data unions made from users who bargain with the platforms. Strikes could look like boycotting the service, blocking trackers, scrambling data, massive amounts of access requests etc. Look into something called a Worker Data Trust ; this was used to prove Uber's predatory dynamic pricing (Worker's Info Exchange). 
Lots of workers made access requests, the data was combined and analyzed by researchers. After a failed attempt to meet up during lunch, I managed to meet up with another Country Reporter from noyb for a little while until the next panel happened, and sadly we didn't go to the same one. At this point, I was miffed about lunch at the conference. They made a big deal at registration about how the event will be mostly vegan and vegetarian to offset the climate impact of everyone traveling there, and they asked you to select your preference. I chose vegan. But for the entire three days, the food wasn't clearly labeled, some food was mislabeled as vegan when it wasn't, and there was way too little of it and it wasn't restocked. It was more like "vegetarian snacks for birds". Vegan people had no warm food option at all, just sandwiches or wraps all three days that would have been enough for maybe 10 people. I mostly starved and I accidentally ate real cheese one time too because the food situation was so confusing. Here was one of the buffet menu cards, which were a bit to the side, removed from the food, partially hidden by other stuff, and incorrect (anything with lactose is not vegan). I have no idea how, on a sea of silver platters with lots of bread, I am supposed to be able to differentiate the vegan gluten free bread option and the vegetarian gluten free bread that has scamorza (Italian cheese). It was a roundtable buffet, so everyone was waiting on you to hurry while grabbing stuff; I can't just grab bread and lift off the top to see the ingredients and then put it back, man. At least group the vegan stuff together or put labels directly in front of each thing. Also, while I am not reliant on gluten-free food, I think the people sensitive to it or having celiac disease don't appreciate that either.
I skipped the Cocktail parties and big CPDP party, because it's not really feeling fun when you don't drink alcohol, have trouble just going up to people with your mask and hoping they hear you, and have no one to meet or go with. Last day was rather empty in the programme, so I arrived later and left earlier. I attended: My takeaways/new things learned: The AI warfare one was a bit of a letdown, because they all just accepted war as a right, an inevitable thing that has to happen. There was not even a nuance of fighting war itself, or banning AI weapons, etc; it focused more on the dual nature of the data , in which through surveillance, tracking, etc. not only can military use it to target people, NGO's and others can use it to warn, evacuate, render humanitarian aid etc. and document realities on the battlefield. There was also no possibility for the idea that we could enter an age where drones fight drones automatically and no one needs to get hurt or be traumatized or get to kill people like a game, and that is only because everyone is so attached to the idea that war has to have human casualties. It's hard to legislate and restrict because the data is taken from a whole ecosystem : Telecommunications, cloud services, civilian infrastructure, social media etc. and most of the data is collected during times of peace. Warfare is often explained with national security as a reason, which then again is a legitimate interest or fulfills other opening clauses in data protection and privacy laws. It is a problem that the richest men in the world, close to the US admin, lead the biggest companies worldwide, almost all in the US, and control almost all of AI and AI warfare. Project Maven from 2017 was continuously developed on and is now the Maven Smart System , which was used in Venezuela and Iran recently. Our Art. 15 GDPR right of access as it is right now is making up for Germany and Austria's lack of discovery and disclosure rights respectively. 
Controllers can usually drag stuff out, cite trade secrets and rights of others to evade data access, but the data subject barely has any power. Not having to justify the access request and it not having to be limited to data protection rights is good in this regard and needs to be kept up. Otherwise, there would be too much confusion and too many court cases over whether a request was abusive or not, if any request made for a court case instead of for privacy rights could now be deemed possibly abusive. We don't only need to focus on reidentification in general, but also on the ability to single people's data out; you might not be able to identify them, but you can build a profile anyway. Learned about the term digital twin, or in terms of user data, a data twin that can be used for simulation and is similar enough. AI-act-standards.com exists. Many don't know that the AI Act isn't a GDPR for AI, but serves more as market classification, as it sorts AI into different boxes that have to fulfill different requirements. The details of these requirements are/will be set with CEN/ISO standards and frameworks. You can see the progress of development on these standards on that website, and what they cover and how they interact. Hovering over the elements gives additional info. This is done by the JTC21, and you can also get involved by registering with your national standardization body (in Germany, this is DIN) or when they do public consultations. Disabled people experience both extremes of AI - better accessibility options, often more reliant on AI, so also more subject to surveillance and having their privacy rights violated, while bad governments can use the data to harm disabled people, all under the guise of research. Marginalized groups are often the first trial group in anything, while not being stakeholders in the tech, or even invited to the table. See: AI used in immigration etc.
and with deregulation and AI everywhere, we see a loss of reasonable suspicion thresholds in law enforcement and other groups. Learned about adversarial auditing . The previous two days, I did the whole fancy dress pants and blazer thing (one black blazer, one dark red/purple blazer), but for the last day and the drive home, I wore my Bearblog shirt and wide orange jeans: Someone from noyb staff thankfully recognized me and approached me, so we talked for a bit until he had to leave for another lunch meeting. That concludes the human contact I had. And then I left to drive home with my wife. She will hopefully soon write a guest post on my blog about how she navigates a new city in another country without mobile data/a smartphone (she has a tablet with WiFi only), because while I was at the conference, she explored the city on her own. It's kind of difficult to show up to these conferences as someone who isn't sent there for work, who doesn't have coworkers or ex-coworkers also attending, and who doesn't have much or any industry contacts yet. Most people there know each other from work or previous/other conferences, and I don't. These events are primarily for networking, keeping in touch, and talking about what you have seen and learned though. I couldn't discuss anything with anybody present, and it made me feel really lonely and silly. Just going up to people and striking up a conversation is not my strong suit, and it's something I am working on and has already gotten better, but the mask I am usually wearing in these big crowds and gatherings because I am on immunosuppressive medication is actively keeping me isolated. I know people have trouble understanding me, can't see me smiling at them, and think I am sick, so that keeps both sides hesitant. Unfortunately, if I attend next year, I will have to leave away the mask and maybe try out these protective sprays for nose and throat that are supposed to reduce viral load. 
It seems like you can only 'afford' to wear a mask if you are already in a group of people. Weeks before the event, I asked some people if they would attend, they said they would and we had a group chat of 10 to coordinate meetups. But during the entire conference, I was the only one trying to make something happen - saying where I am/where I will be, identifiers you could spot me with (as we never met before and you can't see name tags well on the lanyard), meeting points etc. and the two people mentioned were the only ones who took me up on it. The others just ghosted me/ignored my messages. That saddened me a lot during the conference. And unfortunately, these types of events are always really exhausting to me beyond the normal amount everyone experiences, because of things that trigger my conditions, my lower energy, my need to lie down sometimes, sensory issues, food restrictions etc. so I really have to weigh if it's worth it to me. I'm not sure it is, without the social aspect. Many of the panels I chose had the issue of not being well organized. Instead of short speaker times, precise audience questions, interactions, dialogue, disagreements, different sides, answering the panel's topic and offering solutions etc., it often resulted in every speaker having a 10 minute monologue saying their piece, the other speakers not reacting or intervening because it's too much, everyone more or less saying the same thing or zoning out, and then having too little time to really give much attention to audience questions. Some gathered audience questions to answer them in batches and predictably, that resulted in nuance being lost and almost nothing being precisely answered. From many panels, I walked away with less learned than I wanted to, and just being reaffirmed in what everyone knew already. There were almost no further or new resources, or real takeaways of what the next steps should be and how we can tackle or solve an issue.
They say " there should be more transparency " but not how we ask for it, how we legislate it, how it should happen. It's often just a vague " Someone should do more of something, and fast. " It was easy for people from the EU Commission to dodge mine and others' questions about the omnibus bullshit with no convincing answer. (: It disillusioned me a bit about my own goal to be speaking at a panel one day, because so often it felt like it was just there to platform someone to give them a chance to ramble and that's it, or just so that they can put this on their CV. Looking into the panelists, so many of them are genuinely great, very accomplished and admirable people with a lot of expertise, but the way things were set up, it couldn't shine through. You would have been better off talking to them directly. As a final bonus for reading this far, help me delete this (fortune) cookie. Reply via email Published 23 May, 2026 Contesting AI & Defending Democracy ; Possibilities for European AI Futures ( x ) Youth protection through inclusion and empowerment : a rebuttal of the exclusion-based narrative ( x ) Intimacy by Design: Governing Human AI relationships ( x ) Microsoft co-wrote parts of the EU's Energy Efficiency Directive , which allows data centers to keep their energy use confidential under the guise of business secrecy. The draft literally had paragraph's of Microsoft's proposal copied in unchanged. The Dutch government used racial/ethnic profiling via algorithms in the assessment of childcare benefit applications, which led to false allegations of fraud against thousands of families, particularly affecting those from ethnic minorities. I heard about this before, but learned more about it that day. 
To contest it all and defend democracy, we all need to train our AI literacy skills , support and have good tech journalism that questions and exposes it all (404media is, imo, a good example of what they meant), crafting and changing the social media narrative around AI and Big Tech, listening to affected people, demanding transparency via standards and audits etc. We cannot forget that officials know ; many of the effects we criticize are not accidents or side effects, they are the entire point. Like when tech predominately negatively targets marginalized communities, this is a bonus to people in power, and nothing to be fixed. Workers can resist by reminding their leaders of the liabilities and legal risks, strategic issues, money issues etc. that AI brings; demand specific definition of the needs that AI will fulfill at the workplace, instead of letting AI become the purpose instead of the tool. Age verification is racist and migrantphobic : Many people have issues with their ID, or have none, or are undocumented, and age verification in their country requires them to have contact with officials, police, etc. Age verification is transphobic : Relying on ID means many trans people are forced to reveal their deadname or are forced to come out, as it reveals they are trans if the ID is not or cannot be updated. The platforms are harmful, but we have so many ways and ideas against that that doesn't take away important spaces and support groups or bar entire groups of people. Age verification makes it possible for platforms to avoid working on their problems and becoming better, enables avoiding legislation and regulation, and enables control and surveillance by them; meanwhile, the truth is that you don't suddenly turn 16-18 and know how to handle porn, gore, harassment and all other negative parts of social media. 
The negative sides to social media that are named as the reason for age verification and banning of social media for specific age groups also affect adults negatively . We need to put more effort into education on how to handle these things. Yes, we can protect children's privacy by banning them off of platforms, but this also affects their other (digital and offline) rights, and privacy rights don't trump all . Children and teens should learn and be encouraged to control their own spaces and moderation via FOSS : Matrix, Mastodon, etc. where they can also seclude from adults and aren't reliant on Big Tech. Age verification and banning would take this away from them and also make it harder for FOSS projects. If children only ever enter the political discourse as victims, the only response can be rescue; that it why we have to make sure they enter as participants. Protection is not (just) space away from the risk, but confronting the systems that cause harm and eliminating them. 16-18% of US citizens report having engaged romantically with a bot, 45% of them said it made them feel more understood, 36% said it gave them stronger emotional support than their human partner. Problem: Current version of AI Act doesn't cover romantic and sexual use, no guidance for safeguards for emotionally responsive AI systems that protects around the risk of suicide, crimes, distress when service slows down or shuts down or model changes, discrimination as you get more if you pay etc.; drafts mention some of it now in Art. 50. With all the talk around becoming emotionally dependent on AI, nudging into harmful behaviors, etc. we cannot forget that you are also vulnerable on other services and in human romantic relationships, where the same routinely happens (weak argument, but to be fair, I also often forget this). 
We also cannot forget that it is not always a replacement - it often just supplements social life, and there are also surprisingly many people who just don't want or need romantic or sexual relations with a human ; they want bots specifically , and only bots. Disclosure agreements (meaning: labels everywhere that this is just a bot and not real) are most often useless, because people know and intentionally seek it out (exception for Insta/Snap DMs etc.) Simplification for Whom? Unpacking the Consumer Impact of the Digital Omnibus ( x ) My Chatbot, My Confidant: Protecting User Privacy in Generative AI Conversations ( x ) Informed consent: The breakthrough in Art. 88b GDPR / Digital Omnibus and current initiatives in the field of PIMS and technical standardisation ( x ) Digital Legacy Beyond GDPR: Succession, Data Protection, Access Rights, and Platform Power ( x ) The Agentic Assistant: What does Big Tech’s goal of creating a universal digital intermediary mean for society? ( x ) Designing Collective Technology Governance ( x ) The digital omnibus is mostly there to enable AI made in Europe to aid sovereignty and be competitive with US and China; AI here needs a framework to access data without much regulatory risk - that is what the EU Commission person said. Enforcing the law and and making it sharper is actually leveling the playing field and furthering innovation, because there is a massive power concentration of a handful companies that can do what they want, barely pay fines, have the fines suspended because of the US government bargaining with the EU, or who see them as a cost of doing business. Competition is impacted this way, as small companies are hit harder than the big ones. If the omnibus goes through with changing definitions of personal data etc., it will take years for case law, literature, standards etc. to catch up, it wastes money in companies who need to re-do everything to comply; so it doesn't simplify anything and makes praxis harder. 
You may set ChatGPT/Claude/Gemini etc. to not send feedback or training data in your settings, but when you react with thumbs down/up to their question of whether the output was good or not, or choose between two different versions, the entire chat log until then gets sent for training and potential human review. So, these popup feedbacks override your settings. I need to read more papers by Theodore Christakis. Here is one of them. US and UK discovery and disclosure laws/principles go directly against EU data minimization principles; as long as data is relevant to a case it should be accessible, which is why in their cases, they can just have access to millions of people's data if necessary, and in a divorce case, they have the right to ask for AI chatlogs. There is no AI protection or privilege: If you use AI for legal stuff, you have no expectation of confidentiality like you would with a lawyer, so it is not safe from discovery. There is tension between tracking for harmful behavior/threats vs. data privacy rights; what if someone threatens to kill themselves, kill others, etc.? Should the company look for it, track it, report it, alert anyone, suspend the account, send help resources? Still unclear. There is also tension between people wanting the bonus features/ease of use coming from personalization and free services, while also not wanting to be tracked or charged. Advertisers see themselves as enablers of a good thing, as people want fitting ads, good algorithms, good suggestions, and free access; so if their business model is challenged or fails, people will have worse access and worse user experiences in their view. They also fear that if their business model is hindered, things will move into a more extreme, embedded, hard to avoid direction that you don't control or decide (Black Mirror ad type of stuff).
I previously wrote about Consenter on the blog, and one panel had people from it there showing screenshots; it changed my mind on it a lot and made me understand the new features and goal better. I will probably write an update on it some time. We have different other options all covering something different about tracking, cookies, consent, or going about things differently, old and new: ADPC, GPC, ConStand, Global Privacy Control, DoNotTrack etc.; important for new stuff is granular consent, sent to the website, user given explanations etc. Uninformed decisions and bad practices lead to unfair competition; bad actors erode the overall trust level, so users become resigned, experience fatigue and say yes at the same rates to "good" and "bad" services. Will read soon: Our data after us by the CNIL, and future release: Model rules on succession and access to digital remains by Eigenmann and Harbinja. Digital remains can be split into assets (copyright, crypto, business tools, money), personal (messages, photos, identities, AI replicas), and third party data. GDPR only addresses living people; dead people's digital remains are subject to member state laws. There might be a need for something harmonized and European, though. For good digital hygiene, we should remember death and make it as easy as possible or sensible for the people we leave behind to get the access they need to manage our stuff how we want them to. Leave instructions, set emergency/legacy access when available (Google, Facebook, Instagram and Apple have it), include digital assets in your will, decide how your data is allowed to be used after death, especially around AI replicas. Hospice, nurses, families etc. should learn to ask affected parties about these things. Thanks to the focus on agentic AI, there is massive need for inference compute, which is super expensive. Almost all of it is in the control of, or can only be afforded by, the hyperscalers.
At the same time, anything that seeks to enable or disable things for AI agents on the web can also affect accessibility programs like screen readers. It is in the best interest of the Big Tech companies to keep things individual, because it distracts from the collective issues and changes they'd have to do; it is easier to blame the person for agreeing to tracking than make sweeping changes to how much can be tracked. Individual consent doesn't consider the fact that data doesn't just affect you, but reveals things about your family, friends, partners, coworkers and more, as data is deeply interconnected. If your friend agrees to share his data and it also includes you, that is your data, still going to the service you'd have disagreed to. We as users have no collective bargaining tools yet; even big worker unions aren't negotiating with Microsoft about the terms of their employer using Microsoft Teams, when they actually should. We should also build up data unions made from users who bargain with the platforms. Strikes could look like boycotting the service, blocking trackers, scrambling data, massive amounts of access requests etc. Look into something called a Worker Data Trust ; this was used to prove Uber's predatory dynamic pricing (Worker's Info Exchange). Lots of workers made access requests, the data was combined and analyzed by researchers. Data-driven warfare : AI, civilian risks, and corporate responsibility ( x ) Digital Omnibus meets the Charter of Fundamental Rights ( x ) Toward a Standard for Fair AI-driven Recruitment ( x ) Data protection law as a shield, not a weapon: empowering historically marginalized communities in the EU in times of de-regulation ( x ) -> this choice was especially rough, because I was also very interested in ' The U.S. Deregulatory Effect ' happening elsewhere at the same time The AI warfare one was a bit of a letdown, because they all just accepted war as a right, an inevitable thing that has to happen. 
There was not even a nuance of fighting war itself, or banning AI weapons, etc.; it focused more on the dual nature of the data, in which through surveillance, tracking, etc. not only can the military use it to target people, but NGOs and others can use it to warn, evacuate, render humanitarian aid etc. and document realities on the battlefield. There was also no room for the idea that we could enter an age where drones fight drones automatically and no one needs to get hurt or be traumatized or get to kill people like in a game, and that is only because everyone is so attached to the idea that war has to have human casualties. It's hard to legislate and restrict because the data is taken from a whole ecosystem: telecommunications, cloud services, civilian infrastructure, social media etc., and most of the data is collected during times of peace. Warfare is often justified with national security, which then again is a legitimate interest or fulfills other opening clauses in data protection and privacy laws. It is a problem that the richest men in the world, close to the US admin, lead the biggest companies worldwide, almost all in the US, and control almost all of AI and AI warfare. Project Maven from 2017 was continuously developed and is now the Maven Smart System, which was used in Venezuela and Iran recently. Our Art. 15 GDPR right of access as it is right now is making up for Germany and Austria's lack of discovery and disclosure rights respectively. Controllers can usually drag stuff out, cite trade secrets and rights of others to evade data access, but the data subject barely has any power. Not having to justify the access request and it not having to be limited to data protection rights is good in this regard and needs to be kept up. Otherwise, there would also be too much confusion and too many court cases about whether a request was abusive, if any request made for a court case rather than for privacy rights is now deemed possibly abusive.
We don't only need to focus on reidentification in general, but also on the ability to single people's data out; you might not be able to identify them, but you can build a profile anyway. Learned about the term digital twin, or in terms of user data, a data twin that can be used for simulation and is similar enough. AI-act-standards.com exists. Many don't know that the AI Act isn't a GDPR for AI, but serves more as market classification, as it sorts AI into different boxes that have to fulfill different requirements. The details of these requirements are/will be set with CEN/ISO standards and frameworks. You can see the progress of development on these standards on that website, and what they cover and how they interact. Hovering over the elements gives additional info. This is done by the JTC21, and you can also get involved by registering with your national standardization body (in Germany, this is DIN) or when they do public consultations. Disabled people experience both extremes of AI - better accessibility options, often more reliant on AI, so also more subject to surveillance and having their privacy rights violated, while bad governments can use the data to harm disabled people, all under the guise of research. Marginalized groups are often the first trial group in anything, while not being stakeholders in the tech, or even invited to the table. See: AI used in immigration etc.; with deregulation and AI everywhere, we see a loss of reasonable suspicion thresholds in law enforcement and other groups. Learned about adversarial auditing.

Susam Pal 1 weeks ago

Don't Roll Your Own ...

This is going to be a rant about modern web design practices. But before I get to that, let me begin with a familiar principle from the world of cryptography. Among software developers, and especially among those who work on security-sensitive systems, there is a well-known maxim: Don't roll your own crypto. This does not mean that nobody is allowed to write cryptographic code. Someone has to. It means that, for ordinary production software that protects sensitive data of users, we should not rely on a private, unreviewed implementation that has not been vetted by the wider software development community. We should use established, vetted software packages or tools wherever possible. Fortunately, it is now standard industry practice to avoid rolling your own crypto and instead use cryptographic algorithms and packages that have been peer reviewed and stood the test of time. It wasn't so some twenty years ago. I have seen several flawed home-grown RC4 implementations early in my career, with issues like improper initialisation vectors, predictable keystreams and partial leakage of plaintext into ciphertext, putting sensitive data of users at risk. But today, major e-commerce websites or banks typically do not use home-grown cryptography for their web services. In fact, in regulated domains such as payments, healthcare and personal data processing, doing so could violate requirements for strong cryptography, possibly leading to hefty financial penalties. Website design is obviously not cryptography. A broken scroll bar is not the same kind of failure as a broken encryption scheme. But I wish there were a similar maxim for website design as well. There are many aspects of websites where, I think, developers should not be rolling their own X, especially when X is something browsers already do well and something users depend on every day. Here I present a list of such X. Of course, there are valid scenarios where you may need to roll your own X.
But here I want to focus on the cases where you should not roll your own X, and how doing so can lead to a worse user experience, at least in my experience. I am not saying that nobody should ever build anything themselves. As someone who does a lot of creative computing myself and develops fun tools from time to time, I am a big proponent of developing your own stuff. But when it comes to developing user interface features for serious websites that people need to use to get their work done, I wish the software development community were more conservative in deciding what fancy feature goes into a website and what is left out. Do keep in mind that I am no expert in user experience. Far from it. So none of what I am saying here should be taken as a recommendation. But I am a user of the Web, and as a user, I have found some modern web design patterns to be frustrating. This post is a lament from one user of the Web, not a design guide. Of all the things I mentioned above, the one that bothers me the most is custom scroll behaviour on websites. I am used to how page scrolling responds to my mouse, touchpad or keyboard input. When you override the default scrolling behaviour of the web browser with your own implementation, it 'breaks' the page for me. The page now moves too slowly or too quickly when I scroll. Keyboard scrolling may or may not work. You take something I am so familiar with that I don't even think about it, and turn it into something unfamiliar that I now have to think about. Custom link navigation is another pet peeve of mine. Web browsers can already handle links very well. You could say that this is the whole reason web browsers even exist. Following links is their bread and butter. You shouldn't have to mess with that behaviour at all. If you think you need to, reconsider what you are trying to achieve and whether it is really so important as to disrupt normal link navigation. The worst offender I have found here is GitHub. 
When you click on a link on GitHub, say, a file link or an issue link, it triggers a massive piece of functionality implemented in JavaScript that handles the link click for you. If you don't believe me, visit your favourite project on GitHub using Firefox or Chrome, press F12 to open the browser's developer tools, then go to the 'Debugger' or 'Sources' tab, find 'Event Listener Breakpoints' on the right sidebar, expand 'Mouse' and select 'click'. Then click on a link on GitHub and see what happens. I'm sure I am not the only one who has noticed that, on GitHub, a clicked link sometimes takes too long to load. Ironically, it is often faster to open the link in a new tab than to wait for GitHub's JavaScript code to handle the navigation in the current tab. A custom password input field is another such hazard. Fortunately, custom password input fields have become rarer over the years. The password input field that comes with the web browser is generally well equipped to handle passwords. It can offer to save passwords, fill them in later and generate strong passwords for new accounts. It can also warn when a password is submitted over an insecure HTTP connection, work well with password managers and autofill, and cooperate with mobile keyboards and accessibility tools. If you replace the browser's password field with your own fake version, you may break all of that. You may also end up using an ordinary text field and masking it yourself, in which case the password may be treated by the browser, the operating system or assistive tools as ordinary visible text rather than as a password, thereby exposing the password in ways you did not intend. Custom date pickers are another common annoyance. I know that the browser's native date input does not help you select a date range. But that is okay. You can provide two date input fields, one for the start date and one for the end date.
I am willing to pay the small price of using two different inputs to select a date range if that means I can use my favourite web browser to navigate the calendar and select dates the same way everywhere. What I am less inclined to do is to learn ten different ways of using the date selector in ten different implementations across ten different websites. Right now the implementations of date selectors are all over the place. Some require you to zoom out of the month view to enter a year view, where you can select years. While you are there, you cannot change the month again until you return to the month view. Some require you to click the previous-year button literally forty times to select your year of birth if you are old enough. Some do not let you type the date at all. No. I do not want to learn your calendar widget. I just want to use the date picker in my favourite browser, which is quite sane. Saner than your custom implementation. If you need to have a calendar widget to support browsers with inadequate native date-picker support, perhaps that support can be added alongside the native date picker rather than as a replacement for it. For example, the ordinary date input element could be left intact, with a custom widget provided in addition to it so that users can manipulate the same field. In general, just stop messing with the form controls. Custom controls almost always introduce new problems while solving some existing ones. And while you are at it, don't keep changing your website layout and interface every few months! I may adapt to the new design, but my ageing relatives cannot. For them, every time you change the user interface, it amounts to learning a whole new tool. If every website keeps doing this every few months, they have to spend a significant amount of time relearning familiar things for no functional benefit. Please just let them enjoy their retirement.
Imagine how you would feel if a Linux distribution decided to redesign all its core commands and their command-line options every few months. Or imagine how you would feel if the buttons of your washing machine were rearranged every morning. It wouldn't be pleasant!

Don't roll your own page scrolling. Don't roll your own link navigation. Don't roll your own text selection. Don't roll your own context menu. Don't roll your own copy and paste. Don't roll your own password field. Don't roll your own date picker.


Lawmakers Demand Answers as CISA Tries to Contain Data Leak

Lawmakers in both houses of Congress are demanding answers from the U.S. Cybersecurity & Infrastructure Security Agency (CISA) after KrebsOnSecurity reported this week that a CISA contractor intentionally published AWS GovCloud keys and a vast trove of other agency secrets on a public GitHub account. The inquiry comes as CISA is still struggling to contain the breach and invalidate the leaked credentials. On May 18, KrebsOnSecurity reported that a CISA contractor with administrative access to the agency’s code development platform had created a public GitHub profile called “Private-CISA” that included plaintext credentials to dozens of internal CISA systems. Experts who reviewed the exposed secrets said the commit logs for the code repository showed the CISA contractor disabled GitHub’s built-in protection against publishing sensitive credentials in public repos. CISA acknowledged the leak but has not responded to questions about the duration of the data exposure. However, experts who reviewed the now-defunct Private-CISA archive said it was originally created in November 2025, and that it exhibits a pattern consistent with an individual operator using the repository as a working scratchpad or synchronization mechanism rather than a curated project repository. In a written statement, CISA said “there is no indication that any sensitive data was compromised as a result of the incident.” But in a May 19 letter (PDF) to CISA’s Acting Director Nick Andersen, Sen. Maggie Hassan (D-NH) said the credential leak raises serious questions about how such a security lapse could occur at the very agency charged with helping to prevent cyber breaches. “This reporting raises serious concerns regarding CISA’s internal policies and procedures at a time of significant cybersecurity threats against U.S. critical infrastructure,” Sen. Hassan wrote. A May 19 letter from Sen.
Margaret Hassan (D-NH) to the acting director of CISA demanded answers to a dozen questions about the breach. Sen. Hassan noted that the incident occurred against the backdrop of major disruptions internally at CISA, which lost more than a third of its workforce and almost all of its senior leaders after the Trump administration forced a series of early retirements, buyouts, and resignations across the agency’s various divisions. Rep. Bennie Thompson (D-MS), the ranking member on the House Homeland Security Committee, echoed the senator’s concerns. “We are concerned that this incident reflects a diminished security culture and/or an inability for CISA to adequately manage its contract support,” Thompson wrote in a May 19 letter to the acting CISA chief that was co-signed by Rep. Delia Ramirez (D-Ill), the ranking member of the panel’s Subcommittee on Cybersecurity and Infrastructure Protection. “It’s no secret that our adversaries, like China, Russia, and Iran, seek to gain access to and persistence on federal networks. The files contained in the ‘Private-CISA’ repository provided the information, access, and roadmap to do just that.” KrebsOnSecurity has learned that more than a week after CISA was first notified of the data leak by the security firm GitGuardian, the agency is still working to invalidate and replace many of the exposed keys and secrets. On May 20, KrebsOnSecurity heard from Dylan Ayrey, the creator of TruffleHog, an open-source tool for discovering private keys and other secrets buried in code hosted at GitHub and other public platforms. Ayrey said CISA still hadn’t invalidated an RSA private key exposed in the Private-CISA repo that granted access to a GitHub app which is owned by the CISA enterprise account and installed on the CISA-IT GitHub organization with full access to all code repositories.
“An attacker with this key can read source code from every repository in the CISA-IT organization, including private repos, register rogue self-hosted runners to hijack CI/CD pipelines and access repository secrets, and modify repository admin settings including branch protection rules, webhooks, and deploy keys,” Ayrey told KrebsOnSecurity. CI/CD stands for Continuous Integration and Continuous Delivery, and it refers to a set of practices used to automate the building, testing and deployment of software. KrebsOnSecurity notified CISA about Ayrey’s findings on May 20. Ayrey said CISA appears to have invalidated the exposed RSA private key sometime after that notification. But he noted that CISA still hasn’t rotated leaked credentials tied to other critical security technologies that are deployed across the agency’s technology portfolio (KrebsOnSecurity is not naming those technologies publicly for the time being). CISA responded with a brief written statement in response to questions about Ayrey’s findings, saying “CISA is actively responding and coordinating with the appropriate parties and vendors to ensure any identified leaked credentials are rotated and rendered invalid and will continue to take appropriate steps to protect the security of our systems.” Ayrey said his company Truffle Security monitors GitHub and a number of other code platforms for exposed keys, and attempts to alert affected accounts to the sensitive data exposure(s). They can do this easily on GitHub because the platform publishes a live feed which includes a record of all commits and changes to public code repositories. But he said cybercriminal actors also monitor these public feeds, and are often quick to pounce on API or SSH keys that get inadvertently published in code commits. The Private-CISA GitHub repo exposed dozens of plaintext credentials to important CISA GovCloud resources. 
In practical terms, it is likely that cybercrime groups or foreign adversaries also noticed the publication of these CISA secrets, the most egregious of which appears to have happened in late April 2026, Ayrey said. “We monitor that firehose of data for keys, and we have tools to try to figure out whose they are,” he said. “We have evidence attackers monitor that firehose as well. Anyone monitoring GitHub events could be sitting on this information.” James Wilson , the enterprise technology editor for the Risky Business security podcast, said organizations using GitHub to manage code projects can set top-down policies that prevent employees from disabling GitHub’s protections against publishing secret keys and credentials. But Wilson’s co-host Adam Boileau said it’s not clear that any technology could stop employees from opening their own personal GitHub account and using it to store sensitive and proprietary information. “Ultimately, this is a thing you can’t solve with a technical control,” Boileau said on this week’s podcast . “This is a human problem where you’ve hired a contractor to do this work and they have decided of their own volition to use GitHub to synchronize content from a work machine to a home machine. I don’t know what technical controls you could put in place given that this is being done presumably outside of anything CISA managed or even had visibility on.” Update, 3:05 p.m. ET: Added statement from CISA. Corrected a date in the story (Truffle Security said it found the repo gained some of its most sensitive secrets in late April 2026, not 2025).
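To make the idea of scanning commits for exposed keys a little more concrete, here is a minimal Python sketch of pattern-based secret detection. It is an illustration only, not how TruffleHog, GitGuardian or GitHub actually work: the function name find_secrets and the two rules are assumptions made for this example, and real scanners use far more rules and also try to verify whether a detected credential is live.

```python
import re

# Illustrative rules only (assumptions for this sketch, not any tool's rule set).
SECRET_PATTERNS = {
    # AWS access key IDs commonly start with "AKIA" followed by 16 characters.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # PEM header that precedes a private key block.
    "private_key_block": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
}


def find_secrets(text):
    """Return a sorted list of (line_number, rule_name) for each matching line."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((line_number, rule_name))
    return sorted(hits)
```

Run against the text of each new public commit, a check like this flags candidate secrets within moments of publication, which is why both defenders and attackers can watch the same public feed.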


Alleged Kimwolf Botmaster ‘Dort’ Arrested, Charged in U.S. and Canada

Canadian authorities on Wednesday arrested a 23-year-old Ottawa man on suspicion of building and operating Kimwolf , a fast spreading Internet-of-Things botnet that enslaved millions of devices for use in a series of massive distributed denial-of-service (DDoS) attacks over the past six months. KrebsOnSecurity publicly named the suspect in February 2026 after the accused launched a volley of DDoS, doxing and swatting campaigns against this author and a security researcher. He now faces criminal hacking charges in both Canada and the United States. A criminal complaint unsealed today in an Alaska district court charges Jacob Butler , a.k.a. “ Dort ,” of Ottawa, Canada with operating the Kimwolf DDoS botnet. A statement from the Department of Justice says the complaint against Butler was unsealed following the defendant’s arrest in Canada by the Ontario Provincial Police pursuant to a U.S. extradition warrant. Butler is currently in Canadian custody awaiting an initial court hearing scheduled for early next week. The government said Kimwolf targeted infected devices which were traditionally “firewalled” from the rest of the internet, such as digital photo frames and web cameras. The infected systems were then rented to other cybercriminals, or forced to participate in record-smashing DDoS attacks, as well as assaults that affected Internet address ranges for the Department of Defense . Consequently, the DoD’s Defense Criminal Investigative Service is investigating the case, with assistance from the FBI field office in Anchorage. “KimWolf was tied to DDoS attacks which were measured at nearly 30 Terabits per second, a record in recorded DDoS attack volume,” the Justice Department statement reads. “These attacks resulted in financial losses which, for some victims, exceeded one million dollars. The KimWolf botnet is alleged to have issued over 25,000 attack commands.” On March 19, U.S. 
authorities joined international law enforcement partners in seizing the technical infrastructure for Kimwolf and three other large DDoS botnets (named Aisuru, JackSkid and Mossad) that were all competing for the same pool of vulnerable devices. On February 28, KrebsOnSecurity identified Butler as the Kimwolf botmaster after digging through his various email addresses, registrations on cybercrime forums, and posts to public Telegram and Discord servers. However, Dort continued to threaten and harass researchers who helped track down his real-life identity and dramatically slow the spread of his botnet. Dort claimed responsibility for at least two swatting attacks targeting the founder of Synthient, a security startup that helped to secure a widespread critical security weakness that Kimwolf was using to spread faster and more effectively than any other IoT botnet out there. Synthient was among many technology companies thanked by the Justice Department today, and Synthient’s founder Ben Brundage told KrebsOnSecurity he’s relieved Butler is in custody. “Hopefully this will end the harassment,” Brundage said. An excerpt from the criminal complaint against Butler, detailing how he ordered a swatting attack against Ben Brundage, the founder of the security firm Synthient. The government says investigators connected Butler to the administration of the KimWolf botnet through IP address, online account information, transaction records, and online messaging application records obtained through the issuance of legal process. The criminal complaint against Butler (PDF) shows he did little to separate his real-life and cybercriminal identities (something we demonstrated in our February unmasking of Dort). In April, the Justice Department joined authorities across Europe in seizing domain names tied to nearly four dozen DDoS-for-hire services, although because of a bureaucratic mix-up the list of seized domains has remained sealed until today.
The DOJ said at least one of those services collaborated with Butler’s Kimwolf botnet. A statement from the Ontario Provincial Police said a search warrant was executed on March 19 at Butler’s address in Ottawa, where they seized multiple devices. As a result of that investigation, Butler was arrested and charged this week with unauthorized use of computer; possession of device to obtain unauthorized use of computer system or to commit mischief; and mischief in relation to computer data. He is scheduled to remain in custody until a hearing on May 26. In the United States, Butler is facing one count of aiding and abetting computer intrusion. If extradited, tried and convicted in a U.S. court, Butler could face up to 10 years in prison, although that maximum sentence would likely be heavily tempered by considerations in the U.S. Sentencing Guidelines, which make allowances for mitigating factors such as youth, lack of criminal history and level of cooperation with investigators.

James Stanley 1 weeks ago

How to publish your secrets on Docker Hub

This week I have been looking inside public Docker images, with the aim of finding API keys etc. inside, and then reporting them and claiming bug bounties. It has been a partial success, in the sense that I found loads of private credentials inside public Docker images, and a partial failure, in the sense that I have not (yet?) received any bug bounties. There is an article on this kind of thing from flare.io in December . Feroz pointed out that all of the low-hanging fruit will have been picked already, and the remaining intersection between companies that leak secrets on Docker Hub, and companies that pay bug bounties, will be approximately 0. To do this work I built a tool to automatically pull down the latest pushed images on Docker Hub and grep them for secrets. I'm not releasing this because of the obvious potential for abuse. But I have released a public Docker Explorer tool for looking inside images manually. It's kind of surprising that Docker Hub doesn't have this kind of thing built-in. (Btw, pulling down lots of Docker images is very disk-intensive and my tool is very much vibe-coded, so it is possible that it will fall over soon, sorry). It lets you put in a public Docker image and look at the Dockerfile directives that built it, as well as the file contents of each layer (even if later deleted), extracts .zip and .jar files, and lets you explore bundled git repositories with gitweb . Docker Explorer is hosted on exe.dev . My brief review of exe.dev is that it is refreshingly geek-friendly, allowing configuration over SSH as well as the web interface. The billing model is a flat monthly fee for resources allocated, regardless of how many VMs you attach to them, which means you avoid the "surprise bankruptcy via AWS" scenario, and you also avoid paying another $10/mo every time you want to add a new VM. It automatically acquires TLS certificates for you, which is very convenient. 
The biggest downside is that as far as I can tell it only supports HTTP, you can't just run random other services and expose them to the internet. So it would be no good for hosting Protohackers solutions for example. Also no good for hosting a mail server, DNS server, IRC server, etc.; it's only for websites.

From looking in public Docker images so far I have come across: AWS keys, Google Cloud keys, SSH keys, Stripe keys, GitHub access tokens, GitHub passwords, OpenAI/Anthropic/OpenRouter API keys, SMTP passwords, Telegram bot tokens, MongoDB passwords, Postgres passwords, and an extremely long tail of API keys for various services I've never heard of before.

In many cases these seem to be included accidentally (e.g. a developer had the credentials on their local disk when they built the image and didn't realise they would be copied into it), but in probably most cases I think people put them in the image on purpose, to use them, but didn't realise that the image would be public! There is kind of a footgun with the Docker Hub free tier where it only lets you have one private image, and if you push any more images then they are just automatically public. So obviously watch out for that. What follows is a list of ways to publish these things on Docker Hub.

Hard-code the secrets into your source code

If you're looking to accidentally publish secrets, then you should be doing this already. Hard-coding secrets in the source code means you get to publish them in both your git repository and your container image without any extra work.

Put them in a .env file

Preferably you will commit the .env file to git so as to increase the attack surface. Putting secrets in a .env file makes them particularly easy to find because you can find them just by looking at filenames, without having to grep over the entire codebase. But even if you don't commit them to git, if you put them in the Docker image with "COPY . ."
then they will get included anyway if present on your local machine when you build the image.

Put them in the Dockerfile

Set the secret with an ENV directive in your Dockerfile. This does successfully avoid writing the secret to the image filesystem, but it is easy to see that the information is still there, otherwise your daemon wouldn't be able to read it. And in fact the environment variables are straightforwardly stored in the JSON metadata of the image. ARG is similar but for values that are only present while building the image, rather than running it. These also leak into the image metadata, so I would also suggest putting secrets in ARG directives if you want to leak them.

Delete them at build time

Copy your private SSH key to /root/.ssh/id_rsa with a COPY line in your Dockerfile, then delete it in a later step. If you docker run -it --rm image bash then you'll find that /root/.ssh/id_rsa has indeed been deleted. But because Docker builds up a container image as a series of "layers" that are applied on top of one another, you are free to extract the content at the layer created by the "COPY" line, and grab out the private SSH key. The Docker Build secrets documentation has suggestions for what to do if you don't want to leak credentials in your public images.

Hide them with .dockerignore

Add .env to your .dockerignore. Now when you copy your working directory into the Docker image with COPY . . , your .env file will be ignored. Boo! But your .git directory will still be included, so if .env was committed to git then it will still be accessible via the .git directory.

Leave them in .git/config

Including your .git directory in the image not only leaks your entire git repository contents, it also leaks the URLs to your remotes (typically just an "origin" on GitHub), which you may want to keep private, and credentials if you have configured any. Even if your project is open source and your git repository is public, your .git/config may contain secrets that you don't want to be made public. Namely, your GitHub credentials.
When the image is built using the GitHub actions/checkout action to clone the repository, it will be a "shallow clone" (i.e. it only contains the most recent commit), and will contain a GitHub token which expires when the job finishes, so it will already be revoked by the time you see it. The most recent commit still contains the committer name and email address as well as the commit message, so for a private repo it's still worth including if your goal is to leak secrets. I'd recommend always bundling .git into the image, because you never know, it might work.

Finally: never check

Having built a Docker image, never check it to see if there is anything inside that you didn't expect. That way you won't have to find out if you leaked any secrets and you can sleep easily.

What to actually do, real talk

Obviously, do the opposite of all of this! Don't commit secrets to git. Don't put .env files containing secrets into your Docker image. That much is obvious. Less obvious: don't put secrets in the Dockerfile. Don't put secrets into the image and then delete them later on. Don't copy the .git directory into the image. And maybe glance over your public images on Docker Explorer to check that you aren't leaking anything.

Xe Iaso 1 week ago

"No way to prevent this" say users of only language where this regularly happens

In the hours following the release of CVE-2026-45584 for the project Microsoft Windows, site reliability workers and systems administrators scrambled to desperately rebuild and patch all their systems to fix a memory safety vulnerability resulting in arbitrary code execution inside the virus scanner Windows Defender. This is due to the affected components being written in C++, the only programming language where these vulnerabilities regularly happen. "This was a terrible tragedy, but sometimes these things just happen and there's nothing anyone can do to stop them," said programmer Dr. Annabelle Connelly, echoing statements expressed by hundreds of thousands of programmers who use the only language where 90% of the world's memory safety vulnerabilities have occurred in the last 50 years, and whose projects are 20 times more likely to have security vulnerabilities. "It's a shame, but what can we do? There really isn't anything we can do to prevent memory safety vulnerabilities from happening if the programmer doesn't want to write their code in a robust manner." At press time, users of the only programming language in the world where these vulnerabilities regularly happen once or twice per quarter for the last eight years were referring to themselves and their situation as "helpless."

Blargh 1 week ago

Everything in C is undefined behavior

If he had been a programmer, Cardinal Richelieu would have said “Give me six lines written by the hand of the most expert C programmer in the world, and I will find enough in them to trigger undefined behavior”. Nobody can write correct C, or C++. And I say that as someone who’s written C and C++ on an almost daily basis for about 30 years. I listen to C++ podcasts. I watch C++ conference talks. I enjoy reading and writing C++.

C++ has served us well, but it’s 2026, and the environment of 1985 (C++) or 1972 (C) is not the environment of today. I’m definitely not the first to say this. I remember reading a post by someone prominent about a decade ago saying that a good case can be made that use of C++ is a SOX violation. And while I was not on board with the rest of their rant (nor their confusion about “its” vs “it’s”), I never disagreed about that point. With time I found it to be more and more true.

WAY more things are undefined behavior (UB) than you’d expect. Everyone knows that double-free, use after free, accessing outside the bounds of an object (e.g. array), and accessing uninitialized memory are UB. After all, C/C++ is not a memory safe language. And yet we as an industry seem to be unable to stop making even those mistakes over and over. But there’s more. More subtle. More illogical.

Some people seem to think that as long as they don’t compile with optimizations turned on, undefined behavior can’t hurt them. They believe that the compiler is somehow being deliberately hostile, going “AHA! UB! I can do whatever I want here!”, and without optimizations turned on it won’t. This is incorrect.

UB doesn’t mean that the compiler can take advantage of your sloppiness. UB means that the compiler can assume that your code is valid. It means that the intention of your code that’s oh so obvious when read by a human, doesn’t even have a way to be expressed between compiler stages or modules.
UB means that the compiler doesn’t even have to implement some special cases in its code generation, because they “can’t happen”. The compiler, and really the underlying hardware too, is playing a game of telephone with your UB intentions. It may end up with what you wanted, but there’s no guarantee for now or in the future.

The following is not an attempt at enumerating all the UB in the world. It’s merely making the case that UB is everywhere, and if nobody can do it right, how is it even fair to blame the programmer? My point is that ALL nontrivial C/C++ code has UB.

As an example of this, take a function that simply dereferences the pointer it is given. If this function is called with a pointer that is not correctly aligned (where “correctly aligned” probably means an address that’s a multiple of the pointed-to type’s size, but who knows), this is UB. C23 6.3.2.3. On Linux Alpha, in some cases this would merely trap to the kernel, which would software emulate what you intended. In other cases it would (probably) crash your program with a SIGBUS. On SPARC it would cause a SIGBUS. Sure, on x86/amd64 (henceforth just “x86”) this is likely fine. Hell, it’s probably even an atomic read. x86 is famously extremely forgiving about cache coherency subtleties.

So here we have three cases: the kernel gave a helping hand (Alpha, for some loads); a crash (other Alpha loads, and SPARC); and not a problem (x86). What about ARM, RISC-V, and others? What about future architectures? A future architecture could even have special pointer registers that do not populate the lowest bits, because such pointers cannot exist. Even if it works, maybe the compiler one day changes from using one load instruction to another, and suddenly that’s no longer fixed up by the kernel. Because the compiler is not obligated to generate assembly instructions that work on unaligned pointers. Because it’s UB.

Or how about a read that you expect to be atomic: is this operation atomic when the object is not correctly aligned? That’s the wrong question to ask. Mu, unask the question. It’s UB.
(But also yes, in practice this can easily be an atomicity problem.) If you want to get even more convinced, you can try thinking about what happens if an object you thought you were reading atomically spans pages. But don’t think too much about it, or you may conclude that “it’s fine”. It’s not. It’s UB.

Don’t blame the function above. The act of dereferencing the pointer wasn’t the problem. Merely creating the pointer was enough to be a problem. The cast that created it is the problem, not the function. It’s perfectly valid for the compiler to assign specific meaning, such as garbage collection or security tagging bits, to the lower bits of a pointer.

isxdigit() is a simple function that takes a character and returns nonzero if it’s a hex digit: 0-9, a-f, or A-F. It can also take the value EOF. Uh, ok. What value is EOF? Per C23 7.4p1 we know it’s an int, and we can infer that it’s not representable by unsigned char. isxdigit() therefore takes an int, not a char. All values of char fit inside int, so we should be fine. Casting from char to int fits, so per section 6.3.1.3 we’re fine, right?

No. Because if isxdigit() is called with a char value other than 0-127, and on your architecture char is signed (implementation defined, per 6.2.5, paragraph 20 in C23), then the integer value ends up negative. And a table lookup indexed directly by the argument is a valid implementation of isxdigit(), one that would cause a read of who-knows-what memory. It could even be I/O mapped memory, triggering things to happen that are more than merely getting a random value or a crash. It could cause the motor to start. Less likely in an application running in a desktop operating system than in an embedded system, sure. But there are user space network drivers (for performance), so even user space won’t protect you.

Converting a float to an integer type is UB if the truncated value cannot be represented in that integer type (C23 6.3.1.4). And, by omission, it’s also UB if the float is a non-finite value. So how do you compare a float to INT_MAX? Do you cast the float to int? No, that’s the UB you want to avoid. So you cast INT_MAX to float? How do you know it can be represented exactly? Maybe casting INT_MAX to float rounds to a value not representable in int, and your comparison becomes non-representative?
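One commonly used way around the comparison problem, offered here as a minimal sketch rather than as the author's own code, is to compare against powers of two, which convert to float exactly. The helper name float_to_int_checked is hypothetical, and the sketch assumes a two's complement int (which C23 requires) and a binary float whose exponent range covers INT_MAX + 1, as IEEE 754 single precision does.

```c
#include <limits.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: stores the truncated value of f in *out and returns
 * true only when the conversion is defined, i.e. the truncated value is
 * representable as an int. Returns false for NaN, infinities, and
 * out-of-range values, leaving *out untouched. */
bool float_to_int_checked(float f, int *out) {
    /* Assumption: INT_MIN is -2^(N-1), a power of two, so this cast is exact. */
    const float lower = (float)INT_MIN;
    /* 2^(N-1) is INT_MAX + 1, which is also exactly representable. */
    const float upper = -lower;

    if (isnan(f)) {
        return false;
    }
    /* Half-open range [lower, upper): every float in it truncates to a
     * value that fits in int. Infinities fail one of the comparisons. */
    if (f < lower || f >= upper) {
        return false;
    }
    *out = (int)f;
    return true;
}
```

A caller converting seconds to integer milliseconds could multiply first and then pass the product through a check like this instead of casting directly. Whether the stated assumptions hold on every platform you care about is still something you have to verify yourself.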
Maybe a conservative range check works? You’ll miss out on representing some really high values, but maybe that’s OK? I just wanted to convert a float to an int. :-( I bet there’s lots of code out there that takes a value in seconds and converts it to integer milliseconds by just multiplying and casting.

Most programmers won’t have to deal with this, but I don’t think there’s any C standards compliant way in practice to put an object at address zero. This can come up in OS kernel and embedded coding. By 6.3.2.3 an integer constant zero (which is convertible to a pointer), such a constant cast to void *, and nullptr are “null pointer constants” (which I’ll just call NULL). C doesn’t specify that the actual pointer points at machine address zero, because the C standard only talks of the C abstract machine, not about hardware. All C guarantees is that if you compare NULL to zero you’ll see them equal. But for all you know that’s because the zero is converted to the native platform’s null pointer, which happens to be something other than all zero bits. It also explicitly says that dereferencing a null pointer, no matter what the value, is undefined behavior. It’s the example of UB under 3.4.3.

This also means that you can’t assume that zeroing memory with memset() will create a NULL pointer! You cannot initialize your structs this way and assume member pointers are NULL! And this does apply to most programmers. And yes, some historic machines used non-zero NULL pointers.

But let’s say you have a modern machine, where NULL is a pointer to address zero, and you actually have an object there. Again, C 6.3.2.3 says that a null pointer compares unequal to “any object or function”. So calling a function through a pointer to address zero is UB: C says “there is no function there”. For all you know the compiler has no internal way to even express your intention here. You may argue that “but surely it’ll just emit a call instruction to the bit pattern of all zeroes? Nothing else seems reasonable.” What is “all zeroes”, though? On 16bit x86, is it offset zero in the current segment? Is it segment zero, offset zero?

Passing a bare NULL as the terminating argument of a variadic function is UB. Passing NULL cast to the expected pointer type is not. Because the argument needs to be a pointer, and the NULL macro may be misinterpreted as an integer zero.
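The variadic case above can be sketched as follows. This is an illustration under assumptions, not the author's snippet: count_strings is a hypothetical function that, like POSIX execl(), expects its argument list to end with a null pointer of type char *.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical variadic function: counts its string arguments up to a
 * terminating null pointer. Every variadic argument is read back as a
 * char *, so every caller must actually pass a char *. */
size_t count_strings(const char *first, ...) {
    size_t n = 0;
    va_list ap;

    va_start(ap, first);
    for (const char *s = first; s != NULL; s = va_arg(ap, char *)) {
        n++;
    }
    va_end(ap);
    return n;
}
```

A call such as count_strings("a", "b", (char *)NULL) passes a real pointer as the sentinel. Writing count_strings("a", "b", NULL) instead relies on how the implementation happens to define NULL: if it expands to a plain integer zero, an int is passed where va_arg reads a char *, which is the mismatch described above.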
Similarly, passing printf() an argument whose type doesn’t match its format specifier is UB. It needs to be cast to the type the specifier expects. So how do you print a value of an integer type whose width and signedness you don’t know? Well, you could cast it to the widest unsigned integer type and print it using the matching specifier. But is the type even unsigned? Oh well, worst case you get a nonsense value printed instead of -1, I guess.

Integer division by zero is UB. Sure, you probably knew this. But did you consider the security aspects of it? It’s not rare for the denominator to come from untrusted input.

And there’s so much more. The C23 standard contains 283 uses of the word “undefined”. And that’s not even including the things that are undefined by omission. Nobody can keep integer promotion rules in mind at code skimming speeds. Nobody. This post is already long enough, but as a start: “No way to parse integers in C”, “Integer handling is broken”, “UB in the Linux kernel”, and “Integer promotion”.

Point an LLM at ANY C code, asking it to find UB, and it will. And it’ll be right almost all the time, nowadays. I felt a bit bad after it correctly found ones in my code, so I thought I’d point it at the mature and pedantically written OpenBSD. I just picked the first tool I could think of, and it spit out a bunch. I sent the project a patch for an out of bounds write (and also for a non-UB logic bug). I didn’t send them patches for the UB that was left and right, partly because the OpenBSD project has not been very receptive in the past to bug reports, partly because of my sense of “this is probably fine, in practice”, and partly because if OpenBSD wants to weed out UB from their code base, then that’s a major project that should be done in a better way than me just being the middle man between the LLM and them for a patch here and there.

We can’t just throw away our C/C++ code bases. But leaving them inherently broken is also not an option. We need some way of fixing UB at scale, without committing AI slop or overwhelming human reviewers. This too is not a new opinion, nor a great revelation. But yes, writing C/C++ in 2026 without an LLM supervising you for UB should probably be seen as a SOX violation, and just plain irresponsible. If OpenBSD people can’t find these problems given 30+ years, what chance do the rest of us have?
It may not scale to large code bases, but for my own projects I’ve asked the LLM to find UB, if necessary explain it, and fix it. And then I stare at the output until I can confirm the issue and the fix. A problem with this is that in order to confirm the findings, you’ll need an expert human. But generally expert humans are busy doing other things. This is janitor work, but too subtle to leave to the junior programmers who have traditionally been assigned janitor work.


CISA Admin Leaked AWS GovCloud Keys on Github

Until this past weekend, a contractor for the Cybersecurity & Infrastructure Security Agency (CISA) maintained a public GitHub repository that exposed credentials to several highly privileged AWS GovCloud accounts and a large number of internal CISA systems. Security experts said the public archive included files detailing how CISA builds, tests and deploys software internally, and that it represents one of the most egregious government data leaks in recent history.

On May 15, KrebsOnSecurity heard from Guillaume Valadon, a researcher with the security firm GitGuardian. Valadon’s company constantly scans public code repositories at GitHub and elsewhere for exposed secrets, automatically alerting the offending accounts of any apparent sensitive data exposures. Valadon said he reached out because the owner in this case wasn’t responding and the information exposed was highly sensitive.

(Image: A redacted screenshot of the now-defunct “Private CISA” repository maintained by a CISA contractor.)

The GitHub repository that Valadon flagged was named “Private-CISA,” and it harbored a vast number of internal CISA/DHS credentials and files, including cloud keys, tokens, plaintext passwords, logs and other sensitive CISA assets. Valadon said the exposed CISA credentials represent a textbook example of poor security hygiene, noting that the commit logs in the offending GitHub account show that the CISA administrator disabled the default setting in GitHub that blocks users from publishing SSH keys or other secrets in public code repositories.

“Passwords stored in plain text in a csv, backups in git, explicit commands to disable GitHub secrets detection feature,” Valadon wrote in an email. “I honestly believed that it was all fake before analyzing the content deeper. This is indeed the worst leak that I’ve witnessed in my career.
It is obviously an individual’s mistake, but I believe that it might reveal internal practices.”

One of the exposed files, titled “importantAWStokens,” included the administrative credentials to three Amazon AWS GovCloud servers. Another file exposed in their public GitHub repository, “AWS-Workspace-Firefox-Passwords.csv,” listed plaintext usernames and passwords for dozens of internal CISA systems. According to Caturegli, those systems included one called “LZ-DSO,” which appears short for “Landing Zone DevSecOps,” the agency’s secure code development environment.

Philippe Caturegli, founder of the security consultancy Seralys, said he tested the AWS keys only to see whether they were still valid and to determine which internal systems the exposed accounts could access. Caturegli said the GitHub account that exposed the CISA secrets exhibits a pattern consistent with an individual operator using the repository as a working scratchpad or synchronization mechanism rather than a curated project repository.

“The use of both a CISA-associated email address and a personal email address suggests the repository may have been used across differently configured environments,” Caturegli observed. “The available Git metadata alone does not prove which endpoint or device was used.”

(Image: The Private CISA GitHub repo exposed dozens of plaintext credentials for important CISA GovCloud resources.)

Caturegli said he validated that the exposed credentials could authenticate to three AWS GovCloud accounts at a high privilege level. He said the archive also includes plain text credentials to CISA’s internal “artifactory” (essentially a repository of all the code packages they are using to build software) and that this would represent a juicy target for malicious attackers looking for ways to maintain a persistent foothold in CISA systems.

“That would be a prime place to move laterally,” he said.
“Backdoor in some software packages, and every time they build something new they deploy your backdoor left and right.”

In response to questions, a spokesperson for CISA said the agency is aware of the reported exposure and is continuing to investigate the situation. “Currently, there is no indication that any sensitive data was compromised as a result of this incident,” the CISA spokesperson wrote. “While we hold our team members to the highest standards of integrity and operational awareness, we are working to ensure additional safeguards are implemented to prevent future occurrences.”

A review of the GitHub account and its exposed passwords shows the “Private CISA” repository was maintained by an employee of Nightwing, a government contractor based in Dulles, Va. Nightwing declined to comment, directing inquiries to CISA.

CISA has not responded to questions about the potential duration of the data exposure, but Caturegli said the Private CISA repository was created on November 13, 2025. The contractor’s GitHub account was created back in September 2018. The GitHub account that included the Private CISA repo was taken offline shortly after both KrebsOnSecurity and Seralys notified CISA about the exposure. But Caturegli said the exposed AWS keys inexplicably continued to remain valid for another 48 hours.

CISA is currently operating with only a fraction of its normal budget and staffing levels. The agency has lost nearly a third of its workforce since the beginning of the second Trump administration, which forced a series of early retirements, buyouts, and resignations across the agency’s various divisions.

The now-defunct Private CISA repo showed the contractor also used easily-guessed passwords for a number of internal resources; for example, many of the credentials used a password consisting of each platform’s name followed by the current year.
Caturegli said such practices would constitute a serious security threat for any organization even if those credentials were never exposed externally, noting that threat actors often use key credentials exposed on the internal network to expand their reach after establishing initial access to a targeted system. “What I suspect happened is [the CISA contractor] was using this GitHub to synchronize files between a work laptop and a home computer, because he has regularly committed to this repo since November 2025,” Caturegli said. “This would be an embarrassing leak for any company, but it’s even more so in this case because it’s CISA.”

ava's blog 1 week ago

i suspect there is a new uber data leak

For the last 2 weeks, I have received 2FA codes for my UberEats account that I did not request, roughly every couple of days. The account was only created a year ago and used twice, then never again. The email address and password were created specifically for it and are not used elsewhere. No other accounts and services I use are affected. A quick search shows others have been dealing with the same very recently (~3 weeks).

I logged in and tried to start the account deletion process. It ends in a white screen with no confirmation, and you get no “Sad to see you go” email or anything else. If you’re lucky, you’re forcibly logged out and take it as a sign that it worked. That’s a shitty process. The info says they deactivate for 30 days and then fully delete, and I had hoped that would stop the 2FA requests, but alas, it did not. They still let you attempt to log in and send a code, and I have no idea if that stops the deletion process or resets the 30 days.

As I did not want a random person thwarting my account deletion, I once again logged in, changed all personal information and the password, and started the account deletion again (same bullshit). I have contacted their privacy team to let them know. We’ll see what they say. I also requested them to delete my account in case the deletion process failed.

I haven’t yet seen any notice about this on their website or in the media. 🤷🏻‍♀️ If you have an Uber/UberEats account, consider changing your password.

Published 17 May, 2026

Kev Quirk 2 weeks ago

Is Bitwarden preparing for a sale?

by Jan-Lukas Else

Jan-Lukas writes about the warning signs that Bitwarden might be heading for a private equity sale. The irony is that the founder built Bitwarden because he didn't trust what happened when LastPass got acquired.

I saw this on the fedi this morning and it made me let out a big sigh. I was an early adopter of Bitwarden, having used it for nearly 10 years at this point, after LastPass were acquired by LogMeIn.

If this does come to fruition (I really hope it doesn't) I'm not sure what I'd do. My wife and I have a family account and share many credentials, so whatever I potentially flip to would need to be super simple to use, like Bitwarden. The fact that Bitwarden is so simple to use, yet so secure, is a testament to how good of a product it really is. So I'd rather not jump ship.

In the Fast Company post that Jan-Lukas links to, there's a quote following an email from Bitwarden's "chief customer officer", Gary Orenstein, saying:

Orenstein says via email that Bitwarden is not seeking a buyer, and that Sullivan’s [new CEO] appointment “reflects a continued focus at Bitwarden on scaling the business and serving customers globally.”

That gives me some hope, but it could also be corporate bullshit - let's be honest, it wouldn't be the first time. I'm not going to make any rash decisions though. I get a tonne of use from Bitwarden, so I don't want to move unless I have to. Even if they are sold, I'd have to consider my options once I know who they've potentially been sold to. For now it's business as usual for me and my password manager.
