Latest Posts (7 found)
devansh Today

Reflections on my 5 years at HackerOne

Today marks 5 years at HackerOne for me. I joined in 2020 as a Product Security Analyst while I was still an undergrad student. I’m grateful to now be serving as a Team Lead (Technical Services). A few reflections: Grateful for the people at HackerOne who took chances on me, challenged my thinking, and trusted me with more responsibility than I thought I was ready for. An even bigger thanks to the hackers whose reports I’ve had the chance to read over all these years. Five years in, still learning, still a work in progress :) None of this is solo. Good managers, patient teammates, and sharp hackers did more for my growth than any “self-made” narrative. Title changes are visible; real growth is not. It’s in how you listen, decide, and own mistakes. Luck is underrated. Being in a high-trust, high-talent environment at the right time matters more than we admit. "I don’t know" is not a weakness. It’s usually the start of the right conversation. As an Individual contributor, you optimize for being right. As a lead, you optimize for the team being effective. Very different job. Escalations and incidents expose culture fast. Blame travels down; responsibility travels up. Saying "no" clearly is kinder than saying "yes" and disappearing. Tools change every year. Principles - ownership, clarity, curiosity - don’t. If you stop learning, your experience is just 1 year repeated 5 times. Constraints are not excuses, they are design inputs for how you grow. Reading reports from hackers is a privilege, a free, continuous education from some of the sharpest minds on the internet. The hardest shift is from “How do I prove myself?” to “How do I make others successful?”. Calm execution during chaos beats heroic last-minute rescue every single time. Depth compounds. Understanding one concept end-to-end teaches you more than skimming ten. Feedback that makes you uncomfortable is usually the feedback you needed two months ago. High standards without empathy create fear. Empathy without standards creates mediocrity. You need both. You outgrow roles faster than you outgrow habits. Updating your habits is the real promotion. If everything is urgent, nothing is important. Prioritization is a leadership skill, not a calendar trick. Writing forces clarity. If you can’t explain it simply, you probably don’t understand it yet. Most “communication issues” are unasked questions and unspoken assumptions. Systems outlive heroes. Fix the system, don’t search for a savior. Being technically right and practically useless is still a miss. A 1% better process, repeated daily, beats a once-a-year “big transformation”. You can borrow context, but you can’t outsource judgment. That part you have to earn. Your manager sees some of the picture. Customers see another part. Hackers see yet another. Listen to all three. Imposter syndrome never fully leaves. You just learn to move with it instead of freezing because of it. Generosity with knowledge is not optional. Someone did it for you when you had nothing to trade. Gratitude is a strategy, not just a feeling. It keeps you curious, grounded, and willing to start at zero again. Stay hungry, very very hungry . The real hunger for growth can’t be fully satisfied, the moment it feels “enough,” it was never true hunger. The goalpost should keep moving, not out of insecurity, but out of a genuine desire to keep stretching what you can learn, build, and contribute.

0 views
devansh 3 weeks ago

Hitchhiker's Guide to Attack Surface Management

I first heard about the word "ASM" (i.e., Attack Surface Management) probably in late 2018, and I thought it must be some complex infrastructure for tracking assets of an organization. Looking back, I realize I almost had a similar stack for discovering, tracking, and detecting obscure assets of organizations, and I was using it for my bug hunting adventures. I feel my stack was kinda goated, as I was able to find obscure assets of Apple, Facebook, Shopify, Twitter, and many other Fortune 100 companies, and reported hundreds of bugs, all through automation. Back in the day, projects like ProjectDiscovery were not present, so if I had to write an effective port scanner, I had to do it from scratch. (Masscan and nmap were present, but I had my fair share of issues using them, this is a story for another time). I used to write DNS resolvers (massdns had a high error rate), port scanners, web scrapers, directory brute-force utilities, wordlists, lots of JavaScript parsing logic using regex, and a hell of a lot of other things. I used to have up to 50+ self-developed tools for bug-bounty recon stuff and another 60-something helper scripts written in bash. I used to orchestrate (gluing together with duct tape is a better word) and slap together scripts like a workflow, and save the output in text files. Whenever I dealt with a large number of domains, I used to distribute the load over multiple servers (server spin-up + SSH into it + SCP for pushing and pulling files from it). The setup was very fragile and error-prone, and I spent countless nights trying to debug errors in the workflows. But it was all worth it. I learned the art of Attack Surface Management without even trying to learn about it. I was just a teenager trying to make quick bucks through bug hunting, and this fragile, duct-taped system was my edge. Fast forward to today, I have now spent almost a decade in the bug bounty scene. I joined HackerOne in 2020 (to present) as a vulnerability triager, where I have triaged and reviewed tens of thousands of vulnerability submissions. Fair to say, I have seen a lot of things, from doomsday level 0-days, to reports related to leaked credentials which could have led to entire infrastructure compromise, just because some dev pushed an AWS secret key in git logs, to things where some organizations were not even aware they were running Jenkins servers on some obscure subdomain which could have allowed RCE and then lateral movement to other layers of infrastructure. A lot of these issues I have seen were totally avoidable, only if organizations followed some basic attack surface management techniques. If I search "Guide to ASM" on Internet, almost none of the supposed guides are real resources. They funnel you to their own ASM solution, and the guide is just present there to provide you with some surface-level information, and is mostly a marketing gimmick. This is precisely why I decided to write something where I try to cover everything I learned and know about ASM, and how to protect your organization's assets before bad actors could get to them. This is going to be a rough and raw guide, and will not lead you to a funnel where I am trying to sell my own ASM SaaS to you. I have nothing to sell, other than offering what I know. But in case you are an organization who needs help implementing the things I am mentioning below, you can reach out to me via X or email (both available on the homepage of this blog). This guide will provide you with insights into exactly how big your attack surface really is. CISOs can look at it and see if their organizations have all of these covered, security researchers and bug hunters can look at this and maybe find new ideas related to where to look during recon. Devs can look at it and see if they are unintentionally leaving any door open for hackers. If you are into security, it has something to offer you. Attack surface is one of those terms getting thrown around in security circles so much that it's become almost meaningless noise. In theory, it sounds simple enough, right. Attack surface is every single potential entry point, interaction vector, or exploitable interface an attacker could use to compromise your systems, steal your data, or generally wreck your day. But here's the thing, it's the sum total of everything you've exposed to the internet. Every API endpoint you forgot about, every subdomain some dev spun up for "testing purposes" five years ago and then abandoned, every IoT device plugged into your network, every employee laptop connecting from a coffee shop, every third-party vendor with a backdoor into your environment, every cloud storage bucket with permissions that make no sense, every Slack channel, every git commit leaking credentials, every paste on Pastebin containing your database passwords. Most organizations think about attack surface in incredibly narrow terms. They think if they have a website, an email server, and maybe some VPN endpoints, they've got "good visibility" into their assets. That's just plain wrong. Straight up wrong. Your actual attack surface would terrify you if you actually understood it. You run , and is your main domain. You probably know about , , maybe . But what about that your intern from 2015 spun up and just never bothered to delete. It's not documented anywhere. Nobody remembers it exists. Domain attack surface goes way beyond what's sitting in your asset management system. Every subdomain is a potential entry point. Most of these subdomains are completely forgotten. Subdomain enumeration is reconnaissance 101 for attackers and bug hunters. It's not rocket science. Setting up a tool that actively monitors through active and passive sources for new subdomains and generates alerts is honestly an hour's worth of work. You can use tools like Subfinder, Amass, or just mine Certificate Transparency logs to discover every single subdomain connected to your domain. Certificate Transparency logs were designed to increase security by making certificate issuance public, and they've become an absolute reconnaissance goldmine. Every time you get an SSL certificate for , that information is sitting in public logs for anyone to find. Attackers systematically enumerate these subdomains using Certificate Transparency log searches, DNS brute-forcing with massive wordlists, reverse DNS lookups to map IP ranges back to domains, historical DNS data from services like SecurityTrails, and zone transfer exploitation if your DNS is misconfigured. Attackers are looking for old development environments still running vulnerable software, staging servers with production data sitting on them, forgotten admin panels, API endpoints without authentication, internal tools accidentally exposed, and test environments with default credentials nobody changed. Every subdomain is an asset. Every asset is a potential vulnerability. Every vulnerability is an entry point. Domains and subdomains are just the starting point though. Once you've figured out all the subdomains belonging to your organization, the next step is to take a hard look at IP address space, which is another absolutely massive component of your attack surface. Organizations own, sometimes lease, IP ranges, sometimes small /24 blocks, sometimes massive /16 ranges, and every single IP address in those blocks and ranges that responds to external traffic is part of your attack surface. And attackers enumerate them all if you won't. They use WHOIS lookups to identify your IP ranges, port scanning to find what services are running where, service fingerprinting to identify exact software versions, and banner grabbing to extract configuration information. If you have a /24 network with 256 IP addresses and even 10% of those IPs are running services, you've got 25 potential attack vectors. Scale that to a /20 or /16 and you're looking at thousands of potential entry points. And attackers aren't just looking at the IPs you know about. They're looking at adjacent IP ranges you might have acquired through mergers, historical IP allocations that haven't been properly decommissioned, and shared IP ranges where your servers coexist with others. Traditional infrastructure was complicated enough, and now we have cloud. It's literally exploded organizations' attack surfaces in ways that are genuinely difficult to even comprehend. Every cloud service you spin up, be it an EC2 instance, S3 bucket, Lambda function, or API Gateway endpoint, all of this is a new attack vector. In my opinion and experience so far, I think the main issue with cloud infrastructure is that it's ephemeral and distributed. Resources get spun up and torn down constantly. Developers create instances for testing and forget about them. Auto-scaling groups generate new resources dynamically. Containerized workloads spin up massive Kubernetes clusters you have minimal visibility into. Your cloud attack surface could be literally anything. Examples are countless, but I'd categorize them into 8 different categories. Compute instances like EC2, Azure VMs, GCP Compute Engine instances exposed to the internet. Storage buckets like S3, Azure Blob Storage, GCP Cloud Storage with misconfigured permissions. Serverless stuff like Lambda functions with public URLs or overly permissive IAM roles. API endpoints like API Gateway, Azure API Management endpoints without proper authentication. Container registries like Docker images with embedded secrets or vulnerabilities. Kubernetes clusters with exposed API servers, misconfigured network policies, vulnerable ingress controllers. Managed databases like RDS, CosmosDB, Cloud SQL instances with weak access controls. IAM roles and service accounts with overly permissive identities that enable privilege escalation. I've seen instances in the past where a single misconfigured S3 bucket policy exposed terabytes of data. An overly permissive Lambda IAM role enabled lateral movement across an entire AWS account. A publicly accessible Kubernetes API server gave an attacker full cluster control. Honestly, cloud kinda scares me as well. And to top it off, multi-cloud infrastructure makes everything worse. If you're running AWS, Azure, and GCP together, you've just tripled your attack surface management complexity. Each cloud provider has different security models, different configuration profiles, and different attack vectors. Every application now uses APIs, and all applications nowadays are like a constellation of APIs talking to each other. Every API you use in your organization is your attack surface. The problem with APIs is that they're often deployed without the same security scrutiny as traditional web applications. Developers spin up API endpoints for specific features and those endpoints accumulate over time. Some of them are shadow APIs, meaning API endpoints which aren't documented anywhere. These endpoints are the equivalent of forgotten subdomains, and attackers can find them through analyzing JavaScript files for API endpoint references, fuzzing common API path patterns, examining mobile app traffic to discover backend APIs, and mining old documentation or code repositories for deprecated endpoints. Your API attack surface includes REST APIs exposed to the internet, GraphQL endpoints with overly broad query capabilities, WebSocket connections for real-time functionality, gRPC services for inter-service communication, and legacy SOAP APIs that never got decommissioned. If your organization has mobile apps, be it iOS, Android, or both, this is a direct window to your infrastructure and should be part of your attack surface management strategy. Mobile apps communicate with backend APIs and those API endpoints are discoverable by reversing the app. The reversed source of the app could reveal hard-coded API keys, tokens, and credentials. Using JADX plus APKTool plus Dex2jar is all a motivated attacker needs. Web servers often expose directories and files that weren't meant to be publicly accessible. Attackers systematically enumerate these using automated tools like ffuf, dirbuster, gobuster, and wfuzz with massive wordlists to discover hidden endpoints, configuration files, backup files, and administrative interfaces. Common exposed directories include admin panels, backup directories containing database dumps or source code, configuration files with database credentials and API keys, development directories with debug information, documentation directories revealing internal systems, upload directories for file storage, and old or forgotten directories from previous deployments. Your attack surface must include directories which are accidentally left accessible during deployments, staging servers with production data, backup directories with old source code versions, administrative interfaces without authentication, API documentation exposing endpoint details, and test directories with debug output enabled. Even if you've removed a directory from production, old cached versions may still be accessible through web caches or CDNs. Search engines also index these directories, making them discoverable through dorking techniques. If your organization is using IoT devices, and everyone uses these days, this should be part of your attack surface management strategy. They're invisible to traditional security tools. Your EDR solution doesn't protect IoT devices. Your vulnerability scanner can't inventory them. Your patch management system can't update them. Your IoT attack surface could include smart building systems like HVAC, lighting, access control. Security cameras and surveillance systems. Printers and copiers, which are computers with network access. Badge readers and physical access systems. Industrial control systems and SCADA devices. Medical devices in healthcare environments. Employee wearables and fitness trackers. Voice assistants and smart speakers. The problem with IoT devices is that they're often deployed without any security consideration. They have default credentials that never get changed, unpatched firmware with known vulnerabilities, no encryption for data in transit, weak authentication mechanisms, and insecure network configurations. Social media presence is an attack surface component that most organizations completely ignore. Attackers can use social media for reconnaissance by looking at employee profiles on LinkedIn to reveal organizational structure, technologies in use, and current projects. Twitter/X accounts can leak information about deployments, outages, and technology stack. Employee GitHub profiles expose email patterns and development practices. Company blogs can announce new features before security review. It could also be a direct attack vector. Attackers can use information from social media to craft convincing phishing attacks. Hijacked social media accounts can be used to spread malware or phishing links. Employees can accidentally share sensitive information. Fake accounts can impersonate your brand to defraud customers. Your employees' social media presence is part of your attack surface whether you like it or not. Third-party vendors, suppliers, contractors, or partners with access to your systems should be part of your attack surface. Supply chain attacks are becoming more and more common these days. Attackers can compromise a vendor with weaker security and then use that vendor's access to reach your environment. From there, they pivot from the vendor network to your systems. This isn't a hypothetical scenario, it has happened multiple times in the past. You might have heard about the SolarWinds attack, where attackers compromised SolarWinds' build system and distributed malware through software updates to thousands of customers. Another famous case study is the MOVEit vulnerability in MOVEit Transfer software, exploited by the Cl0p ransomware group, which affected over 2,700 organizations. These are examples of some high-profile supply chain security attacks. Your third-party attack surface could include things like VPNs, remote desktop connections, privileged access systems, third-party services with API keys to your systems, login credentials shared with vendors, SaaS applications storing your data, and external IT support with administrative access. It's obvious you can't directly control third-party security. You can audit them, have them pen-test their assets as part of your vendor compliance plan, and include security requirements in contracts, but ultimately their security posture is outside your control. And attackers know this. GitHub, GitLab, Bitbucket, they all are a massive attack surface. Attackers search through code repositories in hopes of finding hard-coded credentials like API keys, database passwords, and tokens. Private keys, SSH keys, TLS certificates, and encryption keys. Internal architecture documentation revealing infrastructure details in code comments. Configuration files with database connection strings and internal URLs. Deprecated code with vulnerabilities that's still in production. Even private repositories aren't safe. Attackers can compromise developer accounts to access private repositories, former employees retain access after leaving, and overly broad repository permissions grant access to too many people. Automated scanners continuously monitor public repositories for secrets. The moment a developer accidentally pushes credentials to a public repository, automated systems detect it within minutes. Attackers have already extracted and weaponized those credentials before the developer realizes the mistake. CI/CD pipelines are massive another attack vector. Especially in recent times, and not many organizations are giving attention to this attack vector. This should totally be part of your attack surface management. Attackers compromise GitHub Actions workflows with malicious code injection, Jenkins servers with weak authentication, GitLab CI/CD variables containing secrets, and build artifacts with embedded malware. The GitHub Actions supply chain attack, CVE-2025-30066, demonstrated this perfectly. Attackers compromised the Action used in over 23,000 repositories, injecting malicious code that leaked secrets from build logs. Jenkins specifically is a goldmine for attackers. An exposed Jenkins instance provides complete control over multiple critical servers, access to hardcoded AWS keys, Redis credentials, and BitBucket tokens, ability to manipulate builds and inject malicious code, and exfiltration of production database credentials containing PII. Modern collaboration tools are massive attack surface components that most organizations underestimate. Slack has hidden security risks despite being invite-only. Slack attack surface could include indefinite data retention where every message, channel, and file is stored forever unless admins configure retention periods. Public channels accessible to all users so one breached account opens the floodgates. Third-party integrations with excessive permissions accessing messages and user data. Former contractor access where individuals retain access long after projects end. Phishing and impersonation where it's easy to change names and pictures to impersonate senior personnel. In 2022, Slack leaked hashed passwords for five years affecting 0.5% of users. Slack channels commonly contain API keys, authentication tokens, database credentials, customer PII, financial data, internal system passwords, and confidential project information. The average cost of a breached record was $164 in 2022. When 1 in 166 messages in Slack contains confidential information, every new message adds another dollar to total risk exposure. With 5,000 employees sending 30 million Slack messages per year, that's substantial exposure. Trello board exposure is a significant attack surface. Trello attack vectors include public boards with sensitive information accidentally shared publicly, default public visibility where boards are created as public by default in some configurations, unsecured REST API allowing unauthenticated access to user data, and scraping attacks where attackers use email lists to enumerate Trello accounts. The 2024 Trello data breach exposed 15 million users' personal information when a threat actor named "emo" exploited an unsecured REST API using 500 million email addresses to compile detailed user profiles. Security researcher David Shear documented hundreds of public Trello boards exposing passwords, credentials, IT support customer access details, website admin logins, and client server management credentials. IT companies were using Trello to troubleshoot client requests and manage infrastructure, storing all credentials on public Trello boards. Jira misconfiguration is a widespread attack surface issue. Common misconfigurations include public dashboards and filters with "Everyone" access actually meaning public internet access, anonymous access enabled allowing unauthenticated users to browse, user picker functionality providing complete lists of usernames and email addresses, and project visibility allowing sensitive projects to be accessible without authentication. Confluence misconfiguration exposes internal documentation. Confluence attack surface components include anonymous access at site level allowing public access, public spaces where space admins grant anonymous permissions, inherited permissions where all content within a space inherits space-level access, and user profile visibility allowing anonymous users to view profiles of logged-in users. When anonymous access is enabled globally and space admins allow anonymous users to access their spaces, anyone on the internet can access that content. Confluence spaces often contain internal documentation with hardcoded credentials, financial information, project details, employee information, and API documentation with authentication details. Cloud storage misconfiguration is epidemic. Google Drive misconfiguration attack surface includes "Anyone with the link" sharing making files accessible without authentication, overly permissive sharing defaults making it easy to accidentally share publicly, inherited folder permissions exposing everything beneath, unmanaged third-party apps with excessive read/write/delete permissions, inactive user accounts where former employees retain access, and external ownership blind spots where externally-owned content is shared into the environment. Metomic's 2023 Google Scanner Report found that of 6.5 million Google Drive files analyzed, 40.2% contained sensitive information, 34.2% were shared externally, and 0.5% were publicly accessible, mostly unintentionally. In December 2023, Japanese game developer Ateam suffered a catastrophic Google Drive misconfiguration that exposed personal data of nearly 1 million people for over six years due to "Anyone with the link" settings. Based on Valence research, 22% of external data shares utilize open links, and 94% of these open link shares are inactive, forgotten files with public URLs floating around the internet. Dropbox, OneDrive, and Box share similar attack surface components including misconfigured sharing permissions, weak or missing password protection, overly broad access grants, third-party app integrations with excessive permissions, and lack of visibility into external sharing. Features that make file sharing convenient create data leakage risks when misconfigured. Pastebin and similar paste sites are both reconnaissance sources and attack vectors. Paste site attack surface includes public data dumps of stolen credentials, API keys, and database dumps posted publicly, malware hosting of obfuscated payloads, C2 communications where malware uses Pastebin for command and control, credential leakage from developers accidentally posting secrets, and bypassing security filters since Pastebin is legitimate so security tools don't block it. For organizations, leaked API keys or database credentials on Pastebin lead to unauthorized access, data exfiltration, and service disruption. Attackers continuously scan Pastebin for mentions of target organizations using automated tools. Security teams must actively monitor Pastebin and similar paste sites for company name mentions, email domain references, and specific keywords related to the organization. Because paste sites don't require registration or authentication and content is rarely removed, they've become permanent archives of leaked secrets. Container registries expose significant attack surface. Container registry attack surface includes secrets embedded in image layers where 30,000 unique secrets were found in 19,000 images, with 10% of scanned Docker images containing secrets, and 1,200 secrets, 4%, being active and valid. Immutable cached layers contain 85% of embedded secrets that can't be removed, exposed registries with 117 Docker registries accessible without authentication, unsecured registries allowing pull, push, and delete operations, and source code exposure where full application code is accessible by pulling images. GitGuardian's analysis of 200,000 publicly available Docker images revealed a staggering secret exposure problem. Even more alarming, 99% of images containing active secrets were pulled in 2024, demonstrating real-world exploitation. Unit 42's research identified 941 Docker registries exposed to the internet, with 117 accessible without authentication containing 2,956 repositories, 15,887 tags, and full source code and historical versions. Out of 117 unsecured registries, 80 allow pull operations to download images, 92 allow push operations to upload malicious images, and 7 allow delete operations for ransomware potential. Sysdig's analysis of over 250,000 Linux images on Docker Hub found 1,652 malicious images including cryptominers, most common, embedded secrets, second most prevalent, SSH keys and public keys for backdoor implants, API keys and authentication tokens, and database credentials. The secrets found in container images included AWS access keys, database passwords, SSH private keys, API tokens for cloud services, GitHub personal access tokens, and TLS certificates. Shadow IT includes unapproved SaaS applications like Dropbox, Google Drive, and personal cloud storage used for work. Personal devices like BYOD laptops, tablets, and smartphones accessing corporate data. Rogue cloud deployments where developers spin up AWS instances without approval. Unauthorized messaging apps like WhatsApp, Telegram, and Signal used for business communication. Unapproved IoT devices like smart speakers, wireless cameras, and fitness trackers on the corporate network. Gartner estimates that shadow IT makes up 30-40% of IT spending in large companies, and 76% of organizations surveyed experienced cyberattacks due to exploitation of unknown, unmanaged, or poorly managed assets. Shadow IT expands your attack surface because it's not protected by your security controls, it's not monitored by your security team, it's not included in your vulnerability scans, it's not patched by your IT department, and it often has weak or default credentials. And you can't secure what you don't know exists. Bring Your Own Device, BYOD, policies sound great for employee flexibility and cost savings. For security teams, they're a nightmare. BYOD expands your attack surface by introducing unmanaged endpoints like personal devices without EDR, antivirus, or encryption. Mixing personal and business use where work data is stored alongside personal apps with unknown security. Connecting from untrusted networks like public Wi-Fi and home networks with compromised routers. Installing unapproved applications with malware or excessive permissions. Lacking consistent security updates with devices running outdated operating systems. Common BYOD security issues include data leakage through personal cloud backup services, malware infections from personal app downloads, lost or stolen devices containing corporate data, family members using devices that access work systems, and lack of IT visibility and control. The 60% of small and mid-sized businesses that close within six months of a major cyberattack often have BYOD-related security gaps as contributing factors. Remote access infrastructure like VPNs and Remote Desktop Protocol, RDP, are among the most exploited attack vectors. SSL VPN appliances from vendors like Fortinet, SonicWall, Check Point, and Palo Alto are under constant attack. VPN attack vectors include authentication bypass vulnerabilities with CVEs allowing attackers to hijack active sessions, credential stuffing through brute-forcing VPN logins with leaked credentials, exploitation of unpatched vulnerabilities with critical CVEs in VPN appliances, and configuration weaknesses like default credentials, weak passwords, and lack of MFA. Real-world attacks demonstrate the risk. Check Point SSL VPN CVE-2024-24919 allowed authentication bypass for session hijacking. Fortinet SSL-VPN vulnerabilities were leveraged for lateral movement and privilege escalation. SonicWall CVE-2024-53704 allowed remote authentication bypass for SSL VPN. Once inside via VPN, attackers conduct network reconnaissance, lateral movement, and privilege escalation. RDP is worse. Sophos found that cybercriminals abused RDP in 90% of attacks they investigated. External remote services like RDP were the initial access vector in 65% of incident response cases. RDP attack vectors include exposed RDP ports with port 3389 open to the internet, weak authentication with simple passwords vulnerable to brute force, lack of MFA with no second factor for authentication, and credential reuse from compromised passwords in data breaches. In one Darktrace case, attackers compromised an organization four times in six months, each time through exposed RDP ports. The attack chain went successful RDP login, internal reconnaissance via WMI, lateral movement via PsExec, and objective achievement. The Palo Alto Unit 42 Incident Response report found RDP was the initial attack vector in 50% of ransomware deployment cases. Email infrastructure remains a primary attack vector. Your email attack surface includes mail servers like Exchange, Office 365, and Gmail with configuration weaknesses, email authentication with misconfigured SPF, DKIM, and DMARC records, phishing-susceptible users targeted through social engineering, email attachments and links as malware delivery mechanisms, and compromised accounts through credential stuffing or password reuse. Email authentication misconfiguration is particularly insidious. If your SPF, DKIM, and DMARC records are wrong or missing, attackers can spoof emails from your domain, your legitimate emails get marked as spam, and phishing emails impersonating your organization succeed. Email servers themselves are also targets. The NSA released guidance on Microsoft Exchange Server security specifically because Exchange servers are so frequently compromised. Container orchestration platforms like Kubernetes introduce massive attack surface complexity. The Kubernetes attack surface includes the Kubernetes API server with exposed or misconfigured API endpoints, container images with vulnerabilities in base images or application layers, container registries like Docker Hub, ECR, and GCR with weak access controls, pod security policies with overly permissive container configurations, network policies with insufficient micro-segmentation between pods, secrets management with hardcoded secrets or weak secret storage, and RBAC misconfigurations with overly broad service account permissions. Container security issues include containers running as root with excessive privileges, exposed Docker daemon sockets allowing container escape, vulnerable dependencies in container images, and lack of runtime security monitoring. The Docker daemon attack surface is particularly concerning. Running containers with privileged access or allowing docker.sock access can enable container escape and host compromise. Serverless computing like AWS Lambda, Azure Functions, and Google Cloud Functions promised to eliminate infrastructure management. Instead, it just created new attack surfaces. Serverless attack surface components include function code vulnerabilities like injection flaws and insecure dependencies, IAM misconfigurations with overly permissive Lambda execution roles, environment variables storing secrets as plain text, function URLs with publicly accessible endpoints without authentication, and event source mappings with untrusted input from various cloud services. The overabundance of event sources expands the attack surface. Lambda functions can be triggered by S3 events, API Gateway requests, DynamoDB streams, SNS topics, EventBridge schedules, IoT events, and dozens more. Each event source is a potential injection point. If function input validation is insufficient, attackers can manipulate event data to exploit the function. Real-world Lambda attacks include credential theft by exfiltrating IAM credentials from environment variables, lateral movement using over-permissioned roles to access other AWS resources, and data exfiltration by invoking functions to query and extract database contents. The Scarlet Eel adversary specifically targeted AWS Lambda for credential theft and lateral movement. Microservices architecture multiplies attack surface by decomposing monolithic applications into dozens or hundreds of independent services. Each microservice has its own attack surface including authentication mechanisms where each service needs to verify requests, authorization rules where each service enforces access controls, API endpoints for service-to-service communication channels, data stores where each service may have its own database, and network interfaces where each service exposes network ports. Microservices security challenges include east-west traffic vulnerabilities with service-to-service communication without encryption or authentication, authentication and authorization complexity from managing auth across 40 plus services multiplied by 3 environments equaling 240 configurations, service-to-service trust where services blindly trust internal traffic, network segmentation failures with flat networks allowing unrestricted pod-to-pod communication, and inconsistent security policies with different services having different security standards. One compromised microservice can enable lateral movement across the entire application. Without proper network segmentation and zero trust architecture, attackers pivot from service to service. How do you measure something this large, right. Attack surface measurement is complex. Attack surface metrics include the total number of assets with all discovered systems, applications, and devices, newly discovered assets found through continuous discovery, the number of exposed assets accessible from the internet, open ports and services with network services listening for connections, vulnerabilities by severity including critical, high, medium, and low CVEs, mean time to detect, MTTD, measuring how quickly new assets are discovered, mean time to remediate, MTTR, measuring how quickly vulnerabilities are fixed, shadow IT assets that are unknown or unmanaged, third-party exposure from vendor and partner access points, and attack surface change rate showing how rapidly the attack surface evolves. Academic research has produced formal attack surface measurement methods. Pratyusa Manadhata's foundational work defines attack surface as a three-tuple, System Attackability, Channel Attackability, Data Attackability. But in practice, most organizations struggle with basic attack surface visibility, let alone quantitative measurement. Your attack surface isn't static. It changes constantly. Changes happen because developers deploy new services and APIs, cloud auto-scaling spins up new instances, shadow IT appears as employees adopt unapproved tools, acquisitions bring new infrastructure into your environment, IoT devices get plugged into your network, and subdomains get created for new projects. Static, point-in-time assessments are obsolete. You need continuous asset discovery and monitoring. Continuous discovery methods include automated network scanning for regular scans to detect new devices, cloud API polling to query cloud provider APIs for resource changes, DNS monitoring to track new subdomains via Certificate Transparency logs, passive traffic analysis to observe network traffic and identify assets, integration with CMDB or ITSM to sync with configuration management databases, and cloud inventory automation using Infrastructure as Code to track deployments. Understanding your attack surface is step one. Reducing it is the goal. Attack surface reduction begins with asset elimination by removing unnecessary assets entirely. This includes decommissioning unused servers and applications, deleting abandoned subdomains and DNS records, shutting down forgotten development environments, disabling unused network services and ports, and removing unused user accounts and service identities. Access control hardening implements least privilege everywhere by enforcing multi-factor authentication, MFA, for all remote access, using role-based access control, RBAC, for cloud resources, implementing zero trust network architecture, restricting network access with micro-segmentation, and applying the principle of least privilege to IAM roles. Exposure minimization reduces what's visible to attackers by moving services behind VPNs or bastion hosts, using private IP ranges for internal services, implementing network address translation, NAT, for outbound access, restricting API endpoints to authorized sources only, and disabling unnecessary features and functionalities. Security hardening strengthens what remains by applying security patches promptly, using security configuration baselines, enabling encryption for data in transit and at rest, implementing Web Application Firewalls, WAF, for web apps, and deploying endpoint detection and response, EDR, on all devices. Monitoring and detection watch for attacks in progress by implementing real-time threat detection, enabling comprehensive logging and SIEM integration, deploying intrusion detection and prevention systems, IDS/IPS, monitoring for anomalous behavior patterns, and using threat intelligence feeds to identify known bad actors. Your attack surface is exponentially larger than you think it is. Every asset you know about probably has three you don't. Every known vulnerability probably has ten undiscovered ones. Every third-party integration probably grants more access than you realize. Every collaboration tool is leaking more data than you imagine. Every paste site contains more of your secrets than you want to admit. And attackers know this. They're not just looking at what you think you've secured. They're systematically enumerating every possible entry point. They're mining Certificate Transparency logs for forgotten subdomains. They're scanning every IP in your address space. They're reverse-engineering your mobile apps. They're buying employee credentials from data breach databases. They're compromising your vendors to reach you. They're scraping Pastebin for your leaked secrets. They're pulling your public Docker images and extracting the embedded credentials. They're accessing your misconfigured S3 buckets and exfiltrating terabytes of data. They're exploiting your exposed Jenkins instances to compromise your entire infrastructure. They're manipulating your AI agents to exfiltrate private Notion data. The asymmetry is brutal. You have to defend every single attack vector. They only need to find one that works. So what do you do. Start by accepting that you don't have complete visibility. Nobody does. But you can work toward better visibility through continuous discovery, automated asset management, and integration of security tools that help map your actual attack surface. Implement attack surface reduction aggressively. Every asset you eliminate is one less thing to defend. Every service you shut down is one less potential vulnerability. Every piece of shadow IT you discover and bring under management is one less blind spot. Every misconfigured cloud storage bucket you fix is terabytes of data no longer exposed. Every leaked secret you rotate is one less credential floating around the internet. Adopt zero trust architecture. Stop assuming that anything, internal services, microservices, authenticated users, collaboration tools, is inherently trustworthy. Verify everything. Monitor paste sites and code repositories. Your secrets are out there. Find them before attackers weaponize them. Secure your collaboration tools. Slack, Trello, Jira, Confluence, Notion, Google Drive, and Airtable are all leaking data. Lock them down. Fix your container security. Scan images for secrets. Use secret managers instead of environment variables. Secure your registries. Harden your CI/CD pipelines. Jenkins, GitHub Actions, and GitLab CI are high-value targets. Protect them. And test your assumptions with red team exercises and continuous security testing. Your attack surface is what an attacker can reach, not what you think you've secured. The attack surface problem isn't getting better. Cloud adoption, DevOps practices, remote work, IoT proliferation, supply chain complexity, collaboration tool sprawl, and container adoption are all expanding organizational attack surfaces faster than security teams can keep up. But understanding the problem is the first step toward managing it. And now you understand exactly how catastrophically large your attack surface actually is.

1 views
devansh 3 weeks ago

AI pentest scoping playbook

Disclosure: Certain sections of this content were grammatically refined/updated using AI assistance, as English is not my first language. Organizations are throwing money at "AI red teams" who run a few prompt injection tests, declare victory, and cash checks. Security consultants are repackaging traditional pentest methodologies with "AI" slapped on top, hoping nobody notices they're missing 80% of the actual attack surface. And worst of all, the people building AI systems, the ones who should know better, are scoping engagements like they're testing a CRUD app from 2015. This guide/playbook exists because the current state of AI security testing is dangerously inadequate. The attack surface is massive. The risks are novel. The methodologies are immature. And the consequences of getting it wrong are catastrophic. These are my personal views, informed by professional experience but not representative of my employer. What follows is what I wish every CISO, security lead, and AI team lead understood before they scoped their next AI security engagement. Traditional web application pentests follow predictable patterns. You scope endpoints, define authentication boundaries, exclude production databases, and unleash testers to find SQL injection and XSS. The attack surface is finite, the vulnerabilities are catalogued, and the methodologies are mature. AI systems break all of that. First, the system output is non-deterministic . You can't write a test case that says "given input X, expect output Y" because the model might generate something completely different next time. This makes reproducibility, the foundation of security testing, fundamentally harder. Second, the attack surface is layered and interconnected . You're not just testing an application. You're testing a model (which might be proprietary and black-box), a data pipeline (which might include RAG, vector stores, and real-time retrieval), integration points (APIs, plugins, browser tools), and the infrastructure underneath (cloud services, containers, orchestration). Third, novel attack classes exist that don't map to traditional vuln categories . Prompt injection isn't XSS. Data poisoning isn't SQL injection. Model extraction isn't credential theft. Jailbreaks don't fit CVE taxonomy. The OWASP Top 10 doesn't cover this. Fourth, you might not control the model . If you're using OpenAI's API or Anthropic's Claude, you can't test the training pipeline, you can't audit the weights, and you can't verify alignment. Your scope is limited to what the API exposes, which means you're testing a black box with unknown internals. Fifth, AI systems are probabilistic, data-dependent, and constantly evolving . A model that's safe today might become unsafe after fine-tuning. A RAG system that's secure with Dataset A might leak PII when Dataset B is added. An autonomous agent that behaves correctly in testing might go rogue in production when it encounters edge cases. This isn't incrementally harder than web pentesting. It's just fundamentally different. And if your scope document looks like a web app pentest with "LLM" find-and-replaced in, you're going to miss everything that matters. Before you can scope an AI security engagement, you need to understand what you're actually testing. And most organizations don't. Here's the stack: This is the thing everyone focuses on because it's the most visible. But "the model" isn't monolithic. Base model : Is it GPT-4? Claude? Llama 3? Mistral? A custom model you trained from scratch? Each has different vulnerabilities, different safety mechanisms, different failure modes. Fine-tuning : Have you fine-tuned the base model on your own data? Fine-tuning can break safety alignment. It can introduce backdoors. It can memorize training data and leak it during inference. If you've fine-tuned, that's in scope. Instruction tuning : Have you applied instruction-tuning or RLHF to shape model behavior? That's another attack surface. Adversaries can craft inputs that reverse your alignment work. Multi-model orchestration : Are you running multiple models and aggregating outputs? That introduces new failure modes. What happens when Model A says "yes" and Model B says "no"? How do you handle consensus? Can an adversary exploit disagreements? Model serving infrastructure : How is the model deployed? Is it an API? A container? Serverless functions? On-prem hardware? Each deployment model has different security characteristics. AI systems don't just run models. They feed data into models. And that data pipeline is massive attack surface. Training data : Where did the training data come from? Who curated it? How was it cleaned? Is it public? Proprietary? Scraped? Licensed? Can an adversary poison the training data? RAG (Retrieval-Augmented Generation) : Are you using RAG to ground model outputs in retrieved documents? That's adding an entire data retrieval system to your attack surface. Can an adversary inject malicious documents into your knowledge base? Can they manipulate retrieval to leak sensitive docs? Can they poison the vector embeddings? Vector databases : If you're using RAG, you're running a vector database (Pinecone, Weaviate, Chroma, etc.). That's infrastructure. That has vulnerabilities. That's in scope. Real-time data ingestion : Are you pulling live data from APIs, databases, or user uploads? Each data source is a potential injection point. Data preprocessing : How are inputs sanitized before hitting the model? Are you stripping dangerous characters? Validating formats? Filtering content? Attackers will test every preprocessing step for bypasses. Models don't exist in isolation. They're integrated into applications. And those integration points are attack surface. APIs : How do users interact with the model? REST APIs? GraphQL? WebSockets? Each has different attack vectors. Authentication and authorization : Who can access the model? How are permissions enforced? Can an adversary escalate privileges? Rate limiting : Can an adversary send 10,000 requests per second? Can they DOS your model? Can they extract the entire training dataset via repeated queries? Logging and monitoring : Are you logging inputs and outputs? If yes, are you protecting those logs from unauthorized access? Logs containing sensitive user queries are PII. Plugins and tool use : Can the model call external APIs? Execute code? Browse the web? Use tools? Every plugin is an attack vector. If your model can execute Python, an adversary will try to get it to run . Multi-turn conversations : Do users have multi-turn dialogues with the model? Multi-turn interactions create new attack surfaces because adversaries can condition the model over multiple turns, bypassing safety mechanisms gradually/ If you've built agentic systems, AI that can plan, reason, use tools, and take actions autonomously, you've added an entire new dimension of attack surface. Tool access : What tools can the agent use? File system access? Database queries? API calls? Browser automation? The more powerful the tools, the higher the risk. Planning and reasoning : How does the agent decide what actions to take? Can an adversary manipulate the planning process? Can they inject malicious goals? Memory systems : Do agents have persistent memory? Can adversaries poison that memory? Can they extract sensitive information from memory? Multi-agent coordination : Are you running multiple agents that coordinate? Can adversaries exploit coordination protocols? Can they cause agents to turn on each other or collude against safety mechanisms? Escalation paths : Can an agent escalate privileges? Can it access resources it shouldn't? Can it spawn new agents? AI systems run on infrastructure. That infrastructure has traditional security vulnerabilities that still matter. Cloud services : Are you running on AWS, Azure, GCP? Are your S3 buckets public? Are your IAM roles overly permissive? Are your API keys hardcoded in repos? Containers and orchestration : Are you using Docker, Kubernetes? Are your container images vulnerable? Are your registries exposed? Are your secrets managed properly? CI/CD pipelines : How do you deploy model updates? Can an adversary inject malicious code into your pipeline? Dependencies : Are you using vulnerable Python libraries? Compromised npm packages? Poisoned PyPI distributions? Secrets management : Where are your API keys, database credentials, and model weights stored? Are they in environment variables? Config files? Secret managers? How much of that did you include in your last AI security scope document? If the answer is "less than 60%", your scope is inadequate. And you're going to get breached by someone who understands the full attack surface. The OWASP Top 10 for LLM Applications is the closest thing we have to a standardized framework for AI security testing. If you're scoping an AI engagement and you haven't mapped every item in this list to your test plan, you're doing it wrong. Here's the 2025 version: That's your baseline. But if you stop there, you're missing half the attack surface. The OWASP LLM Top 10 is valuable, but it's not comprehensive. Here's what's missing: Safety ≠ security . But unsafe AI systems cause real harm, and that's in scope for red teaming. Alignment failures : Can the model be made to behave in ways that violate its stated values? Constitutional AI bypass : If you're using constitutional AI techniques (like Anthropic's Claude), can adversaries bypass the constitution? Bias amplification : Does the model exhibit or amplify demographic biases? This isn't just an ethics issue—it's a legal risk under GDPR, EEOC, and other regulations. Harmful content generation : Can the model be tricked into generating illegal, dangerous, or abusive content? Deceptive behavior : Can the model lie, manipulate, or deceive users? Traditional adversarial ML attacks apply to AI systems. Evasion attacks : Can adversaries craft inputs that cause misclassification? Model inversion : Can adversaries reconstruct training data from model outputs? Model extraction : Can adversaries steal model weights through repeated queries? Membership inference : Can adversaries determine if specific data was in the training set? Backdoor attacks : Does the model have hidden backdoors that trigger on specific inputs? If your AI system handles multiple modalities (text, images, audio, video), you have additional attack surface. Cross-modal injection : Attackers embed malicious instructions in images that the vision-language model follows. Image perturbation attacks : Small pixel changes invisible to humans cause model failures. Audio adversarial examples : Audio inputs crafted to cause misclassification. Typographic attacks : Adversarial text rendered as images to bypass filters. Multi-turn multimodal jailbreaks : Combining text and images across multiple turns to bypass safety. AI systems must comply with GDPR, HIPAA, CCPA, and other regulations. PII handling : Does the model process, store, or leak personally identifiable information? Right to explanation : Can users get explanations for automated decisions (GDPR Article 22)? Data retention : How long is data retained? Can users request deletion? Cross-border data transfers : Does the model send data across jurisdictions? Before you write your scope document, answer every single one of these questions. If you can't answer them, you don't understand your system well enough to scope a meaningful AI security engagement. If you can answer all these questions, you're ready to scope. If you can't, you're not. Your AI pentest/engagement scope document needs to be more detailed than a traditional pentest scope. Here's the structure: What we're testing : One-paragraph description of the AI system. Why we're testing : Business objectives (compliance, pre-launch validation, continuous assurance, incident response). Key risks : Top 3-5 risks that drive the engagement. Success criteria : What does "passing" look like? Architectural diagram : Include everything—model, data pipelines, APIs, infrastructure, third-party services. Component inventory : List every testable component with owner, version, and deployment environment. Data flows : Document how data moves through the system, from user input to model output to downstream consumers. Trust boundaries : Identify where data crosses trust boundaries (user → app, app → model, model → tools, tools → external APIs). Be exhaustive. List: For each component, specify: Map every OWASP LLM Top 10 item to specific test cases. Example: LLM01 - Prompt Injection : Include specific threat scenarios: Explicitly list what's NOT being tested: Tools : List specific tools testers will use: Techniques : Test phases : Authorization : All testing must be explicitly authorized in writing. Include names, signatures, dates. Ethical boundaries : No attempts at physical harm, financial fraud, illegal content generation (unless explicitly scoped for red teaming). Disclosure : Critical findings must be disclosed immediately via designated channel (email, Slack, phone). Standard findings can wait for formal report. Data handling : Testers must not exfiltrate user data, training data, or model weights except as explicitly authorized for demonstration purposes. All test data must be destroyed post-engagement. Legal compliance : Testing must comply with all applicable laws and regulations. If testing involves accessing user data, appropriate legal review must be completed. Technical report : Detailed findings with severity ratings, reproduction steps, evidence (screenshots, logs, payloads), and remediation guidance. Executive summary : Business-focused summary of key risks and recommendations. Threat model : Updated threat model based on findings. Retest availability : Will testers be available for retest after fixes? Timeline : Start date, end date, report delivery date, retest window. Key contacts : That's your scope document. It should be 10-20 pages. If it's shorter, you're missing things. Here's what I see organizations get wrong: Mistake 1: Scoping only the application layer, not the model You test the web app that wraps the LLM, but you don't test the LLM itself. You find XSS and broken authz, but you miss prompt injection, jailbreaks, and data extraction. Fix : Scope the full stack-app, model, data pipelines, infrastructure. Mistake 2: Treating the model as a black box when you control it If you fine-tuned the model, you have access to training data and weights. Test for data poisoning, backdoors, and alignment failures. Don't just test the API. Fix : If you control any part of the model lifecycle (training, fine-tuning, deployment), include that in scope. Mistake 3: Ignoring RAG and vector databases You test the LLM, but you don't test the document store. Adversaries inject malicious documents, manipulate retrieval, and poison embeddings—and you never saw it coming. Fix : If you're using RAG, the vector database and document ingestion pipeline are in scope. Mistake 4: Not testing multi-turn interactions You test single-shot prompts, but adversaries condition the model over 10 turns to bypass refusal mechanisms. You missed the attack entirely. Fix : Test multi-turn dialogues explicitly. Test conversation history isolation. Test memory poisoning. Mistake 5: Assuming third-party models are safe You're using OpenAI's API, so you assume it's secure. But you're passing user PII in prompts, you're not validating outputs before execution, and you haven't considered what happens if OpenAI's safety mechanisms fail. Fix : Even with third-party models, test your integration. Test input/output handling. Test failure modes. Mistake 6: Not including AI safety in security scope You test for technical vulnerabilities but ignore alignment failures, bias amplification, and harmful content generation. Then your model generates racist outputs or dangerous instructions, and you're in the news. Fix : AI safety is part of AI security. Include alignment testing, bias audits, and harm reduction validation. Mistake 7: Underestimating autonomous agent risks You test the LLM, but your agent can execute code, call APIs, and access databases. An adversary hijacks the agent, and it deletes production data or exfiltrates secrets. Fix : Autonomous agents are their own attack surface. Test tool permissions, privilege escalation, and agent behavior boundaries. Mistake 8: Not planning for continuous testing You do one pentest before launch, then never test again. But you're fine-tuning weekly, adding new plugins monthly, and updating RAG documents daily. Your attack surface is constantly changing. Fix : Scope for continuous red teaming, not one-time assessment. Organizations hire expensive consultants to run a few prompt injection tests, declare the system "secure," and ship to production. Then they get breached six months later when someone figures out a multi-turn jailbreak or poisons the RAG document store. The problem isn't that the testers are bad. The problem is that the scopes are inadequate . You can't find what you're not looking for. If your scope doesn't include RAG poisoning, testers won't test for it. If your scope doesn't include membership inference, testers won't test for it. If your scope doesn't include agent privilege escalation, testers won't test for it. And attackers will. The asymmetry is brutal: you have to defend every attack vector. Attackers only need to find one that works. So when you scope your next AI security engagement, ask yourself: "If I were attacking this system, what would I target?" Then make sure every single one of those things is in your scope document. Because if it's not in scope, it's not getting tested. And if it's not getting tested, it's going to get exploited. Traditional pentests are point-in-time assessments. You test, you report, you fix, you're done. That doesn't work for AI systems. AI systems evolve constantly: Every change introduces new attack surface. And if you're only testing once a year, you're accumulating risk for 364 days. You need continuous red teaming . Here's how to build it: Use tools like Promptfoo, Garak, and PyRIT to run automated adversarial testing on every model update. Integrate tests into CI/CD pipelines so every deployment is validated before production. Set up continuous monitoring for: Quarterly or bi-annually, bring in expert red teams for comprehensive testing beyond what automation can catch. Focus deep assessments on: Train your own security team on AI-specific attack techniques. Develop internal playbooks for: Every quarter, revisit your threat model: Update your testing roadmap based on evolving threats. Scoping AI security engagements is harder than traditional pentests because the attack surface is larger, the risks are novel, and the methodologies are still maturing. But it's not impossible. You need to: If you do this right, you'll find vulnerabilities before attackers do. If you do it wrong, you'll end up in the news explaining why your AI leaked training data, generated harmful content, or got hijacked by adversaries. First, the system output is non-deterministic . You can't write a test case that says "given input X, expect output Y" because the model might generate something completely different next time. This makes reproducibility, the foundation of security testing, fundamentally harder. Second, the attack surface is layered and interconnected . You're not just testing an application. You're testing a model (which might be proprietary and black-box), a data pipeline (which might include RAG, vector stores, and real-time retrieval), integration points (APIs, plugins, browser tools), and the infrastructure underneath (cloud services, containers, orchestration). Third, novel attack classes exist that don't map to traditional vuln categories . Prompt injection isn't XSS. Data poisoning isn't SQL injection. Model extraction isn't credential theft. Jailbreaks don't fit CVE taxonomy. The OWASP Top 10 doesn't cover this. Fourth, you might not control the model . If you're using OpenAI's API or Anthropic's Claude, you can't test the training pipeline, you can't audit the weights, and you can't verify alignment. Your scope is limited to what the API exposes, which means you're testing a black box with unknown internals. Fifth, AI systems are probabilistic, data-dependent, and constantly evolving . A model that's safe today might become unsafe after fine-tuning. A RAG system that's secure with Dataset A might leak PII when Dataset B is added. An autonomous agent that behaves correctly in testing might go rogue in production when it encounters edge cases. Base model : Is it GPT-4? Claude? Llama 3? Mistral? A custom model you trained from scratch? Each has different vulnerabilities, different safety mechanisms, different failure modes. Fine-tuning : Have you fine-tuned the base model on your own data? Fine-tuning can break safety alignment. It can introduce backdoors. It can memorize training data and leak it during inference. If you've fine-tuned, that's in scope. Instruction tuning : Have you applied instruction-tuning or RLHF to shape model behavior? That's another attack surface. Adversaries can craft inputs that reverse your alignment work. Multi-model orchestration : Are you running multiple models and aggregating outputs? That introduces new failure modes. What happens when Model A says "yes" and Model B says "no"? How do you handle consensus? Can an adversary exploit disagreements? Model serving infrastructure : How is the model deployed? Is it an API? A container? Serverless functions? On-prem hardware? Each deployment model has different security characteristics. Training data : Where did the training data come from? Who curated it? How was it cleaned? Is it public? Proprietary? Scraped? Licensed? Can an adversary poison the training data? RAG (Retrieval-Augmented Generation) : Are you using RAG to ground model outputs in retrieved documents? That's adding an entire data retrieval system to your attack surface. Can an adversary inject malicious documents into your knowledge base? Can they manipulate retrieval to leak sensitive docs? Can they poison the vector embeddings? Vector databases : If you're using RAG, you're running a vector database (Pinecone, Weaviate, Chroma, etc.). That's infrastructure. That has vulnerabilities. That's in scope. Real-time data ingestion : Are you pulling live data from APIs, databases, or user uploads? Each data source is a potential injection point. Data preprocessing : How are inputs sanitized before hitting the model? Are you stripping dangerous characters? Validating formats? Filtering content? Attackers will test every preprocessing step for bypasses. APIs : How do users interact with the model? REST APIs? GraphQL? WebSockets? Each has different attack vectors. Authentication and authorization : Who can access the model? How are permissions enforced? Can an adversary escalate privileges? Rate limiting : Can an adversary send 10,000 requests per second? Can they DOS your model? Can they extract the entire training dataset via repeated queries? Logging and monitoring : Are you logging inputs and outputs? If yes, are you protecting those logs from unauthorized access? Logs containing sensitive user queries are PII. Plugins and tool use : Can the model call external APIs? Execute code? Browse the web? Use tools? Every plugin is an attack vector. If your model can execute Python, an adversary will try to get it to run . Multi-turn conversations : Do users have multi-turn dialogues with the model? Multi-turn interactions create new attack surfaces because adversaries can condition the model over multiple turns, bypassing safety mechanisms gradually/ Tool access : What tools can the agent use? File system access? Database queries? API calls? Browser automation? The more powerful the tools, the higher the risk. Planning and reasoning : How does the agent decide what actions to take? Can an adversary manipulate the planning process? Can they inject malicious goals? Memory systems : Do agents have persistent memory? Can adversaries poison that memory? Can they extract sensitive information from memory? Multi-agent coordination : Are you running multiple agents that coordinate? Can adversaries exploit coordination protocols? Can they cause agents to turn on each other or collude against safety mechanisms? Escalation paths : Can an agent escalate privileges? Can it access resources it shouldn't? Can it spawn new agents? Cloud services : Are you running on AWS, Azure, GCP? Are your S3 buckets public? Are your IAM roles overly permissive? Are your API keys hardcoded in repos? Containers and orchestration : Are you using Docker, Kubernetes? Are your container images vulnerable? Are your registries exposed? Are your secrets managed properly? CI/CD pipelines : How do you deploy model updates? Can an adversary inject malicious code into your pipeline? Dependencies : Are you using vulnerable Python libraries? Compromised npm packages? Poisoned PyPI distributions? Secrets management : Where are your API keys, database credentials, and model weights stored? Are they in environment variables? Config files? Secret managers? Alignment failures : Can the model be made to behave in ways that violate its stated values? Constitutional AI bypass : If you're using constitutional AI techniques (like Anthropic's Claude), can adversaries bypass the constitution? Bias amplification : Does the model exhibit or amplify demographic biases? This isn't just an ethics issue—it's a legal risk under GDPR, EEOC, and other regulations. Harmful content generation : Can the model be tricked into generating illegal, dangerous, or abusive content? Deceptive behavior : Can the model lie, manipulate, or deceive users? Evasion attacks : Can adversaries craft inputs that cause misclassification? Model inversion : Can adversaries reconstruct training data from model outputs? Model extraction : Can adversaries steal model weights through repeated queries? Membership inference : Can adversaries determine if specific data was in the training set? Backdoor attacks : Does the model have hidden backdoors that trigger on specific inputs? Cross-modal injection : Attackers embed malicious instructions in images that the vision-language model follows. Image perturbation attacks : Small pixel changes invisible to humans cause model failures. Audio adversarial examples : Audio inputs crafted to cause misclassification. Typographic attacks : Adversarial text rendered as images to bypass filters. Multi-turn multimodal jailbreaks : Combining text and images across multiple turns to bypass safety. PII handling : Does the model process, store, or leak personally identifiable information? Right to explanation : Can users get explanations for automated decisions (GDPR Article 22)? Data retention : How long is data retained? Can users request deletion? Cross-border data transfers : Does the model send data across jurisdictions? What base model are you using (GPT-4, Claude, Llama, Mistral, custom)? Is the model proprietary (OpenAI API) or open-source? Have you fine-tuned the base model? On what data? Have you applied instruction tuning, RLHF, or other alignment techniques? How is the model deployed (API, on-prem, container, serverless)? Do you have access to model weights? Can testers query the model directly, or only through your application? Are there rate limits? What are they? What's the model's context window size? Does the model support function calling or tool use? Is the model multimodal (vision, audio, text)? Are you using multiple models in ensemble or orchestration? Where did training data come from (public, proprietary, scraped, licensed)? Was training data curated or filtered? How? Is training data in scope for poisoning tests? Are you using RAG (Retrieval-Augmented Generation)? If RAG: What's the document store (vector DB, traditional DB, file system)? If RAG: How are documents ingested? Who controls ingestion? If RAG: Can testers inject malicious documents? If RAG: How is retrieval indexed and searched? Do you pull real-time data from external sources (APIs, databases)? How is input data preprocessed and sanitized? Is user conversation history stored? Where? For how long? Can users access other users' data? How do users interact with the model (web app, API, chat interface, mobile app)? What authentication mechanisms are used (OAuth, API keys, session tokens)? What authorization model is used (RBAC, ABAC, none)? Are there different user roles with different permissions? Is there rate limiting? At what levels (user, IP, API key)? Are inputs and outputs logged? Where? Who has access to logs? Are logs encrypted at rest and in transit? How are errors handled? Are error messages exposed to users? Are there webhooks or callbacks that the model can trigger? Can the model call external APIs? Which ones? Can the model execute code? In what environment? Can the model browse the web? Can the model read/write files? Can the model access databases? What permissions do plugins have? How are plugin outputs validated before use? Can users add custom plugins? Are plugin interactions logged? Do you have autonomous agents that plan and execute multi-step tasks? What tools can agents use? Can agents spawn other agents? Do agents have persistent memory? Where is it stored? How are agent goals and constraints defined? Can agents access sensitive resources (DBs, APIs, filesystems)? Can agents escalate privileges? Are there kill-switches or circuit breakers for agents? How is agent behavior monitored? What cloud provider(s) are you using (AWS, Azure, GCP, on-prem)? Are you using containers (Docker)? Orchestration (Kubernetes)? Where are model weights stored? Who has access? Where are API keys and secrets stored? Are secrets in environment variables, config files, or secret managers? How are dependencies managed (pip, npm, Docker images)? Have you scanned dependencies for known vulnerabilities? How are model updates deployed? What's the CI/CD pipeline? Who can deploy model updates? Are there staging environments separate from production? What safety mechanisms are in place (content filters, refusal training, constitutional AI)? Have you red-teamed for jailbreaks? Have you tested for bias across demographic groups? Have you tested for harmful content generation? Do you have human-in-the-loop review for sensitive outputs? What's your incident response plan if the model behaves unsafely? Can testers attempt to jailbreak the model? Can testers attempt prompt injection? Can testers attempt data extraction (training data, PII)? Can testers attempt model extraction or inversion? Can testers attempt DoS or resource exhaustion? Can testers poison training data (if applicable)? Can testers test multi-turn conversations? Can testers test RAG document injection? Can testers test plugin abuse? Can testers test agent privilege escalation? Are there any topics, content types, or test methods that are forbidden? What's the escalation process if critical issues are found during testing? What regulations apply (GDPR, HIPAA, CCPA, FTC, EU AI Act)? Do you process PII? What types? Do you have data processing agreements with model providers? Do you have the legal right to test this system? Are there export control restrictions on the model or data? What are the disclosure requirements for findings? What's the confidentiality agreement for testers? Model(s) : Exact model names, versions, access methods APIs : All endpoints with authentication requirements Data stores : Databases, vector stores, file systems, caches Integrations : Every third-party service, plugin, tool Infrastructure : Cloud accounts, containers, orchestration Applications : Web apps, mobile apps, admin panels Access credentials testers will use Environments (dev, staging, prod) that are in scope Testing windows (if limited) Rate limits or usage restrictions Test direct instruction override Test indirect injection via RAG documents Test multi-turn conditioning Test system prompt extraction Test jailbreak techniques (roleplay, hypotheticals, encoding) Test cross-turn memory poisoning "Can an attacker leak other users' conversation history?" "Can an attacker extract training data containing PII?" "Can an attacker bypass content filters to generate harmful instructions?" Production environments (if testing only staging) Physical security Social engineering of employees Third-party SaaS providers we don't control Specific attack types (if any are prohibited) Manual testing Promptfoo for LLM fuzzing Garak for red teaming PyRIT for adversarial prompting ART (Adversarial Robustness Toolbox) for ML attacks Custom scripts for specific attack vectors Traditional tools (Burp Suite, Caido, Nuclei) for infrastructure Prompt injection testing Jailbreak attempts Data extraction attacks Model inversion Membership inference Evasion attacks RAG poisoning Plugin abuse Agent privilege escalation Infrastructure scanning Reconnaissance and threat modeling Automated vulnerability scanning Manual testing of high-risk areas Exploitation and impact validation Reporting and remediation guidance Engagement lead (security team) Technical point of contact (AI team) Escalation contact (for critical findings) Legal contact (for questions on scope) Models get fine-tuned RAG document stores get updated New plugins get added Agents gain new capabilities Infrastructure changes Prompt injection attempts Jailbreak successes Data extraction queries Unusual tool usage patterns Agent behavior anomalies Novel attack vectors that tools don't cover Complex multi-step exploitation chains Social engineering combined with technical attacks Agent hijacking and multi-agent exploits Prompt injection testing Jailbreak methodology RAG poisoning Agent security testing What new attacks have been published? What new capabilities have you added? What new integrations are in place? What new risks does the threat landscape present? Understand the full stack : model, data pipelines, application, infrastructure, agents, everything. Map every attack vector : OWASP LLM Top 10 is your baseline, not your ceiling. Answer scoping questions (mentioned above) : If you can't answer them, you don't understand your system. Write detailed scope documents : 10-20 pages, not 2 pages. Use the right tools : Promptfoo, Garak, ART, LIME, SHAP—not just Burp Suite. Test continuously : Not once, but ongoing. Avoid common mistakes : Don't ignore RAG, don't underestimate agents, don't skip AI safety.

0 views
devansh 3 weeks ago

On AI Slop vs OSS Security

Disclosure: Certain sections of this content were grammatically refined/updated using AI assistance, as English is not my first language. Quite ironic, I know, given the subject being discussed. I have now spent almost a decade in the bug bounty industry, started out as a bug hunter (who initially used to submit reports with minimal impact, low-hanging fruits like RXSS, SQLi, CSRF, etc.), then moved on to complex chains involving OAuth, SAML, parser bugs, supply chain security issues, etc., and then became a vulnerability triager for HackerOne, where I have triaged/reviewed thousands of vulnerability submissions. I have now almost developed an instinct that tells me if a report is BS or a valid security concern just by looking at it. I have been at HackerOne for the last 5 years (Nov 2020 - Present), currently as a team lead, overseeing technical services with a focus on triage operations. One decade of working on both sides, first as a bug hunter, and then on the receiving side reviewing bug submissions, has given me a unique vantage point on how the industry is fracturing under the weight of AI-generated bug reports (sometimes valid submissions, but most of the time, the issues are just plain BS). I have seen cases where it was almost impossible to determine whether a report was a hallucination or a real finding. Even my instincts and a decade of experience failed me, and this is honestly frustrating, not so much for me, because as part of the triage team, it is not my responsibility to fix vulnerabilities, but I do sympathize with maintainers of OSS projects whose inboxes are drowning. Bug bounty platforms have already started taking this problem seriously, as more and more OSS projects are complaining about it. This is my personal writing space, so naturally, these are my personal views and observations. These views might be a byproduct of my professional experience gained at HackerOne, but in no way are they representative of my employer. I am sure HackerOne, as an organization, has its own perspectives, strategies, and positions on these issues. My analysis here just reflects my own thinking about the systemic problems I see and potential solutions(?). There are fundamental issues with how AI has infiltrated vulnerability reporting, and they mirror the social dynamics that plague any feedback system. First, the typical AI-powered reporter, especially one just pasting GPT output into a submission form, neither knows enough about the actual codebase being examined nor understands the security implications well enough to provide insight that projects need. The AI doesn't read code; it pattern-matches. It sees functions that look similar to vulnerable patterns and invents scenarios where they might be exploited, regardless of whether those scenarios are even possible in the actual implementation. Second, some actors with misaligned incentives interpret high submission volume as achievement. By flooding bug bounty programs with AI-generated reports, they feel productive and entrepreneurial. Some genuinely believe the AI has found something real. Others know it's questionable but figure they'll let the maintainers sort it out. The incentive is to submit as many reports as possible and see what sticks, because even a 5% hit rate on a hundred submissions is better than the effort of manually verifying five findings. The result? Daniel Stenberg, who maintains curl , now sees about 20% of all security submissions as AI-generated slop, while the rate of genuine vulnerabilities has dropped to approximately 5%. Think about that ratio. For every real vulnerability, there are now four fake ones. And every fake one consumes hours of expert time to disprove. A security report lands in your inbox. It claims there's a buffer overflow in a specific function. The report is well-formatted, includes CVE-style nomenclature, and uses appropriate technical language. As a responsible maintainer, you can't just dismiss it. You alert your security team, volunteers, by the way, who have day jobs and families and maybe three hours a week for this work. Three people read the report. One person tries to reproduce the issue using the steps provided. They can't, because the steps reference test cases that don't exist. Another person examines the source code. The function mentioned in the report doesn't exist in that form. A third person checks whether there's any similar functionality that might be vulnerable in the way described. There isn't. After an hour and a half of combined effort across three people, that's 4.5 person-hours—you've confirmed what you suspected: this report is garbage. Probably AI-generated garbage, based on the telltale signs of hallucinated function names and impossible attack vectors. You close the report. You don't get those hours back. And tomorrow, two more reports just like it will arrive. The curl project has seven people on its security team . They collaborate on every submission, with three to four members typically engaging with each report. In early July 2025, they were receiving approximately two security reports per week. The math is brutal. If you have three hours per week to contribute to an open source project you love, and a single false report consumes all of it, you've contributed nothing that week except proving someone's AI hallucinated a vulnerability. The emotional toll compounds exponentially. Stenberg describes it as "mind-numbing stupidities" that the team must process. It's not just frustration, it's the specific demoralization that comes from having your expertise and goodwill systematically exploited by people who couldn't be bothered to verify their submissions before wasting your time. According to Intel's annual open source community survey , 45% of respondents identified maintainer burnout as their top challenge. The Tidelift State of the Open Source Maintainer Survey is even more stark: 58% of maintainers have either quit their projects entirely (22%) or seriously considered quitting (36%). Why are they quitting? The top reason, cited by 54% of maintainers, is that other things in their life and work took priority over open source contributions. Over half (51%) reported losing interest in the work. And 44% explicitly identified experiencing burnout. But here's the gut punch: the percentage of maintainers who said they weren't getting paid enough to make maintenance work worthwhile rose from 32% to 38% between survey periods. These are people maintaining infrastructure that powers billions of dollars of commercial activity, and they're getting nothing. Or maybe they get $500 a year from GitHub Sponsors while companies make millions off their work. The maintenance work itself is rarely rewarding. You're not building exciting new features. You're addressing technical debt, responding to user demands, managing security issues, and now—increasingly—sorting through AI-generated garbage to find the occasional legitimate report. It's like being a security guard who has to investigate every single alarm, knowing that 95% of them are false, but unable to ignore any because that one real threat could be catastrophic. When you're volunteering out of love in a market society, you're setting yourself up to be exploited. And the exploitation is getting worse. Toxic communities, hyper-responsibility for critical infrastructure, and now the weaponization of AI to automate the creation of work for maintainers—it all adds up to an unsustainable situation. One Kubernetes contributor put it simply: "If your maintainers are burned out, they can't be protecting the code base like they're going to need to be." This transforms maintainer wellbeing from a human resources concern into a security imperative. Burned-out maintainers miss things. They make mistakes. They eventually quit, leaving projects unmaintained or understaffed. A typical AI slop report will reference function names that don't exist in the codebase. The AI has seen similar function names in its training data and invents plausible sounding variations. It will describe memory operations that would indeed be problematic if they existed as described, but which bear no relationship to how the code actually works. One report to curl claimed an HTTP/3 vulnerability and included fake function calls and behaviors that appeared nowhere in the actual codebase. Stenberg has publicly shared a list of AI-generated security submissions received through HackerOne , and they all follow similar patterns, professional formatting, appropriate jargon, and completely fabricated technical details. The sophistication varies. Some reports are obviously generated by someone who just pasted a repository URL into ChatGPT and asked it to find vulnerabilities. Others show more effort—the submitter may have fed actual code snippets to the AI and then submitted its analysis without verification. Both are equally useless to maintainers, but the latter takes longer to disprove because the code snippets are real even if the vulnerability analysis is hallucinated. Here's why language models fail so catastrophically at this task: they're designed to be helpful and provide positive responses. When you prompt an LLM to generate a vulnerability report, it will generate one regardless of whether a vulnerability exists. The model has no concept of truth—only of plausibility. It assembles technical terminology into patterns that resemble security reports it has seen during training, but it cannot verify whether the specific claims it's making are accurate. This is the fundamental problem: AI can generate the form of security research without the substance. While AI slop floods individual project inboxes, the broader CVE infrastructure faces its own existential crisis . And these crises compound each other in dangerous ways. In April 2025, MITRE Corporation announced that its contract to maintain the Common Vulnerabilities and Exposures program would expire. The Department of Homeland Security failed to renew the long-term contract, creating a funding lapse that affects everything: national vulnerability databases, advisories, tool vendors, and incident response operations. The National Vulnerability Database experienced catastrophic problems throughout 2024. CVE submissions jumped 32% while creating massive processing delays. By March 2025, NVD had analyzed fewer than 300 CVEs, leaving more than 30,000 vulnerabilities backlogged. Approximately 42% of CVEs lack essential metadata like severity scores and product information. Now layer AI slop onto this already-stressed system. Invalid CVEs are being assigned at scale. A 2023 analysis by former insiders suggested that only around 20% of CVEs were valid, with the remainder being duplicates, invalid, or inflated. The issues include multiple CVEs being assigned for the same bug, CNAs siding with reporters over project developers even when there's no genuine dispute, and reporters receiving CVEs based on test cases rather than actual distinct vulnerabilities. The result is that the vulnerability tracking system everyone relies on is becoming less trustworthy exactly when we need it most. Security teams can't rely on CVE assignments to prioritize their work. Developers don't trust vulnerability scanners because false positive rates are through the roof. The signal-to-noise ratio has deteriorated so badly that the entire system risks becoming useless. Banning submitters doesn't work at scale. You can ban an account, but creating new accounts is trivial. HackerOne implements reputation scoring where points are gained or lost based on report validity, but this hasn't stemmed the tide because the cost of creating throwaway accounts is essentially zero. Asking people to "please verify before submitting" doesn't work. The incentive structure rewards volume, and people either genuinely believe their AI-generated reports are valid or don't care enough to verify. Polite requests assume good faith, but much of the slop comes from actors who have no stake in the community norms. Trying to educate submitters about how AI works doesn't scale. For every person you educate, ten new ones appear with fresh GPT accounts. The problem isn't knowledge—it's incentives. Simply closing inboxes or shutting down bug bounty programs "works" in the sense that it stops the slop, but it also stops legitimate security research. Several projects have done this, and now they're less secure because they've lost a channel for responsible disclosure. None of the easy answers work because this isn't an easy problem. Disclosure Requirements represent the first line of defense. Both curl and Django now require submitters to disclose whether AI was used in generating reports. Curl's approach is particularly direct: disclose AI usage upfront and ensure complete accuracy before submission. If AI usage is disclosed, expect extensive follow-up questions demanding proof that the bug is genuine before the team invests time in verification. This works psychologically. It forces submitters to acknowledge they're using AI, which makes them more conscious of their responsibility to verify. It also gives maintainers grounds to reject slop immediately if AI usage was undisclosed but becomes obvious during review. Django goes further with a section titled "Note for AI Tools" that directly addresses language models themselves, reiterating that the project expects no hallucinated content, no fictitious vulnerabilities, and a requirement to independently verify that reports describe reproducible security issues. Proof-of-Concept Requirements raise the bar significantly. Requiring technical evidence such as screencasts showing reproducibility, integration or unit tests demonstrating the fault, or complete reproduction steps with logs and source code makes it much harder to submit slop. AI can generate a description of a vulnerability, but it cannot generate working exploit code for a vulnerability that doesn't exist. Requiring proof forces the submitter to actually verify their claim. If they can't reproduce it, they can't prove it, and you don't waste time investigating. Projects are choosing to make it harder to submit in order to filter out the garbage, betting that real researchers will clear the bar while slop submitters won't. Reputation and Trust Systems offer a social mechanism for filtering. Only users with a history of validated submissions get unrestricted reporting privileges or monetary bounties. New reporters could be required to have established community members vouch for them, creating a web-of-trust model. This mirrors how the world worked before bug bounty platforms commodified security research. You built reputation over time through consistent, high-quality contributions. The downside is that it makes it harder for new researchers to enter the field, and it risks creating an insider club. But the upside is that it filters out low-effort actors who won't invest in building reputation. Economic Friction fundamentally alters the incentive structure. Charge a nominal refundable fee—say $50—for each submission from new or unproven users. If the report is valid, they get the fee back plus the bounty. If it's invalid, you keep the fee. This immediately makes mass AI submission uneconomical. If someone's submitting 50 AI-generated reports hoping one sticks, that's now $2,500 at risk. But for a legitimate researcher submitting one carefully verified finding, $50 is a trivial barrier that gets refunded anyway. Some projects are considering dropping monetary rewards entirely. The logic is that if there's no money involved, there's no incentive for speculative submissions. But this risks losing legitimate researchers who rely on bounties as income. It's a scorched earth approach that solves the slop problem by eliminating the entire ecosystem. AI-Assisted Triage represents fighting fire with fire. Use AI tools trained specifically to identify AI-generated slop and flag it for immediate rejection. HackerOne's Hai Triage system embodies this approach, using AI agents to cut through noise before human analysts validate findings. The risk is obvious: what if your AI filter rejects legitimate reports? What if it's biased against certain communication styles or methodologies? You've just automated discrimination. But the counterargument is that human maintainers are already overwhelmed, and imperfect filtering is better than drowning. The key is transparency and appeals. If an AI filter rejects a report, there should be a clear mechanism for the submitter to contest the decision and get human review. Transparency and Public Accountability leverage community norms. Curl recently formalized that all submitted security reports will be made public once reviewed and deemed non-sensitive. This means that fabricated or misleading reports won't just be rejected, they'll be exposed to public scrutiny. This works as both deterrent and educational tool. If you know your slop report will be publicly documented with your name attached, you might think twice. And when other researchers see examples of what doesn't constitute a valid report, they learn what standards they need to meet. The downside is that public shaming can be toxic and might discourage good-faith submissions from inexperienced researchers. Projects implementing this approach need to be careful about tone and focus on the technical content rather than attacking submitters personally. Every hour spent evaluating slop reports is an hour not spent on features, documentation, or actual security improvements. And maintainers are already working for free, maintaining infrastructure that generates billions in commercial value. When 38% of maintainers cite not getting paid enough as a reason for quitting, and 97% of open source maintainers are unpaid despite massive commercial exploitation of their work , the system is already broken. AI slop is just the latest exploitation vector. It's the most visible one right now, but it's not the root cause. The root cause is that we've built a global technology infrastructure on the volunteer labor of people who get nothing in return except burnout and harassment. So what does sustainability actually look like? First, it looks like money. Real money. Not GitHub Sponsors donations that average $500 a year. Not swag and conference tickets. Actual salaries commensurate with the value being created. Companies that build products on open source infrastructure need to fund the maintainers of that infrastructure. This could happen through direct employment, foundation grants, or the Open Source Pledge model where companies commit percentages of revenue. Second, it looks like better tooling and automation that genuinely reduces workload rather than creating new forms of work. Automated dependency management, continuous security scanning integrated into development workflows, and sophisticated triage assistance that actually works. The goal is to make maintenance less time-consuming so burnout becomes less likely. Third, it looks like shared workload and team building. No single volunteer should be a single point of failure. Building teams with checks and balances where members keep each other from taking on too much creates sustainability. Finding additional contributors willing to share the burden rather than expecting heroic individual effort acknowledges that most people have limited time available for unpaid work. Fourth, it looks like culture change. Fostering empathy in interactions, starting communications with gratitude even when rejecting contributions, and publicly acknowledging the critical work maintainers perform reduces emotional toll. Demonstrating clear processes for handling security issues gives confidence rather than trying to hide problems. Fifth, it looks like advocacy and policy at organizational and governmental levels. Recognition that maintainer burnout represents existential threat to technology infrastructure . Development of regulations requiring companies benefiting from open source to contribute resources. Establishment of security standards that account for the realities of volunteer-run projects. Without addressing these fundamentals, no amount of technical sophistication will prevent collapse. The CVE slop crisis is just the beginning. We're entering an arms race between AI-assisted attackers or abusers and AI-assisted defenders, and nobody knows how it ends. HackerOne's research indicates that 70% of security researchers now use AI tools in their workflow. AI-powered testing is becoming the industry standard. The emergence of fully autonomous hackbots—AI systems that submitted over 560 valid reports in the first half of 2025—signals both opportunity and threat. The divergence will be between researchers who use AI as a tool to enhance genuinely skilled work versus those who use it to automate low-effort spam. The former represents the promise of democratizing security research and scaling our ability to find vulnerabilities. The latter represents the threat of making the signal-to-noise problem completely unmanageable. The challenge is developing mechanisms that encourage the first group while defending against the second. This probably means moving toward more exclusive models. Invite-only programs. Dramatically higher standards for participation. Reputation systems that take years to build. New models for coordinated vulnerability disclosure that assume AI-assisted research as the baseline and require proof beyond "here's what the AI told me." It might mean the end of open bug bounty programs as we know them. Maybe that's necessary. Maybe the experiment of "anyone can submit anything" was only viable when the cost of submitting was high enough to ensure some minimum quality. Now that AI has reduced that cost to near-zero, the experiment might fail soon if things don't improve. So, net-net, here's where we are: When it comes to vulnerability reports, what matters is who submits them and whether they've actually verified their claims. Accepting reports from everyone indiscriminately is backfiring catastrophically because projects are latching onto submissions that sound plausible while ignoring the cumulative evidence that most are noise. You want to receive reports from someone who has actually verified their claims, understands the architecture of what they're reporting on, and isn't trying to game the bounty system or offload verification work onto maintainers. Such people exist, but they're becoming harder to find amidst the deluge of AI-generated content. That's why projects have to be selective about which reports they investigate and which submitters they trust. Remember: not all vulnerability reports are legitimate. Not all feedback is worthwhile. It matters who is doing the reporting and what their incentives are. The CVE slop crisis shows the fragility of open source security. Volunteer maintainers, already operating at burnout levels, face an explosion of AI-generated false reports that consume their limited time and emotional energy. The systems designed to track and manage vulnerabilities struggle under dual burden of structural underfunding and slop inundation. The path forward requires holistic solutions combining technical filtering with fundamental changes to how we support and compensate open source labor. AI can be part of the solution through better triage, but it cannot substitute for adequate resources, reasonable workloads, and human judgment. Ultimately, the sustainability of open source security depends on recognizing that people who maintain critical infrastructure deserve more than exploitation. They deserve compensation, support, reasonable expectations, and protection from abuse. Without addressing these fundamentals, no amount of technical sophistication will prevent the slow collapse of the collaborative model that has produced so much of the digital infrastructure modern life depends on. The CVE slop crisis isn't merely about bad vulnerability reports. It's about whether we'll choose to sustain the human foundation of technological progress, or whether we'll let it burn out under the weight of automated exploitation. That's the choice we're facing. And right now, we're choosing wrong.

0 views
devansh 3 weeks ago

Art of Learning

Becoming an exceptional learner, someone who absorbs, retains, and applies knowledge at an elite level, requires more than just hard grind. It depends on multiple factors, such as sustaining focus and discipline over time, which can be surprisingly tough. Our ability to control our attention, self-regulate, and push through challenging learning material isn’t infinite; it is, in fact, a finite resource. It’s exactly like a muscle that tires after heavy use. When you overexert on one task, like forcing yourself to stay focused during a draining meeting, your capacity to tackle and retain new information afterward takes a hit. This fatigue can lead you to skim over key details or give up too soon on complex learning material, stalling your progress toward absolute mastery of the topic. The good news is that this mental muscle can be strengthened with practice, but only if you manage it wisely to avoid burnout. Timing matters when it comes to learning. Your mental energy fluctuates throughout the day, and trying to learn something new when you’re already drained is a recipe for frustration. If you’re sharpest in the morning, carve out that time for diving into tough topics, like learning a new concept or analyzing a complex problem. If you’re a night owl, save your deep work for the evening. To figure out your peak hours, keep a simple log: when do you feel most alert and engaged? For me, mornings are when I tackle the hardest learning material, while afternoons are better for lighter study sessions or organizing notes. By aligning your study time with your natural energy highs, you maximize retention and avoid the mental fog that comes from pushing through exhaustion. Starting small is another key to building lasting habits. Jumping into marathon study sessions right away can overwhelm you, leading to early quitting. Instead, begin with short, focused bursts—say, 25 minutes on a single idea or skill. Over time, gradually increase the intensity, like moving from basic concepts to solving complex problems. This steady approach prevents the kind of overload that makes you feel defeated. People who succeed at self-discipline don’t try to do everything at once; they build their capacity bit by bit, much like training for a race. By pacing yourself, you develop the stamina to handle more complex problems, turning learning into a sustainable practice rather than a sprint. Distractions are the enemy of deep learning. When you’re juggling notifications, emails, or side tasks, your mental bandwidth gets sapped before you even start. This split attention makes it harder to grasp new ideas or spot connections between them. To counter this, create a distraction-free zone: silence your phone, find a quiet space, or use apps to block social media during learning time. Focus on one topic at a time, whether it’s a new concept, an idea, a research paper or an article. By giving your full attention to a single task, you’ll get deeper insights and retain more. This undivided focus is what separates those who skim the surface from those who achieve true mastery. Reflection is a powerful tool for sharpening your learning. After each study session, take a moment to write down what you understood, what confused you, and how it connects to what you already know. If something felt unclear, note why and plan to revisit it. This habit turns mistakes into stepping stones, helping you recognize patterns you might have missed in the moment. For example, if you rushed through a concept and later realized you misunderstood it, reflecting helps you catch that error and adjust. Over time, this builds an intuitive sense for the material, so you can spot key ideas faster and with less effort. Without reflection, you risk repeating the same missteps, slowing your progress toward 10x learning. Finally, don’t let quick wins fool you into overconfidence. When you grasp something faster than expected, it’s tempting to charge ahead, but that can lead to overlooking gaps in your understanding. Take a break—whether a few minutes or a day, then revisit your work with fresh eyes. Ask yourself: Did I really get it, or did I miss something? Search for alternative perspectives or critiques to challenge your assumptions. Regular breaks also help you recover mental energy, preventing the kind of tunnel vision that comes from pushing too hard. To think more clearly, adopt a statistical mindset: if most ideas have hidden complexities, assume yours does too until you’ve proven otherwise. This keeps you grounded and ensures your learning is solid, not superficial.

0 views
devansh 1 months ago

On Higher Order thinking

Learning via reading is easy. But can you apply it in real-life scenarios? If not, it’s the same as if you never learned it in the first place. The beginner’s rut is real, as we talked about in On Learning – Avoiding Mediocrity . A good security researcher must also be a good security architect or engineer—or at least have that mindset. This will set you apart by a thousand miles. Let’s say you know about OWASP Top 10 and the basics of web app security (session management, user management, a little bit of DNS/HTTP security). What should be the next step? How can you avoid falling into the beginner’s rut? You need to think in higher order and look past the easy stuff. Go beyond simple memorization. And start the process of applying, analyzing, and synthesizing. Here’s how I would approach it if I only knew the basics of the topics listed above and wanted to take the next leap: I would try to tackle a problem of the following nature: Design a secure, zero-trust, multi-tenant SaaS platform (e.g., an AI app builder) that provisions dynamic subdomains (e.g., ) for thousands of users daily, with a primary emphasis on defending against cross-tenant attacks, data exfiltration, and privilege escalations. Security Focus Areas: This exercise will take you beyond the basics. You’ll learn: In the words of Henry Ford: Thinking is the hardest work there is, which is probably the reason why so few engage in it. Automate per-tenant certificates and enforce HSTS to prevent MITM attacks and ensure encrypted traffic isolation per subdomain. Secure subdomain provisioning to mitigate hijacking risks, including validation of ownership and automated cleanup of stale records. Implement MFA, token binding, and just-in-time access controls with subdomain-specific sessions to block unauthorized lateral movement. Also, ensure cookies are not scoped to . Requests between and must be considered cross-site, not same-site. Zero-Trust Architecture Fundamentals Multi-Tenant Subdomain Provisioning TLS Certificate Automation and Security HSTS Enforcement DNS Security Best Practices Token Binding Mechanisms Just-In-Time (JIT) Access Controls Subdomain-Specific Session Management Cookie Scoping and Security Cross-Tenant Attacks Privilege Escalation Defenses

0 views
devansh 1 months ago

On Learning

You won't become a better security researcher just by reading or doing easy labs. That's only one part; there are multiple aspects to it. Reading gives you instant gratification, almost creating the illusion that you understand the topic. But you don't, you just partially understand what you read. Reading about buffer overflows or XSS gives you surface-level clarity, but until you debug a crash in GDB or manually craft a payload that bypasses sanitization, you haven't really understood the underlying mechanism. Most people stop here—they “know of” the vulnerability but don’t understand it. Doing , but doing something that might not even add value. Let's say you solve a CTF challenge or a lab like HTB or Web Security Academy. The lab was easy, it gives you instant gratification (which can be addictive), and if not managed properly, you'll find yourself in a loop of just doing these easy challenges. You'll be feeding the dopamine of solving labs to your brain, but your skills will have now plateaued and stagnated. Many researchers stay stuck here. You can spend hundreds of hours on repetitive tasks and still not evolve, because your brain never faces something that truly confuses or humbles it. I follow a mixed approach of reading + spaced repetition + jumping to complex topics once I understand the basics. The more complex, the better, things the majority of people in the industry won't know about, because they never crossed that barrier of just lurking over easy topics and never truly dove into complex stuff. For instance, once you’re comfortable with memory corruption basics, dive into kernel exploitation or hypervisor bugs. When you understand basic web vulnerabilities, move toward browser exploitation or deserialization chains. The goal is to deliberately enter territory where you’re no longer comfortable, where reading isn’t enough and guessing doesn’t work. That’s where the learning curve gets steep, and meaningful. That's where mediocrity develops — in comfort. Your brain can only develop intuitions and pattern-recognition skills once you've seen enough of them. But that can also be dangerous; I've always struggled with over-learning (I still do!). I go into topics I needn't to, which creates a diversion, so you need to hold yourself accountable to stay on track. It’s easy to spiral into endless theory reading papers on mitigations or obscure architectures, but without applying them, the knowledge fades. You need to oscillate between learning and experimenting. Do test yourself - do it very often. CTFs are one way to do it; others include research work or testing out things in the wild. Analyze real-world vulnerabilities on CVE databases, read exploit write-ups, try reproducing bugs from advisories, do patch diffing, or review patches, PRs to fix bugs. When you break something unintentionally, figure out why. When you can’t break it, figure out why not . Make sure to document everything, even half-finished experiments, strange bugs, or failed ideas. Many of those “failures” become insights later when you encounter similar behavior in a different target. Most stuff you read as a beginner won't make sense to you and will feel overwhelming — and that's a natural brain response — but do make a note of them for the future. You'll thank yourself for doing that. Months later, when you revisit those notes after more hands-on work, the same papers and blog posts suddenly “click.” That’s the real dopamine — not from instant results, but from the slow realization that your brain is now capable of connecting ideas that used to confuse you. As Steve Jobs said: “You can’t connect the dots looking forward; you can only connect them looking backward. So you have to trust that the dots will somehow connect in your future. You have to trust in something — your gut, destiny, life, karma, whatever. This approach has never let me down, and it has made all the difference in my life.” And that applies perfectly to learning security. You might not see how reading about heap metadata today connects to a deserialization bug months later, or how reversing a random firmware will someday help you exploit IoT devices. But if you keep learning deeply, not just widely, the dots do connect — and when they do, that’s when real expertise begins, and you escape mediocrity. Reading gives you instant gratification, almost creating the illusion that you understand the topic. But you don't, you just partially understand what you read. Reading about buffer overflows or XSS gives you surface-level clarity, but until you debug a crash in GDB or manually craft a payload that bypasses sanitization, you haven't really understood the underlying mechanism. Most people stop here—they “know of” the vulnerability but don’t understand it. Doing , but doing something that might not even add value. Let's say you solve a CTF challenge or a lab like HTB or Web Security Academy. The lab was easy, it gives you instant gratification (which can be addictive), and if not managed properly, you'll find yourself in a loop of just doing these easy challenges. You'll be feeding the dopamine of solving labs to your brain, but your skills will have now plateaued and stagnated. Many researchers stay stuck here. You can spend hundreds of hours on repetitive tasks and still not evolve, because your brain never faces something that truly confuses or humbles it.

0 views