Dangerous Errors
Podcast Posts Presentations Synthwave About
  • ASW Recap for May 2026 Jun 5, 2026
    An Enigma machine, used for enciphering messages during World War II

    Photo by Christian Lendl on Unsplash

    The halting problem is a famous example of a decision problem in computing.

    It asserts that, given a piece of software, it’s impossible to know if appsec will ever stop making checklists about it.

    Keeping Up With the OWASP GenAI Project (ep. 381)

    Speed is the most common theme among developers and appsec teams working with LLMs and agents, from trying to keep up with patterns for deploying agents to dealing with more code at a faster pace to how the latest models impact code quality and security. The OWASP GenAI Project is helping organizations keep up with these changes and engaging the appsec community for sharing effective ways to keep systems secure. Scott Clinton shares the latest progress on the project, its roadmap for the year, and how appsec practitioners can shape its future.

    Why Basic Security Practices Still Work (ep. 382)

    If you have to ditch your entire appsec strategy because you expect 2026 to bring more vulns more quickly, then you probably didn’t have a good strategy in the first place. Rob Allen shares how the mentality of “assume breach” doesn’t have to be a defeatist attitude and can instead be a way to change a catastrophic breach into a more contained one. We also talk about proactive security and what an “avoid breach” attitude could look like, including how to apply the macro lessons of default deny and network isolation to writing secure code.

    This was a sponsored interview.

    The State of AI & AppSec (ep. 383)

    This year has seen a growing gap between long-established secure design fundamentals and the burgeoning chaos of LLM-driven vuln discovery. Keith Hoodlet returns to share his latest observations on what the recent news about Mythos, models, and harnesses means for appsec. He walks through the problems of misalignment, the potential development doom that looms behind a volume of vulns, and what modern code creation looks like. Along the way we touch on the economics of tokens and the principles behind secure software. Keith gave a preview of his upcoming presentation (May 22nd) on these topics.

    AppSec Conversations on Agents, LLMs, and OWASP from RSAC (ep. 384)

    We showcase recordings from this year's RSAC Conference.

    Scott Clinton, Co-Chair and co-founder of the OWASP GenAI Security Project, shares insights from the project’s latest research, including new landscape guides and evolving approaches to securing generative and agentic AI systems. The conversation explores critical gaps in genAI data security, the rise of agent-assisted development, and the immense growth of the OWASP community and sponsor ecosystem. Looking ahead, he outlines the most urgent risks and priorities shaping AI and agentic security in 2026.

    Then Merritt Maxim discusses how AI is affecting Identity and Access Management. Expect to hear this topic a lot throughout 2026, especially as the industry tries to figure out what’s different or special about securing agent identities.

    We close with a chat with Janet Worthington about the impact of agents on the SDLC and how orgs are updating their controls to deal with code generated by humans and LLMs alike.

  • ASW Recap for April 2026 May 1, 2026
    Golden Gate bridge

    Photo by KEITH WONG on Unsplash

    This tax season, give your org an appsec tax refund.

    Skip that list of phishing terms.

    Remove that password strength calculator.

    Hide that hardening guide.

    And instead,

    Make passkeys palatable,

    Deliver smart defaults,

    Defeat classes of vulns with a good design.

    Like Standard AppSec News, But With AI (ep. 377)

    We started off with a roundup of appsec news. Source code leak, but with AI. Supply chain compromise, but with AI. Better CMS design, but with AI.

    The AI angle is inescapable, but that doesn't mean appsec fundamentals have changed. As always, John Kinsella highlights the interesting bits and adds advice for teams figuring out how to use LLMs in their workflows.

    The axios supply chain compromise was this year's XZ Utils -- a reminder that security still needs to work on making it easier to deploy known solutions and that many of those solutions should be expected as the default state for modern software development.

    I try to curate the articles we cover each week along a common theme for discussion.

    The theme this week was my (unattainable?) wish to see appsec use vuln discovery as a motivation for building secure software that avoids classes of vulns. For example, if some code has a SQL query built with string concatenation, why not enforce a coding style and policy that requires only parameterized queries? It seems like we still wait for grep, a fuzzer, or an agent to find such patterns instead of avoiding them in the first place.
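
    As a minimal sketch of that contrast, here is the concatenated query next to its parameterized equivalent, using Python's built-in sqlite3 module. The users table, the function names, and the payload are hypothetical and exist only for illustration.

```python
import sqlite3


def find_user_unsafe(conn, name):
    # Anti-pattern: caller-supplied text becomes part of the SQL grammar.
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Parameterized: the driver binds `name` as data, never as SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


def demo():
    # Hypothetical in-memory table with two rows.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    payload = "nobody' OR '1'='1"
    return find_user_unsafe(conn, payload), find_user_safe(conn, payload)
```

    With that payload, the concatenated version matches every row, while the parameterized version looks for a user literally named "nobody' OR '1'='1" and finds none. A policy that only allows the second form removes the vuln class instead of waiting for a tool to spot each instance.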

    Securing Software's Journey with the OWASP SPVS (ep. 378)

    It’s one thing to write secure code, it’s another to release it into the wild. It's yet another thing entirely to run someone else’s code on your systems.

    Farshad Abasi and Cameron Walters created the OWASP SPVS (Secure Pipeline Verification Standard) to organize the steps and processes needed to establish a secure ecosystem for building, releasing, and maintaining software. They explain how it complements other guidance like ASVS, which focuses on the lifecycle of a specific app, and SLSA, which offers similar levels of controls for creating and consuming software artifacts.

    They also explain why they went with a full project instead of creating yet one more top 10 list (thank you), why this 1.5 version bump gained over 130 new controls because of AI (whaaat!?), and how to implement effective controls without being overwhelmed by the amount of them.

    They're also looking for more feedback and more contributors. Check out the project and see how you can help!

    The Human Aspect of Red Teams (ep. 379)

    Red team exercises set goals to see if a particular outcome can be accomplished through a simulated attack, but the ultimate outcome should be educating the org about how to improve the tools and processes that make it harder for attacks to succeed.

    Gwyddon "Data" Owen shares his experience building a red team, creating an exercise, and leveraging the results to improve security. And while the adoption of LLMs will accelerate a red team's activities, there are still plenty of foundational security controls that orgs can establish that would require a red team to be more than just fast, but fast and very, very careful.

    Top 10 Web Hacking Techniques of 2025 and a Hint for 2026 (ep. 380)

    Portswigger's list of web hacking techniques is a long-running celebration of curiosity and research from the web hacking community. James Kettle shares his thoughts on the entries from 2025 and how he expects LLMs and agents to influence what the list will look like for next year. He also shares some insights on using LLMs for his own blackbox research, giving us a peek into the work he'll be sharing at Black Hat USA this summer.

  • EmDash Emphasizes Secure Design Apr 9, 2026
    City gates of London

    Courtesy British Library (Maps K.Top.27.25)

    We covered Cloudflare’s EmDash project as an example of the kind of appsec future I’d like to see. EmDash is the “spiritual successor to WordPress” that has one very specific design choice that caught my eye – sandboxing plugins.

    You can’t look at a WordPress plugin without tripping over an XSS, SQL injection, or RCE. Their vulns are ubiquitous and boring. I think we’ve only ever covered maybe two of them on the podcast. WordPress core is relatively secure, but that feels like faint praise when the core can’t protect itself from a plague of plugins with poor security.

    Unlike WordPress, where plugins essentially execute unconstrained within the boundaries of the core, EmDash plugins must explicitly state the capabilities they need and are restricted to those capabilities. The article shows an example of a plugin that sends an email. That example highlights several positive security benefits:

    • Plugin capabilities are static and inspectable at install time. They can’t dynamically modify themselves or mutate unexpectedly.
    • Capabilities not declared are denied. It feels trite to use a phrase like, “Default deny is good” in 2026, but it’s effective and should be the expectation for any component that must have a security boundary.
    • Capabilities are simple, expressive, and granular. There are eleven right now, with human-friendly names and the contexts they grant access to.

    In WordPress, you have to go through a list of worries about every plugin. What does it actually do? What content, files, or tables does it touch? What network calls does it make? Answering those questions typically relies on grep and trust – the plugin’s reputation and popularity. And those questions need to be asked and answered on every point release considering that a trusted author’s account might be compromised to push a malicious update.

    EmDash doesn’t erase those concerns, but it definitely minimizes them. It builds much more confidence to be able to inspect a manifest for “network:fetch” or “network:fetch:any”.
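
    To make the default-deny idea concrete, here is a rough Python sketch of a static manifest with a capability check. It is not EmDash's actual API: PluginManifest, require, audit, and the "email:send" capability name are all hypothetical. Only "network:fetch" and "network:fetch:any" come from the article.

```python
from dataclasses import dataclass


class CapabilityDenied(Exception):
    """Raised when a plugin asks for something it never declared."""


@dataclass(frozen=True)
class PluginManifest:
    # Hypothetical manifest: declared once, immutable afterwards.
    name: str
    capabilities: frozenset


def require(manifest, capability):
    # Default deny: anything not declared in the manifest is refused.
    if capability not in manifest.capabilities:
        raise CapabilityDenied(f"{manifest.name} did not declare {capability!r}")


def audit(manifest, risky=frozenset({"network:fetch:any"})):
    # Install-time review: list declared capabilities that deserve a closer look.
    return sorted(manifest.capabilities & risky)
```

    A mailer plugin declared as PluginManifest("mailer", frozenset({"email:send"})) passes require(mailer, "email:send") and raises CapabilityDenied for require(mailer, "network:fetch"). The review question shifts from "what might this code do?" to "what did it declare?", which is something you can read rather than grep for.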

    (As an aside, Cloudflare also explains how their Workers, built on top of V8 isolates, are designed for performance and security. That security design also makes them “resistant to the entire range of Spectre-style attacks”, which is another wonderful example of avoiding a vuln class altogether. What I appreciate most about that write-up is how it presents a threat model, then discusses the high-level architecture and processes used to address those threats.)

    Of course, there’s some self-interest in Cloudflare creating a project like this. It highlights the security architecture they’ve invested in for their platform – which, obviously, I’m a fan of (this site runs on Cloudflare pages). EmDash also introduces first-class support for the new x402 standard for "internet-native payments." Given that x402 originated from Coinbase, its current state translates to microtransactions with blockchains and stablecoins.

    But if financial transactions are going to be part of a CMS, I’d much rather see them on a platform with the plugin security design of EmDash rather than WordPress.

    Oh, and yeah, I read through the article a few times to make sure it wasn’t an April Fools’ joke based on its publication date. It’s also coincidentally authored by two Matts – Matt “T.K” Taylor and Matt Kane. They’re not as famous as the Matt of Automattic, the company that owns WordPress.com (the commercial side) and contributes to the open-source wordpress.org, who took a quite antagonistic turn towards the open source ethos. I very much prefer the future of CMS design like what EmDash has done.

    And that’s why I like EmDash as an example of the future of appsec. It doesn’t even have to be called appsec – and likely won’t. It’s secure software engineering. It’s a project that identified common security failures, evaluated solutions, and created a design that eliminated or minimized broad types of flaws.

    I’d rather read this kind of architecture discussion than read about yet another XSS.

    (Post adapted from the original one on LinkedIn.)

  • Towards Identifying the Economics and Efficiency of Fuzzers vs. Agents Apr 6, 2026
    Departure of traveling parties in the arctic

    Courtesy British Library (1875.c.19)

    Agents and LLMs have gained favor as the method for finding flaws, but how would we measure their economics and efficiency against a decade of successful fuzzing? As methods for bug hunting, they're neither mutually exclusive nor so overlapping as to be redundant. So how would we design a process for deciding which one to run and when?

    Fuzzing has had great success! "As of May 2025, OSS-Fuzz has helped identify and fix over 13,000 vulnerabilities and 50,000 bugs across 1,000 projects."1

    I've always loved fuzzing as a way to find software quality problems. Some of those problems have security impacts, others are implementation mistakes. All of them are crashes that should be fixed. They have a high signal on the quality spectrum.

    Megacycles and Megavolume

    In the past six months or so, we've seen a big attention shift to LLMs finding flaws across open source projects from the Linux kernel to memory misuse in C-based projects to fun findings in Vim and Emacs. We've shifted from burning CPU cycles for fuzzing to burning GPU cycles for agents.

    Clearly, the UX and onboarding steps to run an agent against a codebase are far superior to using a fuzzer – write a sentence or two and you're done. I'll never diminish the importance of UX for any tool, especially in security.

    But it still makes me wonder about how to evaluate the economics and efficiency of running a fuzzer vs. an agent (or collection thereof) against a codebase. There's a one-time investment in instrumenting a project with a fuzzer, followed by much lower maintenance and letting it run. And the nature of fuzzing is more likely to trigger memory safety issues, although it still has the potential for other classes of vulns like path traversal and security boundaries with weak logic.
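
    For a sense of what that one-time investment looks like at its smallest, here is a toy mutation fuzzer in plain Python. It is a sketch, not a stand-in for a coverage-guided fuzzer like the ones behind OSS-Fuzz: parse_record is a hypothetical target with a deliberate bug, and the mutation strategy is the simplest one possible.

```python
import random


def parse_record(data: bytes) -> bytes:
    """Hypothetical target: first byte is a length, the rest is the payload."""
    if not data:
        return b""
    length = data[0]
    payload = data[1:]
    # Bug: trusts the length field instead of checking it against the payload.
    return bytes(payload[i] for i in range(length))


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Overwrite one random byte of a non-empty seed with a random value."""
    buf = bytearray(seed)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz(target, seed: bytes, iterations: int = 1000, rng_seed: int = 0):
    """Run mutated inputs through the target and collect the ones that crash it."""
    rng = random.Random(rng_seed)
    crashes = []
    for _ in range(iterations):
        candidate = mutate(seed, rng)
        try:
            target(candidate)
        except Exception:
            crashes.append(candidate)
    return crashes
```

    Calling fuzz(parse_record, b"\x03abc") returns the inputs that raised an exception, each one a reproducible test case. The harness is the part a human writes once; everything after that is cycles.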

    Megacost or Microexpense

    Is there any research on cost comparisons of fuzzing vs. LLMs? Any good papers on token costs related to running agents as code reviewers? I've tracked a few articles about CTF-style research by agents that put token costs at around $10 per run per file (with roughly 3-4 runs to guarantee a finding) and the average AIxCC costs at around $152 per competition task.

    A critical step in an evaluation of efficiency would be to normalize that cost between agents and fuzzing. Using per repo is too coarse. Per file or per LOC might be better since it's more granular. But AIxCC's per task might be best in terms of findings, assuming that "task" can be sufficiently defined. It's also important to note that AIxCC had several mixed approaches, from "AI-first with traditional validation" to "systems rooted in fuzzing...and enhanced them with LLMs."2
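
    A rough sketch of that normalization in Python follows. The only real figures are the ones quoted above ($10 per run per file, 3-4 runs per finding). The Campaign record and every number passed to it are hypothetical placeholders, not measurements.

```python
from dataclasses import dataclass


def agent_cost_per_file(cost_per_run_usd: float, runs_per_finding: int) -> float:
    # Cost of one file when several runs are needed before a finding is trusted.
    return cost_per_run_usd * runs_per_finding


@dataclass
class Campaign:
    # Hypothetical record of one bug-hunting effort, fuzzer or agent.
    label: str
    total_cost_usd: float
    files: int
    lines_of_code: int
    tasks: int

    def per(self, denominator: str) -> float:
        # Divide the same total by different denominators to compare granularity.
        units = {
            "file": self.files,
            "loc": self.lines_of_code,
            "task": self.tasks,
        }[denominator]
        if units <= 0:
            raise ValueError(f"no {denominator} units recorded for {self.label}")
        return self.total_cost_usd / units
```

    With the quoted figures, agent_cost_per_file(10.0, 3) and agent_cost_per_file(10.0, 4) give a range of $30 to $40 per file. The same total cost looks very different depending on whether per("file"), per("loc"), or per("task") is the denominator, which is why the choice has to be settled before comparing fuzzers and agents.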

    I'd love to find any updated resources or references on this economic aspect of agents.3 Let me know where I should be looking!

    Discovery vs. Analysis

    I noted that fuzzers have high signal. If they cause an app to crash, that's a bug to be fixed. But that bug isn't necessarily one that impacts security (aside from the generic availability problem of crashing). LLMs, on the other hand, have the potential to craft an exploit for a bug. Having an exploit adds context that helps prioritize and better understand the consequences of a bug.

    But here's where I'd also distinguish what audience is taking in that context and what action they expect to take. There's a difference between an org trying to figure out how to keep thousands of dependencies up to date and a project owner improving their own code quality. Not that project owners don't have their own priorities and time pressures, but sometimes a bug takes less time to fix than it does to thoroughly analyze it.

    I'd rather development teams focus on fixing bugs and refactoring their architecture to reduce their attack surface and eliminate classes of vulns. Yet I begrudgingly acknowledge that security teams want some sort of analysis about bugs. Not that they always need such an analysis, but they sure seem to want them from CVEs.

    Where I'll most closely watch the discovery vs. analysis distinction is in the Linux kernel. The kernel devs have a very specific attitude towards bugs:

    ...due to the layer at which the Linux kernel is in a system, almost any bug might be exploitable to compromise the security of the kernel, but the possibility of exploitation is often not evident when the bug is fixed. Because of this, the CVE assignment team is overly cautious and assign CVE numbers to any bugfix that they identify.

    Criteria and Considerations

    So after all this preamble, which one wins? What does it even mean to win? How do we design a process that's cost-effective and efficient with fuzzers and agents and tokens and processors?

    I no longer toy with the hypothesis that fuzzing is cycle-for-cycle more cost-effective than agents at discovering bugs. It feels like the ship has sailed in terms of agents being embraced for security.

    Thus, I'll combine discovery and analysis and reframe the question to, "What's a cost-effective method of identifying software quality issues?"

    • What are the operator costs to establish and maintain a harness for fuzzing, for agents?
    • What's the operator UX for using a fuzzer, an agent?
    • What are the compute costs (CPU, GPU, tokens) to execute a fuzzer, an agent?
    • What is the cost per LOC? Is LOC even a good denominator?
    • What constraints detract from either approach's success? e.g. a fully compiling app for fuzzers vs. a single file (or even PR!?) for agents, code complexity that fuzzing is mostly agnostic to vs. context windows for agents, programming languages and compiled vs. interpreted languages

    Notably, my personal criteria care less about volume (although I do care about variety of vuln classes), because I want to avoid the trap of maximizing a CVE count. Chasing vulns without a strategy to avoid them is just BugOps.

    Quality as Consequence

    All of the previous criteria are about economics and efficiency of finding bugs. But my actual success criterion boils down to a simpler motivating question:

    "What fosters better software design that improves code quality and reduces the prevalence of vuln classes?"

    Finding bugs is important, exploiting them can be fun, but for me the most rewarding thing is preventing them in the first place.

    (Post adapted from my original one on LinkedIn on April 6, 2026.)


    1. From the "Trophies" section at https://github.com/google/oss-fuzz. Sadly, it hasn't been updated for 2026.

    2. AIxCC finals: Tale of the tape

    3. Two articles that seeded this idea were an OpenSSL flaw and OSS-Fuzz incorporating LLMs.

  • ASW Recap for March 2026 Apr 3, 2026
    a person wearing a rabbit costume standing in front of a shed

    Photo by Ksenia Yakovleva on Unsplash

    March meandered through C code, mused about secure design, marked a new top ten list, made space for machines, and finally descended into a bit of madness. And every single moment was fun!

    Modern AppSec That Keeps Pace with AI Development (ep. 372)

    As more developers turn to LLMs to generate code, more appsec teams are turning to LLMs to conduct security code reviews. One of the biggest themes in all the discussion around LLMs, agents, and code is speed — more code created faster. And with that volume comes more vulns.

    James Wickett shares why speed continues to pose a challenge to appsec teams and why that’s often because teams haven’t invested enough in foundational appsec principles.

    One of the traps in appsec is getting caught up in the volume of security bugs and conflating an acceleration of finding and fixing those vulns with a security strategy. Proactive security that eradicates vuln classes (or makes them very hard to introduce) is always going to be more effective than eternally chasing individual bugs.

    Making Medical Devices Secure (ep. 373)

    Medical devices are a special segment of the IoT world where availability and patient safety are paramount. Tamil Mathi explains why many devices need to fail open — the opposite of what traditional appsec approaches might initially think — and what makes threat modeling these devices interesting and unique. He also covers how to get started in this space, from where to learn hardware hacking basics to reviewing firmware and moving up the stack to the application layer.

    This is one of those episodes that highlights the breadth of industries that appsec covers and why context about the intention of features and the needs of users is so important to threat modeling. Having to design a device where availability is paramount and critical to patient safety requires different tradeoffs than reducing the latency and protecting payment credentials for an online purchase.

    Creating Better Security Guidance and Code with LLMs (ep. 374)

    What happens when secure coding guidance goes stale? What happens when LLMs write code from scratch? Mark Curphy walks us through his experience updating documentation for writing secure code in Go and recreating one of his own startups.

    One of the themes of this conversation is how important documentation is, whether it's intended for humans or for prompts to LLMs. Importantly, LLMs don't innovate on their own – they rely on the data they're trained on. And that means there should be good authoritative sources for what secure code looks like. It also means that instructions to LLMs need to be clear and precise enough to produce something useful.

    This was also fun because Mark did a live demo where he prompted an agent to recreate one of his startups – going from a $20 million investment to less than $50 in tokens!

    Why Proactive Security Is Far Better Than Patching (ep. 375)

    So much (too much!) of appsec’s efforts are consumed by vuln management and a race to patch security flaws. I see that as a symptom of the ease of scanning and the volume of CVEs. It’s something that’s easy to measure, both in finding and fixing, but easy to measure turns into an easy distraction. Erik Nost walks through the principles behind proactive security, why the concept sounds familiar to secure by design, and why organizations still struggle with creating effective practices for visibility.

    I don’t think it’s a waste of time to find flaws in code, but I wish the appsec conversations were more heavily weighted towards identifying root causes and sharing software patterns for preventing those flaws in the first place. Which is basically saying we need better ways to emphasize and evaluate secure designs.

    Developing the Skills Needed for Modern Software Development (ep. 376)

    The future of secure software is going through a mix of skills expected of humans and skills files created for LLMs. We might even posit that appsec as a discipline will fade (and that might not even be a bad thing!). Keith Hoodlet describes the skills he was looking for in building teams of security researchers and why there's still an emphasis on the ability to learn about and understand how software is built.

    But figuring out what skills will get you hired and what skills are valuable to invest in still feels daunting to new grads and others entering the security industry. We discuss where the role of appsec seems to be heading and a few of the security and software fundamentals that can help you follow that direction.



Cybersecurity and more | © Mike Shema