The Anthropic Red Team has a blog post out titled Evaluating and mitigating the growing risk of LLM-discovered 0-days. It’s a fine post and worth a read. The TL;DR is that LLMs can find security vulnerabilities. I spoke with Joshua Rogers on Open Source Security about this very topic back in October. The TL;DR was the same back then too.

I don’t think the top of the Anthropic post, the part that explains how this all worked, is the most interesting. It’s clever, sure, but the real questions are in the last two sections of the post.

The second-to-last section talks about Safeguards. They’re going to try to prevent misuse. This is probably impossible to do. Even if they found a way to only let the “good guys” in, there are plenty of other model providers, and there are open models. The open models are honestly becoming pretty good. I’ve given the new Qwen coder a spin and it’s not bad. More on that in a future post. But the point of it all is that the bar for any sort of automation to be better than a human at code review or vulnerability research is pretty low. If you’ve ever tried to review code for security vulnerabilities you know what I mean. It’s extremely difficult for too many reasons to list in this post.

The last section is the one that deserves the most attention. This quote:

Looking ahead, both we and the broader security community will need to grapple with an uncomfortable reality: language models are already capable of identifying novel vulnerabilities, and may soon exceed the speed and scale of even expert human researchers.

This has probably already happened. Given the complexity of all this, you want a human in the loop to avoid spamming the world with slop.

And the second-to-last paragraph:

At the same time, existing disclosure norms will need to evolve. Industry-standard 90-day windows may not hold up against the speed and volume of LLM-discovered bugs, and the industry will need workflows that can keep pace.

This one is a bag of snakes. The 90-day window is already shorter than a lot of organizations can actually handle. But that isn’t what concerns me the most. I want to consider some second-order effects.

Most open source projects are already running on the brink of burnout. A lot are banning AI-generated PRs because of the level of slop. Sure, some AI PRs are OK, but a lot aren’t. This isn’t the fault of the AI, it’s the fault of the people using the AI. They are painting pictures with their own poop and demanding we stop convulsing as they shove their pictures in our faces. As you try to banish that delightful image from your mind, consider this: if there’s a sudden influx of security vulnerabilities, even if the reports are 100% correct, that’s going to create a lot of work, and I mean A LOT. Patching security vulnerabilities is a lot harder than patching normal bugs or adding features. There’s a high level of mental anguish that comes with security patches.

And then there’s everything downstream of the open source projects. If we suddenly have hundreds of open source security updates every week, we don’t know how to handle that level of change. Even if you’re an AI idealist, even with an infinite token budget, that pace of change isn’t going to be sustainable.

Are we all doomed?

Probably not. Here’s what I think is going to happen:

There will be a glut of slop security reports, but those reporters are going to end up banned from open source projects. The amount of work needed to properly report a security bug is far more than anyone willing to submit a slop report can tolerate. A few might figure it out, but I don’t think most will.

But there are researchers who can and will carry the water and chop the wood. Folks like Joshua Rogers will do the work needed, help with patches, be patient with overworked projects, and help improve the process as they go. It’s going to take time, and it’s going to be a slog. But this has always been true. The smart people will figure out how to deal with this.

Two roads diverged in a wood - one is filled with slop, and the other is filled with probably bears.