Heather Adkins, Google’s vice president of security, announced Monday that its LLM-based vulnerability researcher Big Sleep found and reported 20 flaws in various popular open source software.
I mean, if these tools help catch any issues in an automated fashion, that’s still a win.
The false positive rate makes them a net loss.
https://daniel.haxx.se/blog/2025/07/14/death-by-a-thousand-slops/
That article isn’t referring to the specific system Google is using, so we don’t know what the false positive rate is.
Uh pretty high if it’s an LLM
That’s not a given.
It’s literally the 2nd paragraph lmao
Heather Adkins, Google’s vice president of security, announced Monday that its LLM-based vulnerability researcher Big Sleep found and reported 20 flaws in various popular open source software.
what specifically do you think this paragraph says lmao
But it is likely.
It really depends on how their particular system is set up. You’re just making sweeping vibe-based statements without any evidence to support them.
Yeah, like maybe this is one of those AIs that is actually just a guy in the Philippines being paid shit wages. Or maybe it’s a dumb LLM that makes lots of mistakes. Or maybe it’s all just bullshit from TechCrunch where an underpaid journalist is just recycling a fucking press release from Google and none of this actually happened anything like how it’s written.
Or maybe new technology actually has valid applications despite the hype associated with it.
They found 20 issues, but how many hours were spent filtering out the false positives?
We don’t know; however, if these are security-related issues, then it doesn’t matter. The cost of a breach would obviously be higher.
Compare to the cost of humans finding them the normal way, not whatever breach you’re imagining.
Clearly the humans didn’t find them the normal way, because they wouldn’t be there to be found otherwise, would they?
The last time Google did a media run about DeepMind finding bugs, it related to a vulnerability on a dev branch that hadn’t been deployed yet (and was not likely to have been deployed with the vulnerability).
So it found a vulnerability in the code it was given. 🤷
I don’t think anyone is suggesting that it is impossible for an LLM to find any vulnerabilities?
But right now we are specifically discussing the costs of a breach, and your post that I responded to specifically relied on a bug not being identified by a person.
The discussion isn’t whether an LLM can identify bugs, it’s whether it can do so in a useful way. In the single previous example, it was not useful.
But similar to the last time, it is likely that the limited utility will not be known until well after the breathless reporting on how amazing AI is.
In the example you provided, it found a vulnerability, which is useful, but they didn’t point it at production code. The vulnerability might have been found by other tests and code reviews or it might have not been. The question of whether it’s valuable or not really depends on what sort of code we’re talking about and what the cost of missing a vulnerability would be.
All I’m saying here is that AI is just another tool that helps find bugs. People here freaking out over the idea that there might be legitimate uses for AI is kind of hilarious to be honest.
We don’t know the details yet. Maybe they have a great new tool; perhaps they picked projects that are not maintained so well.
It will be awesome if they found bugs in curl, not so good to show if they picked my project.
What they did will be revealed in time
I’m sure we’ll get more info in due time.
Yes, hopefully in a couple of weeks