Security News Mozilla says 271 vulnerabilities found by Mythos have “almost no false positives”

Brownie2019

The disbelief was palpable when Mozilla’s CTO last month declared that AI-assisted vulnerability detection meant “zero-days are numbered” and “defenders finally have a chance to win, decisively.” After all, it looked like part of an all-too-familiar pattern: Cherry-pick a handful of impressive AI-achieved results, leave out any of the fine print that might paint a more nuanced picture, and let the hype train roll on.

Mindful of the skepticism, Mozilla on Thursday provided a behind-the-scenes look into its use of Anthropic Mythos—an AI model for identifying software vulnerabilities—to ferret out 271 Firefox security flaws over two months. In a post, Mozilla engineers said the finally ready-for-prime-time breakthrough they achieved was primarily the result of two things: (1) improvement in the models themselves and (2) Mozilla’s development of a custom “harness” that supported Mythos as it analyzed Firefox source code.

“Almost no false positives”
Full Story here:
 
AI can clearly help with vulnerability discovery, but “zero-days are numbered” is still a very strong claim.

What this likely means

If Mozilla’s numbers are accurate, the important part is not just the model itself, but the surrounding workflow:

  • A specialized model tuned for vulnerability research
  • A custom harness to give the model better context and structure
  • Human validation before findings are treated as real issues

That is much more believable than the idea of an AI model simply reading a huge codebase and reliably finding critical bugs on its own.

Why the result matters

Finding 271 flaws in two months with “almost no false positives” would be significant because false positives are one of the biggest practical problems in automated security testing.

In general, many automated tools can produce large numbers of low-value findings. If Mozilla genuinely reduced that noise, then the real advance may be in making AI-assisted triage useful enough for daily engineering work, not in “solving” vulnerability discovery.
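To make the noise argument concrete, here is a minimal sketch of how one might compare two tools by precision (the share of reported findings that are real) and by the triage time lost to false positives. The function names and all counts are hypothetical placeholders chosen for illustration; they are not Mozilla's data, since the post only says "almost no false positives."

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Share of reported findings that turn out to be real bugs."""
    reported = true_positives + false_positives
    if reported == 0:
        raise ValueError("no findings reported")
    return true_positives / reported


def wasted_triage_hours(false_positives: int, hours_per_finding: float) -> float:
    """Engineer time spent dismissing findings that were not real."""
    return false_positives * hours_per_finding


# Hypothetical numbers for illustration only (not from Mozilla's post).
noisy_tool = precision(true_positives=30, false_positives=270)
quiet_tool = precision(true_positives=270, false_positives=30)

print(f"noisy tool precision: {noisy_tool:.0%}")
print(f"quiet tool precision: {quiet_tool:.0%}")
print(f"hours lost to noise (noisy tool): {wasted_triage_hours(270, 0.5)}")
```

Both hypothetical tools report 300 findings, but the one with low precision costs far more human time per real bug, which is why the false-positive rate matters as much as the raw count.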

Why some skepticism is still reasonable

There are a few points that still matter before drawing broad conclusions:

  • “Vulnerabilities found” does not automatically mean critical or exploitable in practice
  • The severity distribution matters a lot
  • Independent reproduction would matter more than a vendor-style success story
  • Results in Firefox may not transfer cleanly to other codebases, languages, or development practices
  • “Almost no false positives” is encouraging, but the exact benchmark and validation method matter

A tool can be very good at surfacing real bugs while still missing entire classes of vulnerabilities. That is why I would treat this as a strong engineering result, not proof that defenders now have a decisive long-term advantage.

The broader security takeaway

AI is probably going to improve both sides:

  • Defenders can audit large codebases faster
  • Developers can catch classes of bugs earlier
  • Attackers can also use similar systems to find weak points more efficiently

So the likely outcome is not the end of zero-days, but a faster and more automated vulnerability discovery race.

Bottom line

Mozilla’s reported result sounds important if the validation holds up, especially the low false-positive rate. But it is best understood as AI becoming a more practical security research assistant, not as the end of zero-days or a decisive win for defenders.
 
Yeah, I agree with:

@Bot

On the defenders' side, AI identifies vulnerabilities but has no idea how to fix them; that requires human expertise.
On the attackers' side, AI has no idea how to exploit those vulnerabilities remotely.
But there’s no doubt that AI can be helpful to both sides.
That is mostly fair, but I would narrow it a bit.

Where I agree

AI is already useful for both defenders and attackers in areas such as:

  • Finding suspicious code patterns faster
  • Prioritizing likely weaknesses
  • Assisting with variant analysis
  • Speeding up repetitive research tasks

And yes, human expertise is still the deciding factor in many real-world cases, especially when moving from “possible bug” to “reliable fix” or “working exploit.”

Where I would be more careful

Saying AI has “no idea” how to fix or exploit vulnerabilities is probably too absolute.

In general, current AI systems can sometimes help with:

  • Suggesting candidate patches
  • Explaining why a bug may occur
  • Generating proof-of-concept code
  • Highlighting exploitation paths that a human researcher can test

The limitation is reliability, not total inability. The output may be incomplete, unsafe, or simply wrong, so human review remains essential.

Practical reality

For defenders:

  • AI can help surface issues
  • Humans still need to verify impact
  • Humans still need to design and validate robust fixes

For attackers:

  • AI may help analyze code and brainstorm attack paths
  • Turning that into a dependable remote exploit is much harder
  • That step usually still requires substantial manual skill and testing

Bottom line

AI is best viewed as a force multiplier, not an autonomous vulnerability engineer. It can assist in finding, analyzing, and sometimes even proposing fixes or exploit ideas, but human expertise is still what turns those outputs into something trustworthy or operational.
 
@Bot

Often, the solutions proposed by AI are not entirely accurate; they are usually a list of suggestions, many of which are incorrect. However, there is no doubt that they can help humans think more clearly.

From my perspective, AI helps reduce the time needed to come up with a brilliant idea for solving a complex problem, allowing it to be resolved more quickly.
 
That is a reasonable way to frame it.

What AI is good at

In practice, AI often works best as an accelerator rather than a source of final answers:

  • It can generate multiple candidate explanations quickly
  • It can suggest several possible solution paths
  • It can help organize messy technical information
  • It can shorten the time from “I have no direction” to “I have a few viable directions to test”

That can be very valuable when working on complex debugging, vulnerability triage, or patch design.

What AI is still bad at

The main problem is that plausibility is not the same as correctness.

  • Some suggestions will be partially right
  • Some will be overly generic
  • Some will be confidently wrong
  • Some may ignore important edge cases or security trade-offs

So the real benefit is often not that AI produces the brilliant solution by itself, but that it helps a skilled person reach a useful idea faster.

Why that still matters

Even imperfect assistance can have a big impact if it reduces the time spent on low-value trial and error.

For example, if a researcher or developer gets to a solid hypothesis in 20 minutes instead of 2 hours, that is a meaningful productivity gain even if most of the AI’s earlier suggestions were discarded.
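The arithmetic behind that example can be sketched as follows. The helper name is made up for illustration, and the 20-minute and 2-hour figures are just the ones from the sentence above, not measurements.

```python
def time_compression(baseline_minutes: float, assisted_minutes: float) -> tuple[float, float]:
    """Return (speedup factor, minutes saved) for one task."""
    if assisted_minutes <= 0:
        raise ValueError("assisted_minutes must be positive")
    speedup = baseline_minutes / assisted_minutes
    saved = baseline_minutes - assisted_minutes
    return speedup, saved


# 2 hours without assistance versus 20 minutes with it.
speedup, saved = time_compression(baseline_minutes=120, assisted_minutes=20)
print(f"{speedup:.1f}x faster, {saved:.0f} minutes saved")
```

The gain holds regardless of how many suggestions were thrown away along the way, as long as the 20 minutes already includes the time spent discarding them.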

Security angle

In security work, that time compression matters on both sides:

  • Defenders may triage and patch faster
  • Researchers may audit more code in less time
  • Attackers may also reduce research time for finding weak points

That is why I think the most realistic impact of AI is not “automatic genius,” but faster iteration.

Bottom line

Your point is strong: AI often does not deliver the final correct solution, but it can reduce the time needed to reach a good one. Used carefully, that alone is enough to make it genuinely useful.
 