Andy Ful
From Hard_Configurator Tools
Post updated in April 2025.
There are about 300,000 new malware threats every day (Windows OS).
Let's consider an initial pool of 30,000 sufficiently different in-the-wild malware variants and a particular AV that failed to detect 100 of them.
Next, we run a trial: choose 380 samples at random from these 30,000 and calculate the probabilities of finding 0, 1, 2, or 3 undetected malware samples among them.
m=30000
n=380
k=100
As can be easily calculated, the probability of finding x = 0, 1, 2, 3, ... undetected malware samples is given by the hypergeometric distribution:
p( x ) = B( m - k , n - x ) * B( k , x ) / B( m , n )
where B( p , q ) is the binomial coefficient.
After some simple calculations, we have:
p(x) = ( m-k )! * k! * ( m - n )! * n! / [ x! * ( k - x )! * ( n - x )! * ( m - k - n + x )! * m! ]
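The exact formula can be checked numerically. A minimal Python sketch, using the example values m, n, k from above:

```python
from math import comb

# Pool of m in-the-wild samples, k of them missed by the AV,
# n samples drawn for the test (values from the example above).
m, n, k = 30000, 380, 100

def p_exact(x):
    """Exact hypergeometric probability of drawing exactly x missed samples."""
    return comb(m - k, n - x) * comb(k, x) / comb(m, n)

for x in range(4):
    print(f"p({x}) = {p_exact(x):.2f}")
```

Using comb avoids the huge intermediate factorials from the expanded formula, but both give the same result.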
For sufficiently large numbers of samples in the wild ( m >> k , n ) and a small number of missed samples ( x << n ), the function p(x) depends on the infection rate ( r = k/m ) and the number of tested samples ( n ):
p( x ) ~ B( n , x ) * r ^ x * (1 - r ) ^ ( n - x )
So, increasing the number of in-the-wild samples does not significantly change the probabilities if the infection rate k/m stays the same and m is large enough.
Here are the results of calculations for the number of tested samples n=380, infection rate k/m=1/300, and x = 0, 1, 2, 3:
p(0)=0.28
p(1)=0.36
p(2)=0.23
p(3)=0.1
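These values can be reproduced from the binomial approximation. A minimal Python sketch:

```python
from math import comb

n = 380        # number of tested samples
r = 1 / 300    # infection rate k/m

def p_approx(x):
    """Binomial approximation: B(n, x) * r^x * (1 - r)^(n - x)."""
    return comb(n, x) * r**x * (1 - r)**(n - x)

for x in range(4):
    print(f"p({x}) = {p_approx(x):.2f}")
```

The printed values round to 0.28, 0.36, 0.23, and 0.10, matching the list above.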
These probabilities show that one particular AV can have a different number of undetected malware (0, 1, 2, 3, ...) when we preselect a smaller pool of samples from the much larger set.
We can compare these probabilities with the results of the AV-Comparatives Real-world test (July-August 2020):
4 AVs with 0 undetected malware
5 AVs with 1 undetected malware
3 AVs with 2 undetected malware
1.5 AVs with 3 undetected malware (I added 0.5 AV for Norton)
www.av-comparatives.org
We can calculate the ratios of the probabilities and numbers of AVs for the particular numbers of undetected malware:
p(0)/p(1) = 0.79 ~ 4 AVs/5 AVs
p(0)/p(2) = 1.2 ~ 4 AVs/3 AVs
p(1)/p(2) = 1.6 ~ 5 AVs/3 AVs
p(0)/p(3) = 2.9 ~ 4 AVs/1.5 AVs
p(1)/p(3) = 3.7 ~ 5 AVs/1.5 AVs
p(2)/p(3) = 2.4 ~ 3 AVs/1.5 AVs
etc.
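The comparison of probability ratios with AV-count ratios can be sketched in Python (probabilities from the approximation above, AV counts from the AV-Comparatives test):

```python
# Probabilities for x = 0..3 undetected samples (binomial approximation)
# and the number of AVs with that score in the July-August 2020 test.
p = {0: 0.28, 1: 0.36, 2: 0.23, 3: 0.10}
avs = {0: 4, 1: 5, 2: 3, 3: 1.5}

for i in range(4):
    for j in range(i + 1, 4):
        print(f"p({i})/p({j}) = {p[i] / p[j]:.1f}  ~  "
              f"{avs[i]} AVs/{avs[j]} AVs = {avs[i] / avs[j]:.1f}")
```

Each pair of ratios lands close together, which is the point of the comparison.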
As we can see, the AV-Comparatives test results for AVs with 0, 1, 2, or 3 undetected malware are very close to the results of random trials for one particular AV.
This means that F-Secure, G-Data, Panda, TrendMicro, Avast, AVG, BitDefender, Avira, Eset, K7, Microsoft, and Norton could have the same number of undetected malware in the wild (July and August). Yet, purely by statistics, they would still score different numbers of undetected samples in the July-August test.
Conclusion.
One test with 380 malware samples is not especially reliable for a period of two months.
Even if the in-the-wild malware detection is the same for any two AVs, one can easily score 0 undetected malware and the other 2.
Edit 1.
Post shortened. Added the approximate formula for p(x) and used it to calculate probabilities (instead of the exact formula).
Edit 2.
We do not know exactly how the AV labs choose the malware samples. Most probably, they choose the test samples from large feeds (over 300,000 suspicious and malicious threats per day) and then remove some morphed samples of the same malware. If so, the approximate formula for p(x) is very accurate.
An example of such a malware feed:
www.mrg-effitas.com
Edit 3.
Corrected a typo in the formula for p(x).