Andy Ful
Many people think so. I also thought it was a simple thing to understand. But I was wrong.
I realized that an AV with a lower detection result (static + dynamic) can give better protection than another AV with a higher detection result.
It sounds like a paradox, but please read the post first, and then decide for yourself.
.
The possibility of using Artificial Intelligence (AI) locally and in the cloud (extended AI) has made AV testing more complicated than it was a few years ago. AI can recognize suspicious behavior on an infected computer and create a malware (postinfection) signature. The more signatures it creates, the smarter it gets and the better its behavior monitoring becomes, so better signatures are created in a shorter time. For example, the BitDefender 2017 Anti-Ransomware module can create a postinfection signature in the cloud within about 30 minutes. It then takes approximately 3 hours to propagate it via product update to all endpoints not connected to the cloud. So, in the BitDefender 2017 AV cloud, 0-day ransomware in the wild will live only up to about 30 minutes (thanks to the postinfection signature), compared to several hours in standard AV clouds.
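To make the timing easier to follow, here is a tiny Python sketch using the figures quoted above; the 'standard AV cloud' lifetime of several hours is only my assumption for comparison:

```python
# Rough sketch of the exposure windows described above.
# The timings are the approximate figures quoted for BitDefender 2017;
# the "standard cloud" value is an illustrative assumption.

SIGNATURE_CREATION_MIN = 30     # cloud AI creates the postinfection signature
UPDATE_PROPAGATION_MIN = 180    # product update reaches endpoints not connected to the cloud
STANDARD_CLOUD_MIN = 8 * 60     # assumed lifetime of 0-day ransomware in a "standard" AV cloud

def exposure_window(connected_to_cloud: bool) -> int:
    """Minutes a 0-day sample can stay undetected for a given endpoint type."""
    if connected_to_cloud:
        return SIGNATURE_CREATION_MIN
    return SIGNATURE_CREATION_MIN + UPDATE_PROPAGATION_MIN

print("cloud-connected endpoint:", exposure_window(True), "min")    # ~30 min
print("offline endpoint:        ", exposure_window(False), "min")   # ~210 min
print("standard AV cloud:       ", STANDARD_CLOUD_MIN, "min")       # several hours
```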
In this scenario (AI included), infecting one of the computers connected to the AV cloud before the test can have a substantial influence on the AV's malware detection afterward. It follows from the fact that postinfection signatures can be created by the AI in the cloud within minutes.
.
The conclusion is that an actual real-world 0-hour (0-day) protection test should:
- Contain only 'true real-world samples' that have already infected a computer in every tested AV cloud (hard to achieve in practice).
- Make use of repeated tests.
In repeated tests, the malware samples that bypassed AV protection are checked again after some minutes. This procedure should be repeated for all undetected samples. For some samples, several repetitions will be required, up to one hour (for 0-hour tests). Samples that bypass AV protection in all repetitions are counted as undetected (valued as 0); otherwise they are counted as partially detected (valued as something between 0 and 1).
The malware detection will be valued slightly less than 1 when the sample is detected just a few minutes after the first test.
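Here is a minimal Python sketch of that repeated-test scoring idea; the detect() stub and the particular partial-detection weighting are only my illustration, not any testing lab's formula:

```python
# Sketch of the repeated-test scoring described above.
# detect(sample) stands for the lab's detection check; the weighting
# 1 - k/(max_retries + 1) is a made-up example of a partial value.

def score_sample(detect, sample, max_retries=6):
    """Return 1.0 (detected), 0.0 (undetected), or a partial value in between."""
    if detect(sample):
        return 1.0                          # detected at the first attempt
    for k in range(1, max_retries + 1):
        # wait some minutes, then re-check (the sleep is omitted in this sketch)
        if detect(sample):
            return 1.0 - k / (max_retries + 1)   # detected after k re-checks
    return 0.0                              # bypassed protection in every repetition

# Hypothetical stub: a sample that starts being detected after the 3rd re-check,
# i.e. the cloud AI needed roughly half an hour to create the signature.
def make_detect_stub():
    calls = {"n": 0}
    def detect(sample):
        calls["n"] += 1
        return calls["n"] > 3
    return detect

print(score_sample(make_detect_stub(), "sample.exe"))   # ~0.57, partially detected
```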
.
Why some AVs score high in tests.
A BitDefender example for a 0-hour real-world malware test:
1. The non-standard signatures of 'malware caught by BitDefender AI on customers' computers' are excellent.
2. The standard (before execution) BitDefender signatures are excellent.
3. The non-signature protection is excellent.
In this case, the result of a standard 0-hour protection test will be excellent.
Also, because BitDefender has decent standard signatures, real-world test samples that happen to be new for BitDefender are very rare (if any exist at all).
.
A Microsoft Defender example for a 0-hour malware test:
1. The non-standard signatures of 'malware caught by Defender AI on customers' computers' are (probably) excellent.
2. The standard (before execution) Defender signatures are poor.
3. The non-signature protection is (probably) excellent.
4. The number of test samples that require Defender AI to create postinfection signatures can be statistically significant.
Point 4 is the key factor behind the bad opinion about Defender. If Defender could block 99.99% of malware samples, the AI would need to create postinfection signatures very rarely (for 0.01% of samples). As we know, this is not the case, and Defender does not block malware that efficiently in known tests. That is why postinfection signatures are important for Defender.
.
The detection results (static + dynamic) will be lower for the Defender cloud than in reality, because the samples from point 4 are not properly valued in standard tests. They are always counted as undetected, but they should be counted as partially detected, because some minutes later that malware will mostly be stopped in the Defender cloud (the AI creates a postinfection signature).
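To show what counting 'partially detected' samples would change in practice, here is a toy calculation; the 1000-sample split and the 0.9 partial value are invented numbers, not test data:

```python
# Invented numbers, only to show how the two counting rules differ.
blocked_immediately = 970    # detected at first contact
caught_minutes_later = 25    # stopped after the AI creates a postinfection signature
never_caught = 5
total = blocked_immediately + caught_minutes_later + never_caught

# Standard rule: delayed detections count as fully undetected.
standard_score = blocked_immediately / total

# Proposed rule: delayed detections count as partially detected (assumed value 0.9).
partial_value = 0.9
adjusted_score = (blocked_immediately + partial_value * caught_minutes_later) / total

print(f"standard counting:      {standard_score:.1%}")   # 97.0%
print(f"with partial detection: {adjusted_score:.1%}")   # ~99.2%
```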
I used the phrase '(probably) excellent' because the protection related to Defender AI (locally and in the cloud) was not directly measured (even though in theory it should work excellently). On the other hand, the analogous Anti-Ransomware protection in BitDefender 2017 works excellently, so '(probably) excellent' for Defender is not just a baseless assumption. One should also bear in mind that Defender can use Windows' massive telemetry about system events, and Microsoft has enormous resources to make Defender AI the best.
.
The 0-hour malware files have an important influence on 0-day detection scores, because the rate of infection is highest in the first few hours after the malware appears in the wild (about 30% of infections happen in the first 4 hours).
Furthermore, when you read the malware test reports, you can see substantial differences between AVs in 0-day detection and very small differences in the detection of malware discovered over a period of some weeks. So in fact, the better the 0-day result, the better the combined, final AV score.
Generally, AVs that use postinfection telemetry to create malware signatures on the fly will have underestimated protection scores, and the underestimation grows the shorter the time needed to create a signature and the weaker the standard signatures. In real life, weaker standard signatures are compensated by the postinfection signatures created by AI.
.
So, are standard AV detection tests useless? I do not think so.
Let's take a Microsoft Defender example for one-month-old malware samples:
1. The non-standard signatures of 'malware caught by Defender AI on customers' computers' are (probably) excellent.
2. The standard (before execution) Defender signatures are good (better than at 0-hour or 0-day).
3. The non-signature protection is (probably) excellent.
4. The number of test samples that require Defender AI to create postinfection signatures is not statistically significant.
5. The one-month detection result should be close to excellent, because the number of malware samples that should be counted as partially detected (instead of undetected) is no longer statistically significant. Also, the one-month standard Defender signatures are better than the 0-hour or 0-day ones.
As an example, let's take the AV-Test one-month detection results (static + dynamic, averaged over a one-year interval, October 2016 - October 2017):
BitDefender, F-Secure, Kaspersky, McAfee, Norton, Trend Micro: ~ 100%
Avast, Avira, ESET, Microsoft Defender: ~ 99.7% - 99.9%
https://www.av-test.org/en/antivirus/home-windows/
.
As a second example, let's take the AV-Comparatives one-month real-world detection results (static + dynamic, averaged over 4 months, July - October 2017, with user-dependent actions counted as infected):
Avira, BitDefender, F-Secure, Kaspersky, Norton, Trend Micro: ~ 99.6% - 100%
Avast, ESET, McAfee, Microsoft Defender: ~ 98.9% - 99.5%
Real-World Protection Test - AV-Comparatives
.
The differences are very small, and one of the best AVs in the AV-Test scoring (McAfee) got the worst result in the AV-Comparatives scoring. Also, BitDefender, F-Secure, Kaspersky, Norton, and Trend Micro have excellent scores in both tests.
One should look at other protective factors in order to differentiate AVs, especially anti-0-day capabilities (like CyberCapture in Avast, 'Application Control' in Kaspersky, 'Advanced Threat Control' in BitDefender, DeepGuard in F-Secure, etc.) and other unique features.
.
It would be interesting to compare Avast cloud with Defender cloud.
In Avast 2017, the CyberCapture feature was introduced to fight 0-day malware. If an EXE or DLL file is recognized as suspicious, CyberCapture locks the file and can upload it to the cloud for detailed analysis in a controlled environment. After the analysis is finished, which can take up to about 2 hours, the file is unlocked; if it is malicious, a signature is created on the fly and Avast quarantines the file. That feature gave Avast excellent detection results, as good as those of the best AVs. CyberCapture is an idea to protect all users at the price of some inconvenience (the file is locked for up to 2 hours). CyberCapture does not create postinfection signatures (the file stays locked until the analysis is finished).
.
Defender in Windows 10 adopted the well-known military solution: protect millions by sacrificing a few. The time for cloud analysis was shortened to 10 seconds (it can be extended to 60 s). If the file is recognized as dangerous, a malware signature is created. After this, the file is unlocked and is either quarantined (infected) or executed (not infected). Such analysis is not as deep as Avast CyberCapture's, so some malware files can slip through the detection procedure and infect the computer. Yet the telemetry about suspicious system events continues and is analyzed by the local AI models and the extended cloud AI models. When signs of infection are recognized, all the data is reanalyzed in the cloud. If some files are recognized as malicious, malware signatures are created and the files are quarantined (the average time to create postinfection signatures on the fly is not known). Defender AI can create both before-execution and postinfection signatures on the fly.
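A toy Python model contrasting the two flows described above; the function names, verdict strings, and the reduction of each cloud analysis to a single pre-computed verdict are my own simplification, not Avast or Microsoft APIs:

```python
# Toy contrast of the two flows described above (illustrative only).

def avast_cybercapture(deep_cloud_verdict: str) -> str:
    """File stays locked until the deep cloud analysis finishes (up to ~2 hours)."""
    if deep_cloud_verdict == "malicious":
        return "quarantined, never executed"
    return "unlocked and executed after the analysis"

def defender_cloud(quick_cloud_verdict: str, postinfection_verdict: str) -> str:
    """Quick 10-60 s cloud check; postinfection telemetry may catch what slips through."""
    if quick_cloud_verdict == "malicious":
        return "quarantined before execution"
    # The file executes; local and cloud AI keep analyzing telemetry afterwards.
    if postinfection_verdict == "malicious":
        return "executed, then quarantined via a postinfection signature"
    return "executed and treated as clean"

print(avast_cybercapture("malicious"))        # long lock, but no infection
print(defender_cloud("clean", "malicious"))   # brief infection window, then cleanup
```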
.
Final words.
I should underline that the above reasoning relates to home user protection. Because of targeted attacks and fast malware propagation in big local networks, postinfection signatures are not so useful for companies/enterprises.
The second point is that I am not accusing anyone of publishing wrong detection test results. I accept that the detection results are correct within the boundaries of the applied testing procedures.
One should remember, anyway, that detection results for some AVs cannot easily be translated into protection results, especially when malware signatures can also be created on the fly by AI (in some minutes) on the basis of postinfection telemetry (Defender, BitDefender, etc.).
.
YouTube tests.
Homemade tests are not very useful, because they have many cons:
- The test may be done by an AV fanboy, so only tests favorable to a particular AV may be published.
- The malware samples are not real-world. They are taken from malware collection sites, and many of the samples have no chance of infecting any computer.
- The number and diversity of samples are usually too small to be statistically representative.
- Often only one AV is tested, so it is impossible to compare the results with other AVs (on the same pool of samples).
- If more than one AV is tested, they are not tested at the same time; a difference of some hours can have a substantial influence on the results.
- Sometimes several AVs are tested one after another without refreshing the system, so in fact they are tested on different system states.
MalwareTips tests.
I like them because they show how a concrete AV can fight a concrete malware file. All samples have analyses available on hybrid-analysis.com and virustotal.com. From those tests, one can see the strong and weak points of AV protection. For example, MalwareTips tests for Avast (from the last few months) show that it actually has very strong 0-day protection for EXE files (the CyberCapture module), but not so strong protection against scripts and scriptlets.
They also show that most malicious documents use embedded or external scripts (scriptlets) to download and execute payloads. The script trojan-downloaders are often highly obfuscated, but after de-obfuscation it is evident that in many cases they differ only in the links to malware sites. Those are only some examples; the research potential is great.
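As a small illustration of that last observation, one could mask out the URLs in the de-obfuscated downloader scripts and compare what remains; the regex and helpers below are only a hypothetical sketch of such an analysis:

```python
import re

# Sketch: compare de-obfuscated downloader scripts after masking out URLs.
# If the masked bodies are identical, the samples differ only in their payload links.
URL_RE = re.compile(r"https?://\S+")

def mask_urls(script_text: str) -> str:
    return URL_RE.sub("<URL>", script_text)

def same_except_links(script_a: str, script_b: str) -> bool:
    return mask_urls(script_a) == mask_urls(script_b)

# Hypothetical, harmless stand-ins for two de-obfuscated downloaders.
a = 'GET("http://badsite1.example/payload.bin"); RUN();'
b = 'GET("http://badsite2.example/payload.bin"); RUN();'
print(same_except_links(a, b))   # True: only the malware link differs
```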
So, those tests can be much more useful than tests focused only on AV scoring results.
.
Please post your thoughts related to AV tests here.