Testing real-time protection of antiviruses with 10,124 samples

Firstly, I won't pay attention to those who reply with ChatGPT answers, like @Divergent, and we obviously know that nobody gets infected with 10,000 viruses. Are some of you really serious? This test shows which AV has the best rate by the numbers. Secondly, we will certainly take some of your recommendations into account. In the next test we will wait 10 seconds per sample to verify whether it was blocked or executed. But I want to say that this test can still show something; it is not invalid. Running samples from a tool doesn't change the test results (at least not significantly; otherwise test tools like MALX would be totally wrong). We will keep developing our tool. @Trident is right to some extent: malware doesn't always come from a suspicious source, and this test is like a Malware Protection Test.



My Proposition
The first possibility is conducting the test as in the opening post, but changing the winning criteria. The winner is the AV that is not bypassed by any sample. However, this will require more effort to check whether the system is clean.
  1. Use Norton Power Eraser and maybe some other similar tools.
  2. Use Autoruns to confirm that popular persistence methods were not applied.
  3. Use a tool for checking DLL injections.
  4. Use a tool to identify C2 connections.
  5. Add some info about the purpose of the test and the methodology used.
Thanks, we will take this into account too.
 
And we can't run the samples one by one, because that is impractical for several reasons:
1- It's very difficult even with 100-200 samples. Imagine 1,000-2,000.
2- It's slow, which may give cloud-based systems like KSN time to collect information and may change the test results.
3- We think 100-200 samples may not be enough to provide accurate test results.
 
But I want to say that this test can still show something; it is not invalid. Running samples from a tool doesn't change the test results (at least not significantly; otherwise test tools like MALX would be totally wrong).
Your test is not invalid. A security solution is deployed (and often purchased) to block malware. Whether the malware is executed through tools, through rundll, through Microsoft Office with a malicious macro, or through Explorer, it should be blocked.

Your test holds some value because you compared a few products under the same circumstances.

And we can't run the samples one by one, because that is impractical for several reasons:
1- It's very difficult even with 100-200 samples. Imagine 1,000-2,000.
2- It's slow, which may give cloud-based systems like KSN time to collect information and may change the test results.
3- We think 100-200 samples may not be enough to provide accurate test results.
Every test can provide information. The problem with malware packs is that they contain loads of PUPs, and you are scanning with software that takes a VERY aggressive stance on PUPs; in fact, MalwareBytes has had problems with software developers more than once or twice. This gives people room to wiggle and criticise your test. I would suggest that you take care to eliminate PUPs, snake oil and all that. Maybe when you see an installer (PUP), close it and do not install it.

A test with a small set of samples can still show something. It's like checking drinking water: you test a small sample, you don't check every drop. That's the same logic, as long as you choose the right samples.

And last but not least, consider a hybrid test, real world + malware protection, this will have better value and again, won’t allow people to wiggle and doubt your results.
 
If you ask why 10 seconds: waiting too long between samples would unnecessarily prolong the test, and cloud systems like KSN could take advantage of the delay.
 
My replies are accurate assessments of real-world experience. Attempts to deflect from this do not change the underlying facts.
 
I'm just a noob, far be it from me to play the arbiter of truth at MWT.com. But I am a huge fan of logic, even though I lack the skills on display at this forum.

OK, I'll concede to your knowledge: "This isn't the proper way to test"... BUT

If I were testing using the horde method, and I tested 5 well-known, respected AVs and threw 50k bad actors at them all, and 2 survived, that would still seem to me to say something about the 2 that made it through the test.

Then if those same 2 could pass a proper test with flying colors, that would add even more evidence to the results.

OR am I missing this by a long shot?
Finally a human who can logically think.
 
Could you please share the sources that contain your information with us?
For a practical education in cybersecurity, start with industry resources like the SANS Institute to learn about real-world threats and their impact. Then, analyze security product architecture and responses, and finish by deconstructing malware to understand its triggers and infection methods.
 
We will wait 10 seconds per sample to verify it's blocked/executed in next test,
Because of the way some AVs work, 10 seconds between malware sample launches is too short. Some AVs require up to 3 minutes, sometimes longer, to react.

Your limitation is the AV response time.

Y'all have to figure it out for yourselves. Use 10 seconds as the time threshold and there are those that will state your testing is still invalid.

Don't use 10,000 samples. Use fewer than 100 and an adequate time threshold between malware sample executions.
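
To put rough numbers on the trade-off between sample count and wait time, here is a minimal back-of-the-envelope sketch. The function name is hypothetical, and it assumes samples run strictly one after another with a fixed wait, ignoring VM restores, reboots and any other overhead.

```python
def total_test_hours(sample_count: int, wait_seconds: float) -> float:
    """Hours needed to run samples one after another with a fixed wait after each.

    Assumption: the wait is the only cost (no snapshot restores, no reboots).
    """
    return sample_count * wait_seconds / 3600


if __name__ == "__main__":
    # 10 s is the tester's planned wait; 180 s is the "up to 3 minutes" figure.
    for count, wait in [(10_000, 10), (10_000, 180), (100, 180)]:
        hours = total_test_hours(count, wait)
        print(f"{count} samples x {wait} s = {hours:.1f} h")
```

Under these assumptions, 10,000 samples at 3 minutes each come to about 500 hours per AV, while 100 samples at the same wait come to 5 hours.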
 
Use 10 seconds as the time threshold and there are those that will state your testing is still invalid.
Indeed. In Panda you can choose the response time during which the file is locked; it starts at 10 seconds. If the file is uploaded and analyzed online, it can take a bit longer, but that is more precise than just a signature check.
 
3-We think 100-200 sample may not be enough to provide accurate test results.

That depends on the differences in scores between tested AVs.

When testing 10000 samples of 1-4 week old malware, all AVs that failed on 0-8 samples are in the winning cluster (from AV-Comparatives tests). But those tests do not depend on the order of executed samples, so in your test, the winning cluster can be much wider. In your test, it is impossible to conclude how many of the total 10000 samples bypassed AVs, so we cannot conclude if the results are significant or pretty much random. The situation could improve if you made 10 such tests on the same group of AVs, and in most tests, one AV was a winner.

If you test 100 fresh samples (less than one day old), the AVs that failed on 0-2 samples are usually in the winning cluster. It is easier to identify the awarded AVs. The test is doable even with manual execution, because there are usually about 10 samples to execute after scanning.
However, the problem is finding fresh samples in a short time and quickly conducting the test.
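
One rough way to see why AVs with slightly different miss counts can land in the same winning cluster is to compare confidence intervals for their miss rates. The sketch below uses the Wilson score interval; it is only an illustration and not the clustering method AV-Comparatives uses. It also assumes each sample is an independent trial, which does not hold when results depend on the order of execution. The AV labels and counts are made up.

```python
import math


def wilson_interval(misses: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for the true miss rate."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = misses / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def intervals_overlap(a, b) -> bool:
    """True when two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


if __name__ == "__main__":
    # Hypothetical results: misses out of 100 fresh samples.
    for label, misses in [("AV A", 0), ("AV B", 2), ("AV C", 30)]:
        low, high = wilson_interval(misses, 100)
        print(f"{label}: {misses}/100 missed, miss rate roughly {low:.3f} to {high:.3f}")
```

With these made-up numbers, the intervals for 0/100 and 2/100 overlap, so the two products cannot be told apart on this evidence, while 30/100 is clearly separate.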
 
3-We think 100-200 sample may not be enough to provide accurate test results.
Use 100 to 200 samples with 15 or fewer AV engine detections. That's way more valuable to your test thread readers than testing against 10,000 samples, the majority of which are old and will be detected. Using any malware older than 3 days adds little, if any, value to your testing.

If all the AVs you test already detect the 3 day old samples on Virus Total or elsewhere, then what benefit is provided to the reader of your test thread other than to confirm that the AV continues to detect the malware 3 or more days later?
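
The selection rule suggested here (few engine detections, no older than about 3 days) could be sketched as a simple filter. This is only an illustration: the `Sample` fields and the function name are hypothetical, and the metadata is assumed to have been collected by the tester beforehand; no real service or API is called.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Sample:
    sha256: str           # hypothetical identifier field
    detections: int       # number of AV engines already flagging the sample
    first_seen: datetime  # when the sample was first observed


def select_fresh_samples(samples, now, max_detections=15, max_age=timedelta(days=3)):
    """Keep samples with few engine detections that are no older than max_age."""
    return [
        s for s in samples
        if s.detections <= max_detections and now - s.first_seen <= max_age
    ]
```

The thresholds default to the figures from the post above, but they are parameters so a tester can tighten them, for example to samples less than one day old.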

If your objective is to only test conveniently, to your comfort level, then it does not provide much value. And that is not a personal insult. It is applicable to even professional testers and certified AV test labs that charge pimped-out top model Tesla prices for each of their tests.

The point is that volume of detection is not a very good indicator of AV quality unless you test ALL AV solutions on the market and make a % detection comparison across them all for each test that you perform.

VirusSign malware packs are notorious for old malware samples.

It takes effort to get AV testing right. For one, samples need to be selected manually, and that takes a lot of time and effort.

Whether you agree or not is entirely your prerogative, but I'm giving you information that will separate your tests from the crowd - including AV test labs, YouTube testers, and amateur MT testers.

These are suggestions from a person "in the know."
 
Don't use 10,000 samples. Use less than 100 and an adequate time threshold between malware sample executions.
Many samples can be tested, but it will require a lot of time and potentially more than one person, and the test will have to be spread over a certain period of time. For example, 10-15 samples of different formats and vectors every day.

That’s how I like it done
Thread 'McAfee Protection (Plus Plans, Total Protection, LiveSafe)'
App Review - McAfee Protection (Plus Plans, Total Protection, LiveSafe)
 
Good, meaningful testing is not easy. Ask @Adrian Ścibor . Designing the test, adjusting it during implementation and execution, having to scrap a test and start all over again... - oh, how testers who do their utmost to get the testing right suffer.

Downloading VirusSign malware packs for convenience is not it. All those samples are old, even if there's a small percentage that remain FUD.

Which test is more valuable?

Downloading and test executing 10,000 or 1,000,000 old malware samples?

I argue both are the same and provide limited value, overall.

PS - I would also argue that your approach is the best for testing. It provides the greatest insights, and therefore the greatest value.

PS - Download and test malware that is old enough that AV vendors have purged the signatures for it (or its entire class, for multiple legitimate reasons - the least of which is to reduce signature database size) and it will be FUD. Then testers and readers are outraged and ask "How can this malware sample that is 16 years old not be detected or blocked by modern AV?" Such is the level of knowledge and sophistication of those that test and the readership.
 
I completely agree. VirusSign packs often include file infectors that were prevalent 15 years ago, like Sality and so on. It is more valuable when a broad range of samples is checked and studied.
 
Finally a human who can logically think.
If you want to be like this:

  • Want to win the discussion no matter what.
  • Can't accept other forum members' thoughts and replies.
  • Accuse people of using ChatGPT.
  • Can't stand that other forum members know way more about testing than you.
Then please stay away and stop answering!
You have polluted your own topic.
 
Where can I download proper samples? I only know VirusSign. Did they release samples this month?
This forum (I think it still does) has a dedicated Malware Analysis section, but it takes some doing to become integrated into that group. They are SERIOUS foulware hounds. As for online sources, when I was doing my FUN leeches of that gunk for local testing, MalwareBazaar was one of mine. You can Google and eventually nail a site or two, or use a dedicated machine to dragnet those suckers for testing purposes. The crap is vicious!