Make your video test requests!

@Shadowra tested v6 last year. Based on my experience with it, I highly doubt it would do any better now.
Agree, maybe it's not worth the time then.
 
Hey @Shadowra, SiriusLLM 0.57 public is almost ready; I am going to release it sometime today. Can you please test 10-20 (or so) super tricky malware samples and, if possible, some super tricky false positive samples to see how well we do with false positives? This will be a great starting benchmark. Please use Model 1. Model 2 is great, but it tends to overthink things a little; we are going to combine them into one model soon. Also, if possible, can you test the same samples against a couple of other deep learning malware engines? The ones I know of do not use LLMs, but they do use deep learning algos, so it is a pretty fair comparison. Actually, the only other LLM malware analysis engines I have found are BD, which does some LLM analysis on the backend, and a French company that is working on LLM malware analysis but has not released a public version. Thank you!
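For tallying a benchmark like this, a minimal scoring sketch could look like the following. This is purely illustrative (the function and labels are my own, not part of any product): it just computes detection rate over the malware set and false positive rate over the clean set.

```python
def benchmark(results):
    """results: list of (label, verdict) pairs, each "malware" or "clean".

    Returns (detection_rate, false_positive_rate)."""
    malware_verdicts = [v for label, v in results if label == "malware"]
    clean_verdicts = [v for label, v in results if label == "clean"]
    # Detection rate: fraction of true malware flagged as malware.
    detection_rate = sum(v == "malware" for v in malware_verdicts) / len(malware_verdicts)
    # False positive rate: fraction of clean samples wrongly flagged.
    false_positive_rate = sum(v == "malware" for v in clean_verdicts) / len(clean_verdicts)
    return detection_rate, false_positive_rate
```

With 10 malware samples (9 caught) and 10 clean samples (1 flagged), this would report a 0.9 detection rate and a 0.1 false positive rate.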
 

Does Bitdefender use this method?
It could be an interesting comparison, I've never seen it :)
 
@Shadowra I know there have been a couple of posts asking to re-test F-Secure, but should we wait for another build or two (25.7, or even up to v26?) before we consider testing it again? I'm not sure at what point the bug fixes and improvements will actually enhance security beyond the Chromium browser, banking protection, and the VPN improvements. I'm still waiting to see a release note (or beta note) that says "improved Behavior Detection". But I could absolutely be wrong; given the fixes and improvements since you tested v25.2, it could actually have improved.

I'm not trying to overrule or be heavy-handed about what others have asked for, so the if and when is up to you :)

 
I did tons of research, and supposedly they are using LLM malware analysis on their backend, but the source was not completely reliable. My guess is that they are experimenting with the tech on the backend and using the LLM verdict as a tiebreaker. The old binary classification that VoodooAi and everyone else uses is still relevant today for the most part, as are the old deep learning models. I am extremely curious why everyone seems to be holding off on LLM malware analysis. It took a little tinkering, but once I got it right, it was right.

There is another company that is trying to do LLM analysis locally on the endpoint, whether it has a GPU or not. I will be super impressed if they can get that to work, mainly because they will be super limited in which models they can use, so they simply will not have enough parameters in their models to achieve adequate efficacy. But if they can pull it off, that would be the way to do it: you would not have to send the samples to the supercomputers in the datacenters, like we have to do, and there would be no compute costs. I do not think it is possible, but I will play around with it some more to see if we can get it to work. For now, I am quite happy with our config.

There are also a few academic papers on LLM malware analysis, and the ones I read pretty much said that the tech is not quite advanced enough yet, but that it will be soon. Then again, those papers were 6-12 months old, so we might have hit the timing right on the head (accidentally, of course ;)).
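That "LLM verdict as a tiebreaker" idea can be sketched in a few lines. This is only my illustration of the concept, not how any vendor actually wires it; the thresholds and names are invented for the example. A classical classifier score decides the clear cases, and the LLM is only consulted in the gray zone:

```python
def final_verdict(classifier_score, llm_verdict, low=0.3, high=0.7):
    """classifier_score: 0.0 (clearly benign) .. 1.0 (clearly malicious).
    llm_verdict: "malicious" or "benign", consulted only as a tiebreaker."""
    if classifier_score >= high:
        return "malicious"   # classical model is confident enough on its own
    if classifier_score <= low:
        return "benign"      # ditto for the benign side
    # Gray zone: defer to the (slower, costlier) LLM analysis.
    return llm_verdict
```

The appeal of this design is cost control: the expensive LLM call only runs for the small fraction of samples where the cheap classifier is uncertain.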
 


F-Secure hasn't made too many changes yet, so it's best to wait a little longer :)


The only ones I know of that could be comparable to SiriusLLM would be DeepInstinct or SentinelOne; that could be interesting.
I'm planning to do that as soon as possible ;)
 



Yes, that would be a really cool comparison ;).
 
New antivirus software - crAntivirus
This looks interesting, as it's based on Antiy, which is only available in Chinese, but this gives you the option to install it in English. However, I've tried three beta versions, including the most recent one, in the last few weeks on two different computers, and when I open it, I only see icons and no text, even after changing the language to Chinese. Also, when I get alerts, they are in Chinese. I just uninstalled and reinstalled it; this time I selected Chinese as the language during installation, rather than changing it in the settings afterwards, and still the UI has no text. It's a shame, as I really wanted to try this. One thing to note: according to the Chinese security forum Kafan, at the moment it doesn't support Windows 11.

Antiy, which I have tested, is good at detecting older malware, but I have no idea how it fares against recent malware.
 
Here’s a concise breakdown of challenging samples for testing @danb's LLM, designed to highlight key areas of detection strength and weakness:

Tricky Malware Samples:

• Evasive Code: Malware using polymorphic/metamorphic code, heavy packing/encryption, or anti-analysis techniques (anti-VM, anti-debugging). The core malicious logic is hidden or constantly changing.

• "Living Off The Land" (LotL): Malicious use of legitimate system tools like PowerShell, WMIC, or Certutil. The tools themselves are benign; the danger lies in their contextual misuse.

• Complex Obfuscation: Code with convoluted control flow, junk instructions, hidden strings, or dynamic API calls. This makes raw code analysis difficult.

Tricky False Positive Samples:

• Benign Tools Mimicking Malware: Legitimate admin software (e.g., remote access tools, network scanners, pen-testing utilities) that perform actions similar to malicious activity (e.g., network connections, registry changes, process injection).

• Legitimate Installers/Updaters: Software that legitimately modifies system files, creates services, or downloads components, resembling malware installation.

• Obfuscated Benign Code: Lawful applications or scripts that use packing, compression, or intentional obfuscation (for IP protection) which might trigger generic suspicious patterns.

• User Scripts: Personal automation scripts that perform unusual system interactions (e.g., mass file operations, non-standard downloads) but are entirely benign in intent.
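The packing/encryption point above has a classic (and admittedly crude) static heuristic: packed or encrypted sections tend to have near-maximal byte entropy. A minimal sketch of that check, not tied to any particular engine:

```python
import math
from collections import Counter

def shannon_entropy(data):
    """Shannon entropy of a byte string, in bits per byte.

    Plain text usually lands around 4-5; packed/encrypted
    data approaches the 8.0 maximum."""
    if not data:
        return 0.0
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

It is a weak signal on its own (legitimate compressed resources also score high, which is exactly the false-positive trap the list below describes), but it is a cheap first-pass filter for deciding which samples deserve deeper analysis.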

Focusing on these areas will provide a robust benchmark for SiriusLLM's ability to discern true malicious intent from complex or ambiguous code.
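As a toy illustration of the LotL point above: the tool name alone proves nothing, and the argument context is what matters. A naive marker count (the marker list here is my own, chosen purely for illustration) shows the idea:

```python
# Markers commonly seen in malicious use of legitimate tools
# (illustrative subset only; a real engine would use far richer context).
SUSPICIOUS_MARKERS = ("-encodedcommand", "downloadstring(", "-urlcache")

def lotl_suspicion(cmdline):
    """Count suspicious argument markers in a command line (case-insensitive)."""
    line = cmdline.lower()
    return sum(marker in line for marker in SUSPICIOUS_MARKERS)
```

A benign `powershell.exe Get-ChildItem` scores 0, while `powershell.exe -EncodedCommand ...` scores at least 1. The real difficulty, and the reason this belongs on the tricky list, is that admins legitimately use some of these flags too, so argument markers alone cannot settle the verdict.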
 
Hi, and thanks for the awesome reviews!
A question and a request: is there a difference between the paid Microsoft Defender and the free Windows Security? If there is a difference, can you run a quick test?
Thanks!

No difference in terms of detection.

The paid version just has an EDR, which is useless for home users.
 
Thank you for these amazing ideas and insights! I was thinking more along the lines of tricky as in just really good, unique samples that are difficult to render the correct verdict on. But this is an interesting list of items we need to keep in mind as well, so thank you! I have responded to each one to let you know the current state.

• Evasive Code: SiriusLLM should already perform well.

• "Living Off The Land" (LotL): This is more up to the cybersecurity software that SiriusLLM is integrated into. Although, thank you for mentioning this; you already have me thinking of some really cool ideas for the CyberLock, DefenderUI Pro, and WDAC Lockdown integrations.

• Complex Obfuscation: SiriusLLM should already perform well.

Tricky False Positive Samples: SiriusLLM already performs extremely well with false positives. I have tested the hell out of it, with both benign and malicious samples, and I am simply astonished. Especially since this is the baseline, and it is only going to get better from here.

• Benign Tools Mimicking Malware: Yeah, there is not much we can really do about this. However, someone named Andy has suggested a couple of times that we auto-block remote admin tools and let the user know that while it is a safe file, it is risky because others may have access to your system.

• Legitimate Installers/Updaters: If a legit file is tampered with, SiriusLLM will know and react accordingly. Having said that, we need to unpack files that contain other executables, scripts, etc., and analyze the contents.

• Obfuscated Benign Code: So far this has not been an issue; SiriusLLM has amazed me when it comes to this. Now, if a script is so obfuscated that it is unreadable, SiriusLLM is instructed to consider this in the verdict.

• User Scripts: SiriusLLM already performs extremely well. Try this: go to your favorite AI and ask it to create a funny .bat, .vbs, or whatever kind of script, and tell it to make the script unusually long. Then analyze it with SiriusLLM and see the result ;). It does not have to be a funny script; try any of the scripts you already have. BTW, while analyzing malware samples, there was one super long Microsoft admin script, I believe a .vbs script downloaded from MalwareBazaar. To make a long story short, SiriusLLM spotted a few lines of code that had been added to the standard Microsoft script and were malicious. It blew my mind.
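That last anecdote, spotting lines added to a standard Microsoft script, can also be approximated mechanically whenever a known-good copy of the script is available. A sketch using plain difflib (nothing SiriusLLM-specific; just surfacing added lines as candidates for closer inspection):

```python
import difflib

def added_lines(known_good, sample):
    """Return lines present in the sample script but not in the
    known-good copy: candidates for closer (e.g. LLM) inspection."""
    diff = difflib.unified_diff(
        known_good.splitlines(), sample.splitlines(), lineterm=""
    )
    # "+" marks added lines; skip the "+++" file header.
    return [line[1:] for line in diff
            if line.startswith("+") and not line.startswith("+++")]
```

This only works when you can pin down the reference version of the script, which is exactly why a semantic analysis that flags the malicious additions without a reference copy is the more impressive capability.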