Advice Request I am head of research at Emsisoft. Ask me anything! :)

Status
Not open for further replies.

Fabian Wosar

From Emsisoft
Thread author
Verified
Developer
Well-known
Jun 29, 2014
260
"AI" Next Gen AVs...Hype or real?
There is no AI AV out there. All of them use human-assisted machine learning, which by its very definition is not AI. Machine learning has been in use by pretty much every single major AV (what they call "legacy") in one form or another for literally over a decade. The oldest and most prominent one being systems that automatically recognise malware families, clustering samples and extracting appropriate signatures from them.

For example, we use machine learning extensively inside the anti-malware network. Our signature generation tools also automatically suggest certain functions and code fragments that would make good signatures.

So all those fancy machine learning based technologies are available in one way or another to most traditional anti-viruses as well. Except that they, in addition, also have all those other technologies at their disposal: emulation, behaviour monitoring, signature scanning, reputation-based anomaly detection, etc. You know, the ones that traditional AVs always had, making them, in general, more flexible.
 

Scorpion Illuminati

Level 2
Verified
Apr 14, 2017
73
PHP is just horrible. It doesn't scale particularly well compared to DotNet Core or even Python for example. Go is really interesting and definitely a good pick. Especially if you want to go into orchestration and devops. :)
Does your dev team use Git? I know the basic commands (add, commit, push, branch, status, etc.) and want to know if that will help me.
 

Fabian Wosar

Talking about signature tools. I almost forgot. This is, for example, one of the tools we developed internally. It's called "Signature Maker". It's a clever name, I know. It's kind of like an IDE, except for creating detection signatures for our scan engine:

[Attachment 210180]


In general, signatures for the Emsisoft engine are essentially functions that are called by the scan engine depending on certain filter flags, like the file type for example. The signature flags you can see on the right are pretty much functions that perform certain tests. We can match signatures against certain version information fields, for example, or against specific PE header fields, such as imported or exported APIs. But there is also more advanced information. Programming languages like .NET or Delphi, for example, leave a bunch of meta information behind that our scan engine is capable of parsing and using as flags and information to feed into the actual detection functions (which is what signatures for our engine actually are).
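To make the "signatures are functions selected by filter flags" idea concrete, here is a minimal sketch in Python. It is not the Emsisoft engine or its domain-specific language; `FileInfo`, the `signature` decorator, `scan` and the detection name `Example.Trojan.A` are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class FileInfo:
    """Hypothetical container for metadata a scan engine has already parsed."""
    file_type: str
    imports: set = field(default_factory=set)
    version_info: dict = field(default_factory=dict)

# Registry: filter flag (here just the file type) -> list of (name, function).
SIGNATURES = {}

def signature(name, file_type):
    """Register a detection function under a file-type filter flag."""
    def register(func):
        SIGNATURES.setdefault(file_type, []).append((name, func))
        return func
    return register

@signature("Example.Trojan.A", file_type="pe")
def example_trojan_a(info):
    # Made-up rule: imports a process-injection API and has an empty company name.
    return ("WriteProcessMemory" in info.imports
            and info.version_info.get("CompanyName") == "")

def scan(info):
    """Run only the signatures whose filter flag matches the file."""
    candidates = SIGNATURES.get(info.file_type, [])
    return [name for name, func in candidates if func(info)]

sample = FileInfo("pe", {"WriteProcessMemory"}, {"CompanyName": ""})
detections = scan(sample)
```

The filter step matters for speed: a PDF never has to run through signatures that were written for PE files.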

Fields can be matched using a variety of methods. The most obvious one is literal matching, so checking whether the value in the file being scanned is exactly equal to a given value. But it's also possible to use wildcards or regular expressions to create more complex strings to match against. This applies to binary strings as well, by the way.
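The three matching methods can be sketched with Python's standard library. This is only an illustration of the idea, not the engine's actual matcher; the function name `match_field` and its `mode` parameter are assumptions.

```python
import fnmatch
import re

def match_field(value, pattern, mode="literal"):
    """Return True if a field value matches the pattern under the given mode.

    Works for text (str) and binary (bytes) fields, as long as value and
    pattern are of the same type.
    """
    if mode == "literal":
        return value == pattern
    if mode == "wildcard":
        # Shell-style wildcards: * matches any run, ? matches one character.
        return fnmatch.fnmatchcase(value, pattern)
    if mode == "regex":
        return re.search(pattern, value) is not None
    raise ValueError(f"unknown mode: {mode}")

literal_hit = match_field("Microsoft Corporation", "Microsoft Corporation")
wildcard_hit = match_field("setup_v2.exe", "setup_v?.exe", mode="wildcard")
binary_hit = match_field(b"MZ\x90\x00", rb"^MZ", mode="regex")
```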

One way we apply machine learning, for example, is by automatically suggesting to our analysts flags and fields that are high-quality candidates for an actual signature, depending on which samples they are currently working on. You can see those red pins in front of some of the signature flags, which indicate attributes that are anomalies and are therefore likely to make a good signature.

But we aren't limited to just these flags. Signatures can also be made up of, or contain, more complex patterns:

[Attachment 210182]


You can simply highlight the areas of the file that should be used for detection and specify how to locate each area, for example whether it should be relative to certain points of interest. Patterns can have ranges, so even if they move around in the file, they can still be found. Obviously, doing this by hand is a bit tedious, so you can also, once again using machine learning techniques, let the tool figure out good candidates for you:
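A rough sketch of what "a pattern with a range, relative to a point of interest" could look like in code. The representation (a list of byte values with `None` as a wildcard) and the names `find_pattern`, `anchor` and `max_distance` are assumptions for illustration, not the real signature format.

```python
def find_pattern(data, pattern, anchor=0, max_distance=None):
    """Find a byte pattern that may float within a range after an anchor.

    pattern      -- list of ints (exact bytes) or None (wildcard byte)
    anchor       -- offset of the point of interest (e.g. an entry point)
    max_distance -- how far past the anchor the pattern may start;
                    None means "anywhere up to the end of the data"
    Returns the offset of the first match, or -1 if there is none.
    """
    last_start = len(data) - len(pattern)
    if max_distance is not None:
        last_start = min(last_start, anchor + max_distance)
    for start in range(anchor, last_start + 1):
        if all(p is None or data[start + i] == p
               for i, p in enumerate(pattern)):
            return start
    return -1

# A typical x86 function prologue; the last byte is left as a wildcard.
data = b"\x00\x00\x55\x8b\xec\x83\xec\x10\x00"
prologue = [0x55, 0x8B, 0xEC, 0x83, 0xEC, None]
offset = find_pattern(data, prologue, anchor=0, max_distance=4)
```

Here the pattern starts two bytes after the anchor, which is inside the allowed range, so it is found. With `max_distance=1` the same search would report no match.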

[Attachment 210181]


This one, for example, parses all the functions inside the code of the file and extracts the code blocks and fragments that are most unique and don't appear in other good files. But it also works for normal strings:
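As a very simplified illustration of "fragments that appear in the malicious samples but not in good files", the sketch below uses plain byte n-grams and set operations. It stands in for the real tool, which works on parsed functions and code blocks and uses machine learning; the names `ngrams` and `candidate_fragments` are made up for this example.

```python
def ngrams(data, n):
    """All distinct n-byte fragments of a byte string."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}

def candidate_fragments(malicious, clean, n=4):
    """Fragments shared by every malicious sample and absent from all clean files."""
    if not malicious:
        return set()
    common = set.intersection(*(ngrams(sample, n) for sample in malicious))
    for good in clean:
        common -= ngrams(good, n)
    return common

malicious = [b"AAAAEVILBBBB", b"CCEVILDD"]
clean = [b"AAAABBBB", b"CCDD"]
candidates = candidate_fragments(malicious, clean, n=4)
```

In this toy data set the only fragment that survives is `b"EVIL"`: it is in both malicious samples and in neither clean file.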

[Attachment 210183]


At the very end of all of this, whether you decided to create the signature manually or let all the machine learning stuff help you, you end up with a small function in our own domain-specific programming language that is used by our scan engine:

[Attachment 210184]


This function will then be compiled into native machine code. The code of hundreds of thousands of these signatures is then combined into signature files that are being shipped to our users.

This is just a very small portion of what Signature Maker can do, but it outlines roughly how we would go about adding detection of a new malicious file. Ultimately there are a whole bunch of additional features, especially for clustering vast amounts of samples to find all the samples that are related to each other for example, so we can extract a single signature that matches all of them (often tens of thousands of variations).

It also shows something that I don't think a lot of people realise: for a lot of AVs, there is no difference between the engine and the signatures. In many cases, the "engine" is just a loader or a virtual machine that loads and executes the actual logic and functionality that is part of the signature files. I only showed you a very small part of what we can do, but in general, it can get a lot crazier: "signatures", which are really just normal code running on your system, can end up being entire algorithms, perform complex operations (for unpacking, for example), and interact with the entire Windows API.

I hope that little excursion was interesting. :)
 

Fabian Wosar

Does your dev team use Git? I know the basic commands (add, commit, push, branch, status, etc.) and want to know if that will help me.
Yes. We do use Git. Some very old projects also use Subversion. But honestly, there is absolutely no point in learning any other version control system than Git.
 

Nightwalker

Level 24
Verified
Honorary Member
Top Poster
Content Creator
Well-known
May 26, 2014
1,339
Great post. I wrote something similar in a discussion on the Wilders Security forum. For me, the marketing of "Next Gen AV" is an insult to the real malware analysts/developers/security specialists out there.

How effective are signatureless AVs like Panda Dome?
 

drakester

Level 1
May 14, 2017
11
Impressive amount of insight on signatures and how they are built, thank you. A lot of other vendors wouldn't disclose as much.
Emsisoft support is absolutely great, another kudos to them.

No questions from me, just some props, thanks for doing this and being so close to users and potential users.
 

bjm_

Level 14
Verified
Top Poster
Well-known
May 17, 2015
669
Don't get me wrong, I understand that companies that do elaborate tests need some way to pay the bills as well. They have employees who need to get paid, for example. But I think different price tiers that buy you more frequent testing or the ability to withhold test results (Matousec used to do that) or the ability to buy the performance data of the other participating products, for example, go a little bit too far.
Malwarebytes Labs, November 27, 2018 article Why Malwarebytes decided to participate in AV testing.
We still do not believe in the “pay-to-play” model, and especially the “pay-to-see-what-you-missed” model that some organizations use. (AV companies, for an additional fee, can see the samples they did not catch in the test and develop fixes in the product for future tests/use.) Nonetheless, we want to give our customers some idea of what we are capable of, even when the playing field is skewed.
 

oldschool

Level 82
Verified
Top Poster
Well-known
Mar 29, 2018
7,107
Thanks for a great post!

I agree wholeheartedly and that's one of the things I like most about Emsisoft. IMHO (and I am not just saying this because you are here), I honestly believe that Emsisoft has probably one of the best, if not THE best customer service available out of all the security companies around. (y)

I hope that little excursion was interesting. :)

I'm a fairly new student of Windows and the world of security softs. I don't pay a yearly subscription for anything currently, but have purchased a few programs. This is to say if I were to buy a yearly/multi-year subscription for an AV, it would be yours based on the above. Wow, a company with some ethics, what a delight! (y)
 

ForgottenSeer 72227

Another fantastic post @Fabian Wosar!

I really appreciate (and I am sure many people here do as well) you taking the time to do this. It's really interesting to see what makes Emsisoft tick and it just goes to show what a great company Emsisoft is truly. While I know you can't tell us absolutely everything, it's just great to see little bits of what goes behind the scenes. As I've said previously I am very eager and excited to see the new upcoming changes/improvements and how they will work.

I know in a previous post you mentioned that with the upcoming changes you will be able to provide signatures in real time. Does this mean that you will have some form of machine learning alongside people creating the sigs? Also, am I safe to assume that if an Emsisoft user comes in contact with a new piece of malware, all other Emsisoft users will be protected as well, due to the fact that the signature was created in real time, similar to what some of your competitors are doing?
 

goodjohnjr

Level 5
Verified
Jul 11, 2018
230
We may. The problem is, that ultimately with these free AVs you as a user pay with your data. That's generally speaking something we don't feel very comfortable with. Especially given that not a lot of people are even aware of it.

Recently I was kind of surprised to see that an otherwise super privacy conscious user had Traffic Light installed for example. It doesn't seem to be common knowledge that Traffic Light and a bunch of other browser extensions (Comodo Online Security Pro, Norton Safe Web, Avira Browser Safety, Avast Online Security being the biggest ones) like it will literally send every single URL you visit in clear text off to the vendor's server. The privacy policies aren't always clear and kinda sketchy at times. I am sure that some people don't mind. But I am also sure that a lot of people do mind, but simply don't know.

Hello Fabian Wosar,

Thank you for this Q & A, and for informing us that some security browser extensions are sending the URLs we visit over clear text instead of using something like SSL or whatever because I did not realize that even some (most) of those security extensions by major companies were doing that.

I am currently using the Emsisoft Browser Security and Windows Defender Browser Protection (WDBP) extensions, and I was wondering if the WDBP and Malwarebytes extensions are guilty of sending the URLs unencrypted as well and can you name any other extensions that you know of that are guilty of this?

That will really help me / some of us know which extensions to avoid for those of us worried about privacy / security.

Thank you,
-John Jr
 

Vasudev

Level 33
Verified
Nov 8, 2014
2,230
We may. The problem is, that ultimately with these free AVs you as a user pay with your data. That's generally speaking something we don't feel very comfortable with. Especially given that not a lot of people are even aware of it.

Recently I was kind of surprised to see that an otherwise super privacy conscious user had Traffic Light installed for example. It doesn't seem to be common knowledge that Traffic Light and a bunch of other browser extensions (Comodo Online Security Pro, Norton Safe Web, Avira Browser Safety, Avast Online Security being the biggest ones) like it will literally send every single URL you visit in clear text off to the vendor's server. The privacy policies aren't always clear and kinda sketchy at times. I am sure that some people don't mind. But I am also sure that a lot of people do mind, but simply don't know.
We have bigger heads that live off telemetry or advanced diagnostics data, ahem, Win 10. So a free EAM would be a small tail that exchanges a minimal amount of data for better security for the unpaid user. :)
I use BD TL and, for the most part, it isn't invasive. I tried a few others and they became highly intrusive and bloated after updates. I recently threw out BD TS, which was blocking Windows updates, driver updates and everything else until I uninstalled it.
 

ForgottenSeer 72227

We have bigger heads that live off telemetry or advanced diagnostics data, ahem, Win 10. So a free EAM would be a small tail that exchanges a minimal amount of data for better security for the unpaid user. :)

It's a fair point for sure, but personally, Emsisoft's stance on privacy is one of the many reasons why I like the company. Sure, they could go down the same route as everyone else when it comes to free AV/AM, but just because everyone else does it doesn't mean they should. It's one of the many things that make them different from the rest. Sure, it may be "priced" higher, but I am willing to pay it knowing I am going to get a great product that has excellent protection, amazing customer service and excellent privacy. Things in life shouldn't always be about getting everything for free. If you are unable to purchase the product for whatever reason, it's totally cool; there are other great options out there to choose from, and heck, if you're using W10 you technically already have a free AV/AM. I used to really like Avast, but I can no longer stand them due to the way they treat their customers and how they data mine them. I don't care how good it is protection-wise, their data mining has turned me off completely. I really would not like to see Emsisoft go down this route at all. Keep it a paid product, and if you aren't able to purchase it, wait for either a potential giveaway or a sale. :)
 

Fabian Wosar

Great post. I wrote something similar in a discussion on the Wilders Security forum. For me, the marketing of "Next Gen AV" is an insult to the real malware analysts/developers/security specialists out there.
Some of them do pretty good work. But especially once they try to tell you that all "legacy AVs" do is signatures, they are blatantly lying to your face.

No questions from me, just some props, thanks for doing this and being so close to users and potential users.
Thanks. :)

Difficult question and touchy subject here. Is a 3rd party firewall necessary? And if so, when? (Assuming Windows 10 OS).
Well, we discontinued our firewall precisely because we don't see much benefit compared to the firewall in Windows 7 even. The biggest issue with the Windows firewall is tamper protection. Meaning: Everything running on your system can create rules and allow itself. EAM actually blocks that. So only applications you allow can interact with the Windows firewall.

Malwarebytes Labs, November 27, 2018 article Why Malwarebytes decided to participate in AV testing.
It's funny because for 2019 we decided to drop out of AV Comparatives.

What's wrong with subversion?
Their slogan used to be: "CVS done right." There is no way you can do CVS right, which is why it was doomed to failure from the get go.

I'm a fairly new student of Windows and the world of security softs. I don't pay a yearly subscription for anything currently, but have purchased a few programs. This is to say if I were to buy a yearly/multi-year subscription for an AV, it would be yours based on the above. Wow, a company with some ethics, what a delight! (y)
Thanks. :)

I used to use Emsisoft. I've always enjoyed reading Fabian's posts.
Honest question: Why did you stop? :)

I wish this Q&A would never end! Do you have any old projects from when you started coding that you are willing to share (C64, ZX Spectrum, DOS, Win 3.1, etc.)?
I unfortunately no longer do. But my first stuff were small anti-virus tools that detected one specific virus and cleaned infected files. I then quickly moved to heuristic stuff, because I thought it was stupid to create new signatures and detections for every new virus. Back then there were literally only like a hundred or so of them in the first place though.

I know in a previous post you mentioned that with the upcoming changes you will be able to provide signatures in real time. Does this mean that you will have some form of machine learning alongside people creating the sigs?
As I showed in the tool we use, we already do that. There is no way we can keep up with the number of samples we get otherwise. We obtain more than 450,000 new malicious files every single day. What I showed you there was pretty much the "manual" mode.

Also am I safe to assume that if an Emsisoft user comes in contact with a new piece of malware, all other Emsisoft users will be protected as well, due to the fact that the signature was created in real-time, similar to what some of your competitors are doing?
That's already the case for the behaviour blocker. If we see a malicious file on a single system and it is being picked up by the behaviour blocker there, automatic blocks are issued for all other users using EAM already.

Thank you for this Q & A, and for informing us that some security browser extensions are sending the URLs we visit over clear text instead of using something like SSL or whatever because I did not realize that even some (most) of those security extensions by major companies were doing that.
Ah, sorry. They do use SSL. But they send the entire URL to their servers, while most browsers, or our extension for example, only send hashes and non-specific information that can't be turned back into URLs.
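To illustrate the difference between sending the full URL and sending only a hash, here is a minimal sketch. It is an assumption about how such a lookup could work, not a description of how the Emsisoft extension or any browser actually does it; the function name `lookup_token` and the choice of a truncated SHA-256 of the host name are made up for the example.

```python
import hashlib
from urllib.parse import urlsplit

def lookup_token(url, prefix_bytes=4):
    """Derive a short, non-reversible token from the host part of a URL.

    Only the host name is hashed, so the path and query string (which often
    contain personal data) never leave the machine. Truncating the digest
    means many different hosts share the same token.
    """
    host = (urlsplit(url).hostname or "").lower()
    digest = hashlib.sha256(host.encode("utf-8")).digest()
    return digest[:prefix_bytes].hex()

full_url = "https://example.com/account/settings?user=alice"
token = lookup_token(full_url)
```

The server would answer with the blocklist entries that share this token, and the final comparison would happen locally. Note that a hash of a host name alone can still be guessed from a list of known hosts, which is why the sketch truncates it.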

I am currently using the Emsisoft Browser Security and Windows Defender Browser Protection (WDBP) extensions, and I was wondering if the WDBP and Malwarebytes extensions are guilty of sending the URLs unencrypted as well and can you name any other extensions that you know of that are guilty of this?
Both are fine.

I use BD TL and for the most part, it isn't invasive. Tried a few and they become highly intrusive and bloated after updates. I recently threw out BD TS which was blocking windows updates, driver updates and everything else until I uninstalled it.
Windows doesn't get a list of all the websites you look at. Unlike Traffic Light:

[Attachment 210195]
 

KevinYu0504

Level 5
Verified
Well-known
Mar 10, 2017
227
I have a question too.
I already saw your explanation that Avira didn't want to partner with Emsi at first,
but why did Emsisoft finally choose Bitdefender as a partner? Any special reason?

I am from Taiwan, in Asia.
Bitdefender does not have any branch office or server in Asia,
so its detection rate and reaction speed always lag behind others such as Kaspersky, ESET, Norton...

Is there any chance Emsisoft will use Kaspersky's database in the future?
 

Fabian Wosar

I already saw your explanation that Avira didn't want to partner with Emsi at first,
but why did Emsisoft finally choose Bitdefender as a partner? Any special reason?
Mostly detection to false positive ratio combined with relative affordability. We also approached ESET, but they didn't have an OEM program back then and they still don't to my knowledge, as well as Kaspersky, who just stopped doing OEM deals back then.

Is there any chance Emsisoft will use Kaspersky's database in the future?
Never say never, but it is unlikely.
 

ForgottenSeer 72227

As I showed in the tool we use, we already do that. There is no way we can keep up with the number of samples we get otherwise. We obtain more than 450,000 new malicious files every single day. What I showed you there was pretty much the "manual" mode.


That's already the case for the behaviour blocker. If we see a malicious file on a single system and it is being picked up by the behaviour blocker there, automatic blocks are issued for all other users using EAM already.

Thanks for the info and that's great to hear!

Thanks for taking the time to make it clear. Sorry if I am making you repeat yourself. I am still learning, so sometimes understanding how things are done does take a few goes at it. ;)
 