Any real-time software that uses non-traditional ways to find malware?
<blockquote data-quote="WiseVector" data-source="post: 916387" data-attributes="member: 76851"><p>Sorry for the late reply.</p><p></p><p>I think the most important things in machine learning are how deeply you can parse a file, the training set you select, and the features you extract.</p><p>Algorithms and ideas are secondary.</p><p></p><p>Take PE files, for example: there are many compilers (VB, .NET, Delphi, VC), packers (UPX, VMP, ASPack) and installers (NSIS, SFX, Inno). The accuracy of an ML model depends on how deeply you parse these files. On the other hand, it is fundamentally impossible for machine learning to avoid FPs. Suppose we have two files. One calls URLDownloadToFile to download a file from Microsoft and then executes it. The other downloads malware from a malicious website and executes it. The pseudocode is:</p><p></p><p>File one:</p><p>URLDownloadToFile(hxxps://www.microsoft.com/xx.exe, good.exe)</p><p>ShellExecute(good.exe)</p><p></p><p>File two:</p><p>URLDownloadToFile(hxxps://www.xxx.com/xx.exe, good.exe)</p><p>ShellExecute(good.exe)</p><p></p><p>As you can see, the two files differ only slightly. If you parse the files deeply enough, the AI will eventually recognize that file two poses a greater threat than file one. But parsing that deeply has a bigger performance impact. That is why ML engines often have more FPs than signature-based engines.</p><p></p><p>We keep improving our ability to parse files in order to reduce FPs. WV is nearly three years old, and during this time we have received a number of FP files from users. These files are very helpful for reducing FPs. If you can parse a file very well and have a good data set, you can do anything you want: for example, identifying malware by training on legitimate files, or identifying legitimate files by training on malware.</p><p></p><p>We have come to realize that AI-based static scanning has too many limitations, so we spent a lot of time developing AI-based event analysis and AI-based memory scanning. In the end, malware has to perform its malicious behavior or decrypt its payload in memory.</p></blockquote><p></p>
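To make the two-file example above concrete, here is a minimal Python sketch of why parse depth matters. It is an illustration only, not WiseVector's engine: the call-list representation, the `TRUSTED_HOSTS` allowlist, and the feature names are all assumptions made up for this example. A shallow view that only sees API names cannot tell the files apart; a deeper view that also inspects the URL argument can.

```python
from urllib.parse import urlparse

# Hypothetical allowlist, assumed for this sketch only.
TRUSTED_HOSTS = {"www.microsoft.com"}

# The two files from the post, modelled as lists of (api_name, *args).
FILE_ONE = [
    ("URLDownloadToFile", "hxxps://www.microsoft.com/xx.exe", "good.exe"),
    ("ShellExecute", "good.exe"),
]
FILE_TWO = [
    ("URLDownloadToFile", "hxxps://www.xxx.com/xx.exe", "good.exe"),
    ("ShellExecute", "good.exe"),
]


def refang(url):
    """Turn a defanged 'hxxp(s)://' URL back into a parseable one."""
    return url.replace("hxxp", "http", 1)


def shallow_features(calls):
    """Shallow parse: only the API names, in order."""
    return tuple(call[0] for call in calls)


def deep_features(calls):
    """Deeper parse: also look at arguments (download host, executed file)."""
    feats = {"downloads_then_executes": False, "untrusted_host": False}
    downloaded = set()
    for name, *args in calls:
        if name == "URLDownloadToFile":
            host = urlparse(refang(args[0])).hostname
            downloaded.add(args[1])
            if host not in TRUSTED_HOSTS:
                feats["untrusted_host"] = True
        elif name == "ShellExecute" and args[0] in downloaded:
            feats["downloads_then_executes"] = True
    return feats


if __name__ == "__main__":
    print(shallow_features(FILE_ONE) == shallow_features(FILE_TWO))
    print(deep_features(FILE_ONE))
    print(deep_features(FILE_TWO))
```

With only the shallow features, a model must either flag both files (an FP on file one) or miss both. The deeper features separate them, at the cost of extra parsing work per file, which is the performance trade-off described in the post.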