Machine Learning for Cyber Security – Static Detection of Malicious PE Files

Andy Ful · Feb 13, 2021

This article is from 2019 but is still worth recalling. It covers some well-known factors that are important for static malware detection. Here are some interesting fragments:

"PE Imports
A PE can import code from other PEs. To do so, it specifies the PE file name and the functions to import. It is important to analyze the imports to get a coherent image of what the PE is doing. Some of the imported functions are indicative of potential malicious operations such as crypto APIs used for unpacking/encryption or APIs used for anti-debugging. Some examples of potentially malicious imports:

Import Names	Potential Malicious Usage
KERNEL32.DLL!MapViewOfFile	Code Injection
KERNEL32.DLL!IsDebuggerPresent	Anti-Debugging
KERNEL32.DLL!GetThreadContext	Code Injection
KERNEL32.DLL!ReadProcessMemory	Code Injection
KERNEL32.DLL!ResumeThread	Code Injection
KERNEL32.DLL!WriteProcessMemory	Code Injection
KERNEL32.DLL!SetFileTime	Stealth
USER32.DLL!SetWindowsHookExW	API Hooking
ADVAPI32.DLL!CryptGenRandom	Encryption
ADVAPI32.DLL!CryptAcquireContextW	Encryption
KERNEL32.DLL!CreateToolhelp32Snapshot	Process Enumeration
ADVAPI32.DLL!OpenThreadToken	Token Manipulation
ADVAPI32.DLL!DuplicateTokenEx	Token Manipulation
CRYPT32.DLL!CertDuplicateCertificateContext	Encryption

All these features enable us to learn about the new PE before it is executed or loaded, and therefore before it affects the system.

...

From these results, we can conclude that the most useful feature for distinguishing between benign PE files and malicious PE files is the maximum entropy of all the PE section entropies. This observation fits our assumption that high entropy is not common in benign PE files. In addition, the signature status of the file seems to be of great importance. Namely, if the PE file is not signed, or it is signed with an unverified signature, there is a very high probability that it is a malicious PE file.

The next most important features are related to section names and permissions. Malware often uses packing techniques to avoid being detected by antivirus signatures. This results in nonstandard section names and write permissions.

We also notice that the categories of the suspicious imports had an impact on the model accuracy. In these features, we grouped different suspicious API functions by categories such as evasion, encryption, remote allocation, etc. In each group, there can be several functions from different DLLs. This allowed us to learn the malicious activity without overfitting to specific functions."
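
To illustrate the grouping idea from the last quoted paragraph, here is a minimal Python sketch. It is not the article's code: the category names and the mapping are assumptions taken from the import table quoted above, and the function name category_counts is hypothetical. The input is a plain list of (DLL, function) pairs, which in practice could come from any PE parser.

```python
from collections import Counter

# Assumed mapping, copied from the table quoted in this post.
# The article's real category list is not given in the quoted fragments.
SUSPICIOUS_IMPORTS = {
    ("kernel32.dll", "MapViewOfFile"): "code_injection",
    ("kernel32.dll", "GetThreadContext"): "code_injection",
    ("kernel32.dll", "ReadProcessMemory"): "code_injection",
    ("kernel32.dll", "ResumeThread"): "code_injection",
    ("kernel32.dll", "WriteProcessMemory"): "code_injection",
    ("kernel32.dll", "IsDebuggerPresent"): "anti_debugging",
    ("kernel32.dll", "SetFileTime"): "stealth",
    ("user32.dll", "SetWindowsHookExW"): "api_hooking",
    ("advapi32.dll", "CryptGenRandom"): "encryption",
    ("advapi32.dll", "CryptAcquireContextW"): "encryption",
    ("crypt32.dll", "CertDuplicateCertificateContext"): "encryption",
    ("kernel32.dll", "CreateToolhelp32Snapshot"): "process_enumeration",
    ("advapi32.dll", "OpenThreadToken"): "token_manipulation",
    ("advapi32.dll", "DuplicateTokenEx"): "token_manipulation",
}


def category_counts(imports):
    """Count suspicious imports per category.

    imports: iterable of (dll_name, function_name) pairs.
    DLL names are compared case-insensitively; function names exactly.
    Imports that are not in the mapping are ignored.
    """
    counts = Counter()
    for dll, func in imports:
        category = SUSPICIOUS_IMPORTS.get((dll.lower(), func))
        if category is not None:
            counts[category] += 1
    return counts
```

The per-category counts, rather than the individual function names, would then be used as model features. That way two samples that inject code through different API calls still look similar to the model.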

Full article:
https://www.cyberbit.com/blog/endpo...learning-for-cyber-security-static-detection/
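
The "maximum entropy of all the PE section entropies" feature mentioned in the quoted results can be sketched as follows. This is only an illustration and assumes byte-level Shannon entropy (0 to 8 bits per byte), which the quoted fragments do not spell out. The function names and the dict of section bytes are hypothetical; extracting the raw section data from a real PE file is left to a PE parser.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Byte-level Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(data).values()
    )


def max_section_entropy(sections) -> float:
    """Highest entropy among sections.

    sections: mapping of section name -> raw section bytes.
    Returns 0.0 when there are no sections.
    """
    return max((shannon_entropy(raw) for raw in sections.values()), default=0.0)
```

Packed or encrypted sections tend to have entropy close to 8, while ordinary code and data sections are usually noticeably lower, which is why a single high-entropy section can be a strong signal.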
