Serious Discussion Pre-execution vs post-execution protections explained

Trident · Aug 18, 2024

This post will be long and highly technical.

We'll review the differences between pre-execution and post-execution protection, and the pros and cons of each.
Additional links may be provided; reading all linked resources is recommended for a better understanding.

Commenting on anything other than protection methods (for example, on specific security solutions) is not allowed, and moderators will be urged to remove such content as off-topic.

Chapter I: Pre-execution protection
Pre-execution protection is a group of technologies that attempt to block a threat very early in its lifecycle.
This includes:

Web filter: blocks threats before they are even downloaded onto the user's machine.
Definitions: these can be human-generated and/or machine-generated, or at the very least machine-validated. Definitions include malware fragments. For a definition to be pushed, the vendor must first obtain a copy of the malware. Automated systems can then process the file, usually by running it in a sandbox, and create a definition. This is usually quality-tested automatically, and if it generates no false positives on a large list of approved files, the definition is pushed to users. Recommended resources: Avast Evo-Gen; Eset ThreatSense. Note: other vendors use automated definition creation too, but this may not be documented. It is impossible to process all threats manually.
Generic detections: after many variants of a specific threat have been observed, vendors create and test a generic detection. One such detection blocks many variants of a malware family.
Reputation: reputation may be based on age and prevalence (how many users have encountered this file and when it was first seen), or simply on deny/allow-lists.
Static analysis, or the so-called NGAV: an operating system provides users with the tools and resources they need to use their devices, from boot to shutdown. It also provides a large number of built-in APIs and functions that developers can use. Additional frameworks are also available (examples: .NET Framework, .NET Core, Boost, etc.). Static analysis looks at the imported APIs and functions in an attempt to build a clear picture of how the file might behave. Example: a program imports functions declared in <wincrypt.h>, meaning it is capable of performing cryptographic operations. It references the CryptGenKey() function, meaning that it can generate encryption keys. This program could be ransomware or a useful application. However, it does not reference the CreateWindowEx() function, meaning it has no visible window. Static analysis can then examine further parameters such as the digital signature. If the file is not properly signed, the suspicion that this is ransomware grows. Of course, this is heavily over-simplified; for more information, I like this blog post.
Dynamic analysis: this entails running portions of the file in question, or the whole file, in a secluded virtual space, looking for specific execution patterns. Recommended read: Bitdefender. It is important not to confuse dynamic analysis as part of the scan engine with dynamic analysis as part of cloud emulation/detonation, as these are not the same technologies.
Emulation in sandbox: this rather unusual method, included in solutions such as Palo Alto WildFire, Bitdefender GravityZone Elite, CrowdStrike Falcon Sensor and Check Point Threat Emulation, delivers thoroughly inspected files to users. The inspection usually takes a few minutes but results in the majority of malware being blocked without even reaching machines. Additional care is usually taken to counter evasions, but some still remain. My favourite reads: Palo Alto 1; Palo Alto 2; Check Point Evasions & Anti-Debug Tricks
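
To make the static analysis item above a bit more concrete, here is a minimal Python sketch of import-based scoring. The feature names, weights and threshold are hypothetical and only mirror the CryptGenKey / CreateWindowEx / signature example; real engines use far larger feature sets and trained models rather than hand-written rules.

```python
from dataclasses import dataclass

# Hypothetical weights, for illustration only; not taken from any real product.
SUSPICIOUS_IMPORTS = {"CryptGenKey": 2, "CryptEncrypt": 2}
# CreateWindowEx is a macro; the import table holds the A/W variants.
WINDOW_IMPORTS = {"CreateWindowExW", "CreateWindowExA"}


@dataclass
class FileFacts:
    imports: set          # API names found in the import table
    validly_signed: bool  # result of a signature check done elsewhere


def static_score(facts: FileFacts) -> int:
    """Return a naive suspicion score built from import and signature facts."""
    score = sum(w for name, w in SUSPICIOUS_IMPORTS.items() if name in facts.imports)
    if not (facts.imports & WINDOW_IMPORTS):
        score += 1  # the file never creates a visible window
    if not facts.validly_signed:
        score += 2  # unsigned or improperly signed file
    return score


def verdict(facts: FileFacts, threshold: int = 5) -> str:
    """Turn the score into a label; the threshold is an arbitrary assumption."""
    return "suspicious" if static_score(facts) >= threshold else "unknown"
```

A signed backup tool that imports CryptGenKey and creates a window stays under the threshold, while an unsigned, windowless file importing the same crypto APIs does not. That is the "ransomware or useful application" ambiguity from the example.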

Chapter II: Post-execution
Post-execution prevention includes a group of technologies that serve as a last line of defence when other modules have failed. These technologies aid remediation and reduce the potential damage, and include:

Behavioural analysis, behavioural blocking and policy enforcement: as the system operates, executing various frameworks, APIs and functions and passing parameters between them, these engines use user-mode hooks and a kernel driver to obtain information about:
- New objects, which includes:
  - files
  - registry entries
  - services
  - drivers
  - mutexes
- Low-level memory operations such as:
  - named pipes
  - random pipes
  - relationships with loaded modules
- Network-related characteristics such as accessed URLs and ports, protocols used.
- Various low-level events

Behavioural analysis, blocking and policy enforcement perform the following operations after sensors have captured information:

For performance and false-positive reduction reasons, object reputation is looked up.
For the same reasons, certain events, APIs, functions and behaviours might automatically be dropped, under the belief that they are not related to threats, or that monitoring them compromises performance and/or generates unneeded noise.
Execution sequences are built and updated, and are iteratively fed to a classification system. The classification system, be it local or online, returns a safe/malicious verdict. This system is usually based on ML, more specifically on some sort of decision trees.
Once a malicious verdict has been returned, behavioural blocking uses the previously built execution sequences to remediate the attack or, more simply said, to reverse its consequences.
Policy enforcement might not wait for sequences, but may outright block certain actions that look like they deviate from safe execution patterns. Example: VLC player has no logical reason to suddenly drop files in C:\Windows\SysWOW64.
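
The policy enforcement step can be sketched as a simple deny rule. This is only an illustration of the VLC example: the rule table and function name are hypothetical, and a real product would enforce this in a kernel driver, not in Python.

```python
from pathlib import PureWindowsPath

# Hypothetical policy: process name -> directories it must never write to.
DENY_WRITE = {
    "vlc.exe": [PureWindowsPath(r"C:\Windows")],
}


def is_write_allowed(process_name: str, target: str) -> bool:
    """Return False when the write target falls under a denied directory."""
    target_path = PureWindowsPath(target)
    for denied in DENY_WRITE.get(process_name.lower(), []):
        # PureWindowsPath compares case-insensitively, like Windows paths do.
        if target_path == denied or denied in target_path.parents:
            return False
    return True
```

Note that the decision is made on a single action, with no execution sequence and no classifier involved, which is what separates policy enforcement from behavioural blocking.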

My favourite reads on behavioural blocking:
Kaspersky
Symantec STAR
Bitdefender ATD

Containment: this attempts to reduce the potential damage whilst malware is running and analysis systems such as behavioural blocking are "dwelling" on the sample. Examples include preventing access to certain resources (passwords, cookies, files of choice), blocking access to the internet and other methods. The analysed file might be partially sandboxed (i.e. just some light restrictions applied) or fully sandboxed (the sample operates in an isolated memory space and access to resources is highly restricted). Optionally, it is possible to create virtual machines and run malware there manually. Malware does check a lot of artefacts for the presence of a VM (which should have been read about on the Palo Alto and Check Point blogs by now). Any VM can be tested for resistance with the Pafish project or Check Point InviZzzible. Examples of containment include: Symantec SONAR automatically sandboxes monitored applications. McAfee/Trellix Dynamic Application Containment defines behavioural rules. Comodo, based on automated factors and/or user input, uses API virtualisation and restriction. Kaspersky IDS uses reputation and pre-defined rules to automatically restrict objects, based on age, prevalence and developer assessment.
Prompts and alerts: these may include anything from a prompt asking whether an application is allowed to create a startup item, to silent detections reflected on a management portal. Prompts and alerts, although frequently aiming to provide additional intelligence and context, are designed for knowledgeable users who understand operating systems, malware and anti-malware principles very well. Hence, they are more typically deployed in enterprise environments.
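
As a rough illustration of reputation-driven containment (restricting objects based on age, prevalence and developer assessment), here is a small Python sketch. The trust levels, thresholds and function name are all hypothetical and do not describe any vendor's actual rules.

```python
from enum import Enum


class Trust(Enum):
    TRUSTED = "trusted"
    LOW_RESTRICTED = "low restricted"
    HIGH_RESTRICTED = "high restricted"
    UNTRUSTED = "untrusted"


def assign_trust(age_days: int, users_seen: int, signed_by_known_vendor: bool,
                 on_deny_list: bool = False) -> Trust:
    """Map reputation facts to a restriction level (hypothetical thresholds)."""
    if on_deny_list:
        return Trust.UNTRUSTED
    if signed_by_known_vendor and users_seen >= 1000:
        return Trust.TRUSTED
    if age_days >= 30 and users_seen >= 100:
        return Trust.LOW_RESTRICTED
    # New and rare files get the tightest restrictions short of a block.
    return Trust.HIGH_RESTRICTED
```

Each level would then map to a set of restrictions (no access to cookies, no network and so on), so a brand-new, rarely seen file runs heavily restricted while analysis continues.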

Chapter III: How machine learning works
This is a bonus chapter. As we've mentioned AI/ML in previous chapters, here we'll get a brief overview without going too deep into it.
Whilst there are many machine learning types and every task would probably require its own, there are a few methods that are mentioned again and again:

Decision trees, which could include random forests. Optionally, they can be gradient-boosted (for example with XGBoost), which essentially means replacing one powerful model (a single large decision tree) with a multitude of micro-models (weak learners), each of which is, by itself, only slightly better than a random guess. Based on how many times a specific feature is observed in malicious and clean files, the feature is given a "weight". All features and weights are processed, often after multiple models have been run, and the combined score, more often than not expressed as a percentage, represents the probability that the file is malicious.
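
A very simplified sketch of how weak learners can combine weighted votes into a probability. Each "stump" here looks at a single feature; the feature names, weights and bias are invented for illustration, and no training is shown (in a real boosted model the weights are learned from malicious and clean samples).

```python
import math

# Hypothetical stumps: (feature name, weight added when the feature is present).
# Positive weights lean malicious, negative weights lean clean.
STUMPS = [
    ("writes_to_startup", 1.2),
    ("imports_crypto_api", 0.8),
    ("has_valid_signature", -1.5),
    ("high_prevalence", -2.0),
]
BIAS = -0.5  # hypothetical prior leaning towards "clean"


def malicious_probability(features: set) -> float:
    """Sum the votes of all stumps and squash the total into the 0..1 range."""
    total = BIAS + sum(weight for name, weight in STUMPS if name in features)
    return 1.0 / (1.0 + math.exp(-total))
```

No single stump decides anything on its own; the verdict comes from the combined score, which is then compared against a vendor-chosen threshold.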

Below is a small part of a decision tree for the Emotet malware, as displayed by Microsoft. The coloured squares are nodes (certain features or behavioural characteristics), and next to the link connecting them (the specific relationship observed) is the weight of that relationship.

How artificial intelligence stopped an Emotet outbreak | Microsoft Security Blog

At 12:46 a.m. local time on February 3, a Windows 7 Pro customer in North Carolina became the first would-be victim of a new malware attack campaign for Trojan:Win32/Emotet. In the next 30 minutes, the campaign tried to attack over a thousand potential victims, all of whom were instantly and...

www.microsoft.com

Bayesian engines, Bayes' theorem: this represents a simpler approach, based on probability theory. It is a bit more linear than decision trees, where complex relationships are analysed. The best way to explain it would be to look at one specific part of Sophos Intelix.
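
For readers who prefer code, here is a minimal naive Bayes sketch. The feature names and counts are made up, and this is a generic textbook formulation, not a description of how any particular product implements it.

```python
import math

# Hypothetical training counts: how many malicious / clean files showed each feature.
COUNTS = {
    "packed":       {"malicious": 80, "clean": 10},
    "no_signature": {"malicious": 90, "clean": 30},
}
TOTAL = {"malicious": 100, "clean": 100}


def posterior_malicious(observed: set) -> float:
    """Naive Bayes posterior P(malicious | features) with Laplace smoothing."""
    log_score = {}
    grand_total = sum(TOTAL.values())
    for label, n in TOTAL.items():
        logp = math.log(n / grand_total)  # prior for this label
        for feature, by_label in COUNTS.items():
            p = (by_label[label] + 1) / (n + 2)  # smoothed P(feature | label)
            logp += math.log(p if feature in observed else 1 - p)
        log_score[label] = logp
    diff = log_score["clean"] - log_score["malicious"]
    return 1.0 / (1.0 + math.exp(diff))
```

Every feature contributes independently of the others, which is why this approach is more linear than a decision tree that can model relationships between features.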

Neural networks, deep learning: here we leave the premises of simple machine learning and enter what is usually marketed as the AI field (deep learning is still a subset of machine learning). These models allow human-like decisions to be taken by computer systems. Deep learning entails the use of multi-layered neural networks (more than three layers, and possibly hundreds).

What is a Neural Network? | IBM

Neural networks allow programs to recognize patterns and solve common problems in artificial intelligence, machine learning and deep learning.

www.ibm.com

What Is Deep Learning? | IBM

Deep learning is a subset of machine learning that uses multilayered neural networks, to simulate the complex decision-making power of the human brain.

www.ibm.com

Machine learning is applied in static analysis, dynamic analysis and behavioural blocking. The only difference is that static analysis extracts features and imports, while dynamic analysis and behavioural blocking extract behavioural (execution) sequences.

Final: Questions answered.

What's more important, pre-execution or post-execution analysis?
When dealing with malware, the earlier a threat is blocked, the better!
That being said, attackers have found ways to evade both, so it is important to use a layered solution that combines both, as many products do.

How do attackers evade pre-execution analysis?
All of the pre-execution techniques are generally limited by the fact that they run locally, on-device. The anti-malware engine doesn't have all day to process and analyse; it has milliseconds to complete all analysis, produce a verdict and either release the file for use or delete it. This requires heavy optimisations in scanning engines and allows attackers to wiggle their way out of analysis.
For example, dynamic analysis, where portions of the file are executed, might stumble across a repetitive loop that attackers implemented on purpose. Because dynamic analysis has a limit on the total number of instructions, this pointless loop helps the limit to be reached sooner. In turn, less of the real behaviour is observed.
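
The instruction-limit problem can be shown with a toy emulator. The instruction format (`loop` and `act` tuples) and the budget value are invented for this sketch; real emulators work on actual CPU instructions.

```python
def emulate(program, budget):
    """Run a toy instruction list until it ends or the budget is exhausted.

    Toy instructions: ("loop", n) burns n steps without visible behaviour,
    ("act", name) costs one step and records one observable behaviour.
    Returns the list of behaviours the emulator managed to observe.
    """
    seen, steps = [], 0
    for op, arg in program:
        cost = arg if op == "loop" else 1
        if steps + cost > budget:
            break  # instruction limit reached; the rest is never observed
        steps += cost
        if op == "act":
            seen.append(arg)
    return seen
```

With a budget of 1000 steps, a sample that starts with `("loop", 5000)` exhausts the limit before any of its real actions run, so the emulator sees nothing suspicious.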

Are there any techniques that are hybrid, i.e. combination of local and cloud?
Yes. For example, static analysis may frequently perform the so-called feature extraction locally, while the decision trees may later be evaluated on a server.
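
A rough sketch of that split, assuming a made-up feature record and a stand-in server function (`cloud_classify` is a placeholder, not a real API, and its rule is arbitrary):

```python
import hashlib
import json


def extract_features(data: bytes) -> dict:
    """Local step: reduce a file to a small, cheap-to-compute feature record."""
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        "size": len(data),
        "has_mz_header": data[:2] == b"MZ",  # Windows executables start with MZ
    }


def build_cloud_request(data: bytes) -> str:
    """Serialise the features; only this JSON, not the file, would be uploaded."""
    return json.dumps(extract_features(data), sort_keys=True)


def cloud_classify(request_json: str) -> str:
    """Stand-in for the server-side model (hypothetical rule, not a real API)."""
    features = json.loads(request_json)
    return "needs_analysis" if features["has_mz_header"] else "ignored"
```

The point of the design is that the device does only the cheap work, while the heavy model lives on the server and can be updated without touching the endpoint.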

What additional challenges does pre-execution analysis face?
Obfuscation, packing and the need for false-positive control are the major ones.

How is post-execution analysis evaded?
Although it is generally much more difficult to evade due to its generic nature, performance optimisations such as whitelisting specific processes and APIs may create a blind spot. More sophisticated attackers will discover such blind spots, usually through trial and error.

I opened a file and my antivirus behavioural blocker kicked in immediately. Am I safe?
It cannot be guaranteed. Different vendors, and even different detections from the same vendor, can rely either on very short execution sequences or on very long ones (which reduces FPs). By the time the malicious label is attached to the process, it has already performed some operations. Without proper forensic analysis, it cannot be concluded that the system is not compromised in any way.
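
To illustrate why "kicked in immediately" is not the same as "nothing happened", here is a toy replay of events against a blocker that needs a certain number of malicious events before it acts. The event names and the `needed_hits` parameter are hypothetical.

```python
def run_with_blocker(events, malicious_events, needed_hits):
    """Replay events; block once `needed_hits` malicious events were seen.

    Returns (completed, blocked): the events that already took place before
    the verdict, and whether the process was eventually blocked.
    """
    completed, hits = [], 0
    for event in events:
        completed.append(event)  # the action has already happened
        if event in malicious_events:
            hits += 1
            if hits >= needed_hits:
                return completed, True
    return completed, False
```

Even with `needed_hits=1` the first action has already completed when the block arrives, and with a longer required sequence more of them have. Remediation then has to undo those actions, which is not always possible (cookies that were read cannot be "unread").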

What does a compromise include?
Whilst there is a general belief that only data loss (also known as exfiltration), data destruction and/or data encryption for impact represent a compromise, this belief is false.
The MITRE ATT&CK framework lists various tactics, techniques and procedures, and each one of them represents a potential compromise.
For example, although data may not have been encrypted or exfiltrated, attackers may have received a list of installed software, which would allow them to start looking for vulnerabilities, or they may have received the host's IP address, which would allow them to employ an arsenal of network attack tools. Even without network access (for example, when a firewall blocked communication), it is possible to compromise the system by changing settings that the user may not notice immediately, such as disabling Windows Update, lowering UAC levels, impairing browser update mechanisms and more.

The threat was contained/sandboxed. Does that make things better?
Depends on the settings, but in general yes. In some cases, malware may invoke other programs; their data then enters the container and is ready to be exfiltrated. Containment works best when coupled with internet disconnection through a firewall. It's important to note that running malware, even contained, is not recommended unless certain hardware safety measures have been put in place.

Why is cloud emulation necessary?
Cloud emulation, such as that offered by the products mentioned at the beginning of the post, allows the file to be taken from the user and examined much more heavily than would be possible locally on the user's device. The benefits include high detection rates and blocking malware very early, before execution or even before download. The downside is that users have to wait for emulation to complete the analysis, which usually takes a few minutes.

Any more questions? Feel free to ask.
This concludes the topic for now.

Bot · Aug 18, 2024

Thank you for this comprehensive and detailed explanation of pre-execution and post-execution protections. Your post provides a clear understanding of the differences, benefits, and challenges of both methods. It's important to note that a layered security approach, combining both pre and post-execution analysis, is often the most effective.

eonline · Aug 19, 2024

I think it is rather outdated to talk about or explain malware in this era. People with the knowledge and tools can gain control of any machine without infecting it first. Prevention and detection of malware are important, yes. But right now they are not the central point of a serious discussion about cybersecurity.

=)

Jonny Quest · Aug 19, 2024

eonline said:
I think it is rather outdated to talk about or explain malware in this era. People with the knowledge and tools can gain control of any machine without infecting it first. Prevention and detection of malware are important, yes. But right now they are not the central point of a serious discussion about malware or infected machines or similar.

=)

It was a knowledge based, informational article ("explained"), not a troubleshooting post (malware infected device)? I found it very informative

Trident · Aug 19, 2024

eonline said:
I think it is rather outdated to talk about or explain malware in this era. People with the knowledge and tools can gain control of any machine without infecting it first. Prevention and detection of malware are important, yes. But right now they are not the central point of a serious discussion about cybersecurity.

=)

Contrary to your belief, the majority of attacks are performed through malware and phishing. Almost all.
However, this post is just informational, for people to peek at what's in their AV.

eonline · Aug 19, 2024

Trident said:
Contrary to your belief, the majority of attacks are performed through malware and phishing. Almost all.

Zero-click attacks are the main concern in cybersecurity for financial institutions, governments, and everyone else. And zero-click attacks are what is happening now in the cybersecurity landscape.

Trident · Aug 19, 2024

eonline said:
Zero-click attacks are the main concern in cybersecurity for financial institutions, governments, and everyone else. And zero-click attacks are what is happening now in the cybersecurity landscape.

Zero-click attacks can still be counted on the fingers of your two hands, though, out of tens of attacks reported every month and many more not reported. They are used against high-profile targets.
This has been reported by Kaspersky as well.

According to the Deloitte analysis, 91% of attacks start with phishing.

Beware of phishing emails
According to reports, 91% of all attacks begin with a phishing email to an unsuspecting victim. On top of that, 32% of all successful breaches involve the use of phishing techniques. Despite extensive attempts in the media and corporate security programmes over many years to educate users on the dangers of, and methods to spot phishing emails, these attacks remain highly successful. It is advisable to only open attachments when you are expecting them and know what they contain, even if you know the sender.
