Malware Analysis #2 - What is Entropy and how do I find the Entropy of a file?

  • Thread starter Deleted member 21043
Status
Not open for further replies.
Hi everyone,

For those of you who read my previous post, How to start analyzing malware (Guide), this is a continuation of it. More will come in the future on other topics.

Before I get too deep into anything else (which hopefully I will within the coming days/weeks), I thought I would cover some basic things.

Entropy. So, what is Entropy?
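Let's take a quick look at Wikipedia.
In computing, entropy is the randomness collected by an operating system or application for use in cryptography or other uses that require random data. This randomness is often collected from hardware sources, either pre-existing ones such as mouse movements or specially provided randomness generators.

Note that this quote describes the randomness an operating system gathers. For file analysis, the relevant sense is Shannon entropy from information theory: a measure of how unpredictable the bytes in a file are.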

What is the formula for Entropy?
$$H(X) = -\sum_{i=1}^{n} P(x_i) \log_b P(x_i)$$

The Shannon entropy is restricted to random variables taking discrete values. The corresponding formula for a continuous random variable with probability density function f(x) with finite or infinite support X on the real line is defined by analogy, using the above form of the entropy as an expectation:
$$h[f] = \operatorname{E}[-\ln f(x)] = -\int_{X} f(x) \ln f(x) \, dx$$

Source: Entropy (information theory) - Wikipedia, the free encyclopedia

How to get the Entropy of a file:
If you have programming experience, you could write your own utility to calculate the entropy from the bytes in a file; if not, you could use an online service to accomplish the task.
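For anyone who wants to try the "own utility" route, here is a minimal Python sketch of a byte-level Shannon entropy calculator. The function names `shannon_entropy` and `file_entropy` are my own choices for illustration, and the logarithm base is assumed to be 2, so the result is in bits per byte.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # how often each byte value (0-255) occurs
    # Sum p * log2(1/p) over every byte value that actually appears.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def file_entropy(path: str) -> float:
    """Read a whole file into memory and return its byte entropy."""
    with open(path, "rb") as handle:
        return shannon_entropy(handle.read())
```

Calling `file_entropy("sample.exe")` (a placeholder path) gives one number for the whole file. Reading the full file at once keeps the sketch short; for very large files you would probably want to count bytes chunk by chunk instead.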

Yesterday, whilst I was writing up the thread on reverse engineering .NET malware samples (work in progress, coming soon), I had to make an example application to show. So, I made a quick sample in .NET which deleted the hosts file, closed all instances of Internet Explorer and added itself to start-up. The main reason I did this, instead of a program that just shows a message box, is that if you want to read the reverse engineering thread and don't have programming experience, it may still help you identify malicious code in the future. For example, if you use that guide to reverse engineer some .NET malware, you would be able to identify when a program is going to add itself to start-up, delete a file (like the hosts file) and so on.

So, I uploaded this to VirusTotal. I made it for educational purposes only, and it was not obfuscated, so the entropy of the file appears low.

However, I then obfuscated the application with the Confuser obfuscator, and its entropy is higher than that of the un-obfuscated version. See?

[Image wj7RS.jpg: VirusTotal entropy values for the obfuscated sample]


As we can see in the above image, the Entropy is closer to 8, whereas the normal application which has no encryption/obfuscation is shown below, further away from 8:

[Image iDwGW.jpg: VirusTotal entropy values for the un-obfuscated sample]


Unobfuscated/unencrypted version: https://www.virustotal.com/en/file/64ca59f3729f5d980daca6a4668c2beac93e5665cf7fd58011fc554ff3a22c46/analysis/1423913325/
Obfuscated/encrypted version: https://www.virustotal.com/en/file/...75f1af7d9bd396bd7a154aee/analysis/1423913201/

On VirusTotal, click "File detail" and scroll down to the PE sections area. There should be a column displaying Entropy. Check the .text section, for example.
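If you would rather compute per-section values locally, the sketch below walks the PE section table with only the standard library and reports the entropy of each section's raw data. It is a simplified illustration, not how VirusTotal does it: `pe_section_entropies` is a name I made up, it expects the whole file as bytes, and it does no validation beyond checking the MZ and PE signatures.

```python
import math
import struct
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    return sum((n / total) * math.log2(total / n) for n in Counter(data).values())


def pe_section_entropies(image: bytes) -> list:
    """Return (section name, entropy of its raw data) for each PE section."""
    if image[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C holds the file offset of the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The 20-byte COFF header follows the signature.
    coff = pe_offset + 4
    (num_sections,) = struct.unpack_from("<H", image, coff + 2)
    (optional_size,) = struct.unpack_from("<H", image, coff + 16)
    # The section table starts right after the optional header.
    table = coff + 20 + optional_size
    results = []
    for index in range(num_sections):
        # Each section header is 40 bytes; only the first 24 are needed here.
        name, _vsize, _vaddr, raw_size, raw_ptr = struct.unpack_from(
            "<8sIIII", image, table + 40 * index
        )
        label = name.rstrip(b"\0").decode("ascii", errors="replace")
        results.append((label, shannon_entropy(image[raw_ptr:raw_ptr + raw_size])))
    return results
```

A typical use would be `pe_section_entropies(open(path, "rb").read())`, then looking at the value reported for `.text`.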

You can also do your own experimenting. If you would like information on encrypting/obfuscating .NET assemblies (or in general), I can always make a thread. Just remember, all of this is for educational, legitimate use. You can also try comparing the entropy of a file against the entropy of a ZIP archive containing it.

Other information:
If the entropy is high, then the file is probably packed, compressed or encrypted. If you take a file and check its entropy, it may be low. However, if you then put that file in a ZIP archive and check the entropy of the ZIP file itself, the entropy should be high.
Encrypted = higher entropy.
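The ZIP experiment is easy to reproduce in memory. The sketch below is only an illustration under my own assumptions: the input is made-up pseudo-text built from a small word list (standing in for a low-entropy file), and `zip_bytes` is a hypothetical helper that wraps the standard `zipfile` module.

```python
import io
import math
import random
import zipfile
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    return sum((n / total) * math.log2(total / n) for n in Counter(data).values())


def zip_bytes(name: str, data: bytes) -> bytes:
    """Return the bytes of a ZIP archive holding one DEFLATE-compressed member."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(name, data)
    return buffer.getvalue()


# Made-up low-entropy input: random words from a tiny vocabulary.
rng = random.Random(0)
words = ["host", "file", "start", "delete", "entropy", "packed", "sample", "section"]
plain = " ".join(rng.choice(words) for _ in range(20000)).encode("ascii")
zipped = zip_bytes("sample.txt", plain)

print(f"plain:  {shannon_entropy(plain):.2f} bits/byte")
print(f"zipped: {shannon_entropy(zipped):.2f} bits/byte")
```

The archive should come out noticeably closer to 8 than the plain text, which is the same effect packers and encryption have on an executable.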

When measured per byte, Shannon entropy ranges from 0 to 8 bits, because a byte can take 256 values and log2(256) = 8. An entropy close to 8 can indicate an encrypted (or compressed) file, whereas a low entropy suggests an unencrypted file.

Entropy can come in useful for many things, especially when it comes to antivirus software. I assume that antivirus vendors have some sort of entropy-based detection, due to its uses. I think HitmanPro has this included in their engine, and you can also view the entropy of a file on VirusTotal.

I hope this thread taught you something. As always, if anyone spots anything which may be incorrect in this thread, please do let me know so I can change it; chances are there are a few mistakes here and there in one thread or another. We're all human. A majority of this thread revolved around information from the Wikipedia source link (the images and the quoted parts). Of course, the research I did on the files myself isn't from there.

If an admin or staff member sees this, can they change the thread title to: "Malware Analysis #2 - Learn about Entropy and how you can find the Entropy of a file", as I made the same mistake as in my last one, and it looks like a question. :D

Next guide: Malware Analysis #3 - Reverse engineering .NET malware
Cheers, I hope I helped. ;)
 

Cch123

Level 7
Verified
May 6, 2014
335
Just to add on:

Purpose of Entropy values:
As mentioned, entropy values can give a rough estimate of whether the file in question is encrypted or not. This is important as it provides a rough guide as to which analysis methods to use and what to expect. For example, a file with low entropy values probably means an unencrypted sample, on which you can use straightforward static analysis. A file with high entropy, on the other hand, may mean that the malware is packed and that dynamic analysis may be more useful if you do not wish to spend the time to unpack the file.

Furthermore, a file's entropy profile diagram can indicate "interesting" areas of the malware. For example, certain malware samples are not encrypted entirely. A real-life example is the Gauss APT, where only the payload was heavily encrypted. In certain scenarios, such as incident response in companies, investigators want to know exactly what the attackers were doing in order to perform damage control. Entropy allows investigators to quickly identify and prioritise these crucial areas for analysis.
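An entropy profile like the one described above can be approximated by measuring entropy over consecutive windows of the file. This is a rough sketch under my own assumptions: the 1024-byte window and the 7.0 threshold are arbitrary illustrative values, and `entropy_profile` and `high_entropy_offsets` are hypothetical names.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    return sum((n / total) * math.log2(total / n) for n in Counter(data).values())


def entropy_profile(data: bytes, window: int = 1024, step: int = 1024) -> list:
    """Return (offset, entropy) pairs, one per window of the input."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        (offset, shannon_entropy(data[offset:offset + window]))
        for offset in range(0, len(data), step)
    ]


def high_entropy_offsets(profile: list, threshold: float = 7.0) -> list:
    """Offsets of windows whose entropy reaches the (assumed) threshold."""
    return [offset for offset, value in profile if value >= threshold]
```

Plotting the profile, or simply listing the offsets returned by `high_entropy_offsets`, points to the regions worth looking at first, such as an encrypted payload sitting inside an otherwise plain file. Note that the final window may be shorter than the others, which caps how high its entropy can go.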

Lastly, entropy can be used as a form of malware classification. Together with other identifiers (such as its API imports), malware belonging to the same threat actors can often be grouped together with a high degree of accuracy. This is because the same hacker groups usually use similar encryption routines and the same malware platform to launch their attacks.

Further usage (beyond malware analysis):
Entropy analysis can also be used to determine if a network connection is encrypted or not. Why is this important? To start off, it can be used to determine if the network connection is being attacked. For example, when encrypted SSH tunnels are used, the entropy of the traffic is far higher than on unencrypted channels. However, if the connection is being attacked, the entropy values may fall. Some firewalls can use this property to detect an attack on the network they are protecting.

Also, entropy can be used to detect unwanted encrypted content. For example, the police can use entropy analysis to determine if a suspect is hiding something on their computer. Using a technique called steganography, suspects can hide information and even whole texts in pictures. To the human eye, these digital images look like any other image of no interest. However, once entropy analysis is conducted, the hidden information may be exposed: files carrying steganographic payloads often have a higher entropy than comparable normal files.
 