- Jun 21, 2014
Hello,
This post will cover the three main types of signature detection. The most common signatures are hashes, byte signatures and heuristics. This post focuses primarily on creating signatures for Microsoft Portable Executable (PE) files.
The reason for this post is that there is very little information on "creating anti-virus signatures", and I see people here who love computer security as much as I do. A few articles exist on how to create signatures with ClamAV. I will do my best, and I hope the information in this post will be helpful.
Packed code creates a dilemma for signature detection. If a file has been packed or compressed, it needs to be uncompressed or dumped before it is scanned. Anti-virus engines use emulators and unpackers to get files into an uncompressed or dumped state before scanning them. When you are creating signatures yourself, tools such as TitanEngine or Immunity Debugger can be used to create dumps or uncompressed files.
Data that has been obfuscated or compressed should never be used as a candidate for a signature. As with file hashes such as MD5, changing a single byte of the input data can change the obfuscated or compressed output. Since those bytes change easily with different data or a different key, there is a good chance that they will not be present in other variants.
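To make the point about keys concrete, here is a minimal sketch in Python. It assumes a simple repeating-key XOR as the "obfuscation" (real packers and crypters are far more involved), and the payload string and keys are made up for illustration.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the repeating key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

payload = b"This is a bad malware"  # hypothetical plaintext payload

# Two "variants" of the same payload, obfuscated with different keys.
variant_a = xor_bytes(payload, b"\x41")
variant_b = xor_bytes(payload, b"\x42")

# Count the positions where the obfuscated bytes are identical.
same_positions = sum(a == b for a, b in zip(variant_a, variant_b))
print(same_positions)  # prints 0: every byte differs between the variants

# After de-obfuscation both variants contain the same bytes again.
print(xor_bytes(variant_a, b"\x41") == xor_bytes(variant_b, b"\x42"))
```

A byte signature taken from variant_a would never match variant_b, while a signature taken from the decoded payload would match both.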
There is a lot to say about reverse engineering and code analysis. I will show some examples, and things should become clearer.
Tools
- ClamAV - Hex byte scanning, regex, MD5 file scanning, MD5 sectional scanning, sigtool (tool for creating signatures and hashes)
- Yara - A powerful rule-based scanner that supports many conditions and data types; at the time of writing it does not support hashing
- ssdeep - A tool for creating and comparing context triggered piecewise hashes.
- Titan Engine - I don't have an epic description
The most basic and easiest type of signature is a hash value. A hash value is created by a hash function, a procedure or mathematical function that converts a large amount of data into a single fixed-size value. The most commonly used hash functions are MD5 and SHA-1. Hash signatures are extremely accurate: they match only the exact file, so changing a single byte produces a different hash.
Code:
md5("This is a bad malware") = "e4fee76675e45750b9e144247f92fd38"
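If you want to try this yourself without any AV tools, a small Python sketch using the standard hashlib module looks like this. The second string is just an assumed one-byte variation, added to show how the hash reacts to a tiny change.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a 32-character hex string."""
    return hashlib.md5(data).hexdigest()

original = b"This is a bad malware"
modified = b"This is a bad malwarE"  # same text with one byte changed

print(len(original))       # prints 21, the size in bytes
print(md5_hex(original))   # hash of the original bytes
print(md5_hex(modified))   # a completely different hash
print(md5_hex(original) == md5_hex(modified))  # prints False
```

This is why a hash signature only ever catches the exact file it was made from.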
We save the string "This is a bad malware" in myMalware.txt.
MD5-based signatures can be created using ClamAV. Yara, at the time of writing, does not support file hashing. ClamAV requires two attributes to create an MD5 hash signature: the file size in bytes and the MD5 hash. ClamAV comes with a tool called sigtool that can be used to generate MD5 signatures. Sigtool can be found in the "bin" directory of the ClamAV installation folder.
If you are using Windows, you can Shift + Right Click inside that folder and choose "Open command window here".
Code:
\bin>sigtool.exe --md5 myMalware.txt
You will get output like this: e4fee76675e45750b9e144247f92fd38:21:myMalware.txt
e4fee76675e45750b9e144247f92fd38 = MD5 hash
21 = file size in bytes
myMalware.txt = malware name
Now we save e4fee76675e45750b9e144247f92fd38:21:Not-A-Virus@TestSignature in a file named myFirstSig.hdb (make sure both the signature file and the "malware" text file we created are in /bin). Scanning myMalware.txt against this database (for example with clamscan -d myFirstSig.hdb myMalware.txt) produces output like this:
Code:
Loading virus signature database, please wait... |
Loading virus signature database, please wait... done
D:\Malware Research\ClamWinPortable\App\clamwin\bin\myMalware.txt: Not-A-Virus@TestSignature.UNOFFICIAL FOUND
----------- SCAN SUMMARY -----------
Known viruses: 1
Engine version: 0.98.1
Scanned directories: 0
Scanned files: 1
Infected files: 1
Data scanned: 0.00 MB
Data read: 0.00 MB (ratio 0.00:1)
Time: 0.053 sec (0 m 0 s)
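The hash:size:name line used above is simple enough to build and check by hand. The following Python sketch is only an illustration of the idea, not how ClamAV is implemented; the function names hdb_line and matches_hdb and the temporary file are my own assumptions.

```python
import hashlib
import os
import tempfile
from pathlib import Path

def hdb_line(path: str, name: str) -> str:
    """Build an 'md5:size:name' line for the file at path."""
    data = Path(path).read_bytes()
    return f"{hashlib.md5(data).hexdigest()}:{len(data)}:{name}"

def matches_hdb(path: str, line: str) -> bool:
    """Check a file against one 'md5:size:name' line."""
    md5_hex, size, _name = line.strip().split(":", 2)
    data = Path(path).read_bytes()
    # Compare the cheap attribute (size) first, then the hash.
    return len(data) == int(size) and hashlib.md5(data).hexdigest() == md5_hex

# Demo on a throwaway copy of the test "malware" from this post.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "myMalware.txt")
    Path(sample).write_bytes(b"This is a bad malware")
    line = hdb_line(sample, "Not-A-Virus@TestSignature")
    print(line)
    print(matches_hdb(sample, line))  # prints True
```

Note that the file must contain exactly those 21 bytes. An editor that appends a newline will change both the size and the hash.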
Byte-Signatures
A byte signature (or byte detection) is a signature based on a sequence of bytes present in a file or data stream. Byte signatures are a very common form of detection and have been used since the first anti-virus scanners. Their usefulness comes from the accuracy they provide when detecting a specific sequence of bytes.
Code:
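Malware_Name:TargetType:Offset:HexSignature
TargetType = 1 for PE and 0 for all files
Offset = * to match at any position in the file
HexSignature = hexadecimal representation of the opcodes
(this is the layout of ClamAV's extended signature files, .ndb)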
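To show what matching a byte signature means in practice, here is a simplified Python sketch. It is not ClamAV's engine: it only does a plain substring search, without wildcards or offsets, and the ByteSignature class, the signature name and the DEADBEEF pattern are all made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class ByteSignature:
    name: str       # malware name reported on a match
    pe_only: bool   # True for target type 1 (PE), False for 0 (all files)
    pattern: bytes  # raw bytes decoded from the hex string

def looks_like_pe(data: bytes) -> bool:
    """Very rough PE check: the file starts with the DOS magic 'MZ'."""
    return data[:2] == b"MZ"

def scan(data: bytes, sig: ByteSignature) -> bool:
    """Return True if the signature's byte sequence occurs in data."""
    if sig.pe_only and not looks_like_pe(data):
        return False
    return sig.pattern in data

# Hypothetical signature: name, target type 1, hex bytes DEADBEEF.
sig = ByteSignature("Not-A-Virus@TestBytes", True, bytes.fromhex("deadbeef"))

fake_pe = b"MZ" + b"\x90" * 4 + bytes.fromhex("deadbeef") + b"\x00"
plain = b"hello " + bytes.fromhex("deadbeef")

print(scan(fake_pe, sig))  # prints True: PE-like data containing the bytes
print(scan(plain, sig))    # prints False: bytes present, but not a PE
```

A real scanner also validates the PE header properly and uses multi-pattern search algorithms so thousands of signatures can be checked in one pass.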
Heuristics
The last type of signature detection is heuristics. Heuristics are used when the malware is too complex for hash and byte signatures. "Heuristics" is a general term for the different techniques used to detect malware by its behavior.
Each anti-virus engine uses different algorithms and proprietary techniques. A simple example of a heuristic signature would combine an API logger with rules based on the logged API calls.
Rule A
An API call to RtlMoveMemory with a string of "SOFTWARE\Classes\http\shell\open\..."
Rule B
An API call to CreateMutexA with a string of "Mlwr"
Rule C
An API call to GetSystemDirectory
And now it will check:
if ( Rule A && Rule B && Rule C )
then Process = Malware
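The same check can be sketched in Python. This assumes the API logger gives us a list of (API name, string argument) pairs, which is my own simplification; real engines log far more detail. The registry string is matched only on the part shown above, and the sample log is invented.

```python
from typing import List, Tuple

ApiCall = Tuple[str, str]  # (API name, string argument seen by the logger)

# Only the visible part of the registry path from Rule A is used here.
REG_PREFIX = "SOFTWARE\\Classes\\http\\shell\\open\\"

def rule_a(log: List[ApiCall]) -> bool:
    """RtlMoveMemory called with the browser-handler registry path."""
    return any(api == "RtlMoveMemory" and REG_PREFIX in arg for api, arg in log)

def rule_b(log: List[ApiCall]) -> bool:
    """CreateMutexA called with the mutex name 'Mlwr'."""
    return any(api == "CreateMutexA" and "Mlwr" in arg for api, arg in log)

def rule_c(log: List[ApiCall]) -> bool:
    """Any call to GetSystemDirectory (A or W variant)."""
    return any(api.startswith("GetSystemDirectory") for api, _ in log)

def is_malware(log: List[ApiCall]) -> bool:
    """Flag the process only when all three rules fire."""
    return rule_a(log) and rule_b(log) and rule_c(log)

# Hypothetical API log of a monitored process.
sample_log = [
    ("GetSystemDirectoryA", ""),
    ("CreateMutexA", "Mlwr"),
    ("RtlMoveMemory", "SOFTWARE\\Classes\\http\\shell\\open\\..."),
]

print(is_malware(sample_log))      # prints True: all three rules match
print(is_malware(sample_log[:2]))  # prints False: Rule A is missing
```

Requiring all three rules together keeps false positives down, since each API call on its own is perfectly normal in clean software.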
This post is just an introduction to creating anti-virus signatures. There is a lot more to it, but I tried to write it so that everybody can understand. If someone wants to learn, this is a good start, and there are some good free tools. The funny thing is that you can make signatures even if you don't know RE or ASM.
A lot of cloud AVs use MD5 / SHA1 / SHA256 for quick detection. There is a website for this: VirSign.
BTW, the ClamAV team is now accepting "Community Signatures" into the official database. More details here, if someone is interested: http://www.clamav.net/lang/en/2014/02/18/introducing-clamav-community-signatures/