Status
Not open for further replies.
Deleted member 21043

Hi everyone,

Previous article: Malware Analysis #6 - Understanding packers and detecting a packed file

What you will learn in this tutorial:
- An understanding of what a byte is
- An understanding of binary/machine code and how compilers work
- An understanding of what HEX is
- An understanding of HEX editors and how to use them
- A brief overview of how an Antivirus company may use HEX for signature-based detection of malware

Part 1 - Understanding what a byte is
A byte is 8 binary digits (in other words, 8 "bits"). A bit is 1 binary digit, so 8 of them together make a byte. A byte can therefore hold 256 different values, from 0 to 255. (I may expand on this later.)

Part 2 - What is binary, machine code, and how do compilers work?
Binary is a base-2 number system. In binary, there are 2 digits: 0 and 1. 0 represents OFF, 1 represents ON. This makes it a good fit for things like electrical signals.

A quick grid of binary I have made is as follows; if you have trouble reading it, let me know and I shall do a picture:

128 | 64 | 32 | 16 | 8 | 4 | 2 | 1
----------------------------------

If we look above and read from right to left, we start at 1, and then the number doubles each time. E.g. 1 * 2 = 2, 2 * 2 = 4.
To say we want to use a value, we assign ON to it, so we must use 1. To represent OFF, we use 0. 3 would be:

00000011

Do you see what I mean? We put 1 for ON on 1 and 2, as 1 + 2 = 3.
We then put 0 for the rest of the numbers to say we do not want them included, OFF signal for them.
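
If you want to check a value yourself, here is a small Python sketch of the ON/OFF idea. It assumes the eight place values of a byte are 128 down to 1, and the names to_bits and from_bits are just ones I made up for this example:

```python
# Place values of the 8 bits in a byte, read from left to right.
PLACE_VALUES = [128, 64, 32, 16, 8, 4, 2, 1]


def to_bits(number):
    """Return the 8-digit binary string for a number from 0 to 255."""
    if not 0 <= number <= 255:
        raise ValueError("a single byte only holds 0 to 255")
    bits = ""
    for value in PLACE_VALUES:
        if number >= value:
            bits += "1"      # ON: this place value is part of the number
            number -= value
        else:
            bits += "0"      # OFF: this place value is not used
    return bits


def from_bits(bits):
    """Add up the place values that are switched ON."""
    return sum(value for value, bit in zip(PLACE_VALUES, bits) if bit == "1")


print(to_bits(3))             # 00000011
print(from_bits("00000110"))  # 6
```

Python's built-in format(3, "08b") gives the same result as to_bits(3); the loop is only there to show the steps.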

What is machine language:
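Machine language is the set of instructions the CPU runs directly. It is stored as binary; hexadecimal is just a shorter way for people to write those same bytes down.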

How compilers work:
The job of a compiler is to "compile" (translate) source code into another language, e.g. the machine code inside an executable.

Source code >> Compiler >> Machine code - (as an example).

That's enough on this for now.

Part 3 - Understanding what HEX is
HEX (hexadecimal) is a base-16 number system. When using HEX, the digits go from 0 to 9 first. After this, you cannot write 10, 11, 12 and so on as a single digit with those numbers, so the letters A to F are used for 10 to 15.

0 - 9
A - F

A = 10
B = 11
C = 12
D = 13
E = 14
F = 15

I hope that A - F value info above helped you understand it.

Every HEX digit represents 4 binary digits, so every 2 HEX digits represent 8 binary digits, which as we learnt earlier, is a byte.
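
As a quick way to see how one byte becomes two HEX digits, here is a Python sketch. The function name byte_to_hex is made up for this example; it splits the byte into a "sixteens" digit and a "ones" digit:

```python
HEX_DIGITS = "0123456789ABCDEF"


def byte_to_hex(number):
    """Return the two HEX digits for a byte value (0 to 255)."""
    if not 0 <= number <= 255:
        raise ValueError("a single byte only holds 0 to 255")
    # divmod gives how many 16s fit in the number, and what is left over.
    high, low = divmod(number, 16)
    return HEX_DIGITS[high] + HEX_DIGITS[low]


print(byte_to_hex(255))     # FF  (15 * 16 + 15)
print(byte_to_hex(67))      # 43  (4 * 16 + 3)
print(format(0x43, "08b"))  # 01000011, the same byte as 8 binary digits
print(chr(0x43))            # C, the character this byte stands for in ASCII
```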


Part 4 - Understanding how to use HEX editors

For the guide, I will be using the HxD hex editor, which can be downloaded from the following source: http://www.softpedia.com/get/Programming/File-Editors/HxD.shtml
Once it is downloaded and installed, open it up.

It should look like this:


This is the main GUI of HxD. To open a new window for HEX editing, go to File > New, or click the first document icon on the toolbar below the "File", "Edit", "Search", ... menu bar.

A document should look like this:


You can type text in the right hand side and HxD will display the matching HEX on the left.



On the right hand side you have the ASCII dump:


And then of course the HEX is on the left, as shown in the above examples. In the last part, which is next in this guide, I will explain how some Antivirus software might take advantage of HEX and how malware may try to avoid detection, and I will share a few signatures that any AV developers on here may be interested in taking a look at.
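
If you do not have HxD to hand, the two-column layout can be imitated with a few lines of Python. This is only a rough sketch of the idea, not how HxD itself works, and hex_dump is a name made up for this example. Bytes that are not printable ASCII are shown as a dot, which is also why you see dots in the text column of a real dump:

```python
def hex_dump(data, width=16):
    """Format bytes as offset, HEX column and text column, one row per line."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        # Left column: every byte as two HEX digits.
        hex_part = " ".join(f"{byte:02X}" for byte in chunk)
        # Right column: printable ASCII as-is, everything else as a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08X}  {hex_part:<{width * 3 - 1}}  {text_part}")
    return "\n".join(lines)


print(hex_dump(b"Hello, HxD!"))
```

To dump a real file instead, read it in binary mode, e.g. open(path, "rb").read(), and pass the result to hex_dump.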

Part 5 - More info on HEX and Antivirus software uses for it

Antivirus software may use HEX to detect samples with similarities. For example, they may have sample A and add its MD5/SHA-256 hash to the database. What if there is a change in the file, or another slightly different copy? The hash will no longer match. This is where HEX is useful. A HEX signature (a sequence of bytes taken from the file) can be used to detect the same file without having the hash for each different copy of it. It can also pick up other malware samples that contain the same HEX as the ones you added to the database.
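
To show the difference between a hash and a HEX signature, here is a minimal Python sketch. Everything in it is made up for illustration: the signature bytes, the sample data and the function names. A real AV engine is far more involved than a simple "is this pattern in the file" check:

```python
import hashlib


def sha256_of(data):
    """Hash of the whole file: changes if even one byte changes."""
    return hashlib.sha256(data).hexdigest()


def matches_signature(data, signature_hex):
    """True if the byte pattern written in HEX appears anywhere in the data."""
    pattern = bytes.fromhex(signature_hex)
    return pattern in data


# Made-up signature and made-up "files", purely for the example.
SIGNATURE = "DE AD BE EF 13 37"
sample_a = b"header-one" + bytes.fromhex(SIGNATURE) + b"tail"
sample_b = b"header-two!" + bytes.fromhex(SIGNATURE) + b"other tail"
clean_file = b"nothing to see here"

print(sha256_of(sample_a) == sha256_of(sample_b))  # False: the hashes differ
print(matches_signature(sample_a, SIGNATURE))      # True
print(matches_signature(sample_b, SIGNATURE))      # True: same pattern, different file
print(matches_signature(clean_file, SIGNATURE))    # False
```

The trade-off is that a short pattern may also appear in harmless files, so the bytes chosen for a signature need to be specific to the malware.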


To explain this, I will open up a malware sample I have and show you an example. It won't be packed, but I will cover more packers in the future, and the unpacking series is coming soon, probably within a few days of this thread being posted. With the unpacking series, you will learn to use tools like PE View and OllyDbg.



As we can see in the ASCII dump (right hand side), it says: C.o.p.y.r.i.g.h.t. .©. . .2.0.1.5...X.....

The equivalent to this in HEX is: 43 00 6F 00 70 00 79 00 72 00 69 00 67 00 68 00 74 00 20 00 A9 00 20 00 20 00 32 00 30 00 31 00 35 00 00 00 58 00 18 00 01 00
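
The 00 after every character suggests the text is stored as UTF-16 (little-endian), where each character takes 2 bytes; Windows commonly stores version information strings this way. As a sketch, assuming that encoding, Python can turn the HEX back into text. I have left off the last 6 bytes (58 00 18 00 01 00), since they come after the 00 00 that ends the string:

```python
# The HEX from the dump, up to and including the 00 00 that ends the string.
DUMP = (
    "43 00 6F 00 70 00 79 00 72 00 69 00 67 00 68 00 74 00 "
    "20 00 A9 00 20 00 20 00 32 00 30 00 31 00 35 00 00 00"
)

raw = bytes.fromhex(DUMP)

# Assumption: the bytes are UTF-16 little-endian text (2 bytes per character).
text = raw.decode("utf-16-le").rstrip("\x00")

print(len(raw))    # 36 bytes
print(repr(text))  # 'Copyright ©  2015'
```

Note the two spaces between the © and the year, where you would normally expect a name.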


Basically, the copyright hasn't been filled in: there is the symbol and the year, but no name. Suspicious? I am sure a legit application would have the copyright information filled in. Other suspicious signs could be a sample pretending to be from another company, like Piriform (who provide CCleaner), Microsoft, or Adobe. I see those 3 companies being used quite often, especially Microsoft and Adobe, though of course it could be any company.


Later tonight (or tomorrow), I will make a thread which will analyse files more in-depth with HEX. I hope you don't mind that it will be in a separate thread. It will go over signs of key loggers, worms (mainly VBScripts), and other things… I will leave any signatures I find out in the open, so anyone who wants to use them may feel free. (After this I will show you how to unpack files.)

(I do not think any AV companies detect files based on an unfilled copyright. This was a quick example of something suspicious. However, if there is no copyright, you could also look more in-depth at the file details for analysis, which I will also speak about in the future. I prefer dynamic analysis, and I am sure you will find it more interesting when I go more in-depth on it after the disassembly for the static analysis, which will hopefully be this week.)


As usual, if anything is incorrect, let me know. I will shamefully admit there may be something incorrect for someone to spot, especially in this thread. I got distracted a bit by someone. :( :D

Cheers. ;)
 
Last edited by a moderator:
I think I found an error.
A quick grid of binary I have made is as follows, if you have trouble reading it, let me know and I shall do a picture
Shouldn't you say an array? Because it looks like an array to me. Because the grid you posted is an Esri grid which can be more confusing to beginners.
 

h00lks

New Member
Thanks a lot for the good info :) I like it!!
 