Malware Analysis PyPI Malware Creators are starting to Employ Anti-Debug techniques

upnorth · Jan 3, 2023

Quote: " Most PyPI malware today tries to avoid static detection using various techniques, ranging from primitive variable mangling to sophisticated code flattening and steganography. Use of these techniques makes the package extremely suspicious, but it does prevent novice researchers from understanding the exact operation of the malware using static analysis tools. However, any dynamic analysis tool, such as a malware sandbox, quickly removes the malware’s static protection layers and reveals the underlying logic.

Recently, it seems that attackers have stepped up a notch – we’ve detected and disclosed the cookiezlog package, which seemed to employ Anti-debugging code (designed to thwart dynamic analysis tools) in addition to regular obfuscation tools and techniques. This is the first time our research team (or any publication) has spotted these kinds of defenses in PyPI malware. In this post, we will give an overview of the techniques used in this Python malware and how to unpack similar malware.

Similar to most malicious packages, the cookiezlog package runs immediately upon installation. This is achieved via “develop” and “install” triggers in setup.py –

Code:

class PostDevelopCommand(develop):
    def run(self):
        execute()
        install.run(self)
 
 
class PostInstallCommand(install):
    def run(self):
        execute()
        install.run(self)
 
...
 
setup(
    name='cookiezlog',
    version='0.0.1',
    description='Extra Package for Roblox grabbing',
    ...
    cmdclass={
        'develop': PostDevelopCommand,
        'install': PostInstallCommand,
    },
)
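The install-time trigger above can also be spotted statically, without running setup.py. The following is a minimal sketch, assuming the hooks are registered through a literal cmdclass dictionary as in the excerpt; the function name find_cmdclass_hooks and the sample source are hypothetical and are not part of the original analysis.

```python
import ast

def find_cmdclass_hooks(source):
    """Return the command names overridden via setup(cmdclass={...})."""
    hooks = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        # Accept both setup(...) and setuptools.setup(...)
        func_name = getattr(node.func, "id", None) or getattr(node.func, "attr", None)
        if func_name != "setup":
            continue
        for keyword in node.keywords:
            if keyword.arg == "cmdclass" and isinstance(keyword.value, ast.Dict):
                for key in keyword.value.keys:
                    if isinstance(key, ast.Constant) and isinstance(key.value, str):
                        hooks.append(key.value)
    return hooks

# Hypothetical, harmless setup.py text that is only parsed, never executed
SAMPLE_SETUP = """
from setuptools import setup
setup(
    name='example',
    cmdclass={'develop': PostDevelopCommand, 'install': PostInstallCommand},
)
"""

if __name__ == "__main__":
    print(find_cmdclass_hooks(SAMPLE_SETUP))
```

Overriding install or develop is not malicious in itself, so a hit here is only a reason to read the overridden run() methods more closely.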

The first and simplest layer of protection is zlib-encoded code, which is executed immediately after the package is installed. The decoded payload downloads a file from a hardcoded URL and executes it on the victim’s machine. The executable is a Windows PE file. Looking at the strings in the executable, we can see that it’s not actual native code but rather a Python script packed into the PE format. It can be quickly unpacked with the open-source tool PyInstaller Extractor. The extracted code contains a lot of files, primarily third-party libraries. The most interesting extracted file is main.pyc, which contains the malware code as Python bytecode.
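A zlib layer like the one described above can be peeled off without executing anything. This is a minimal sketch, assuming the embedded blob is plain zlib-compressed source text; the blob below is a harmless stand-in created for illustration, not the sample's real payload.

```python
import zlib

def decode_layer(blob):
    """Decompress a zlib-encoded payload and return it as text for reading, not for exec()."""
    return zlib.decompress(blob).decode("utf-8", errors="replace")

# Hypothetical stand-in for the compressed bytes found in a suspicious setup.py
SAMPLE_BLOB = zlib.compress(b"print('stage two would run here')")

if __name__ == "__main__":
    print(decode_layer(SAMPLE_BLOB))
```

Printing the decoded text instead of passing it to exec() keeps the analysis static at this stage.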

Normally, we would be able to decompile the bytecode in main.pyc to Python source code, using tools such as uncompyle6. However, in this case, another run of strings on main.pyc shows that the binary has been obfuscated with PyArmor. PyArmor is a commercial packer and obfuscator, which applies obfuscation techniques to the original code, encrypts it and protects it from analysis. Fortunately for the researchers, PyArmor keeps much of the information that’s necessary for introspection. Knowing this, we can try to restore the names of the functions and constants used in the original code. Although PyArmor does not have any publicly-available unpacker, it can be fully unpacked with some manual effort. In this case, we chose to perform a quick unpacking shortcut (by using library injection) since we were mostly interested in the original symbols and strings. "
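The "run strings on main.pyc" step can be approximated in a few lines of Python. This is a rough sketch, not the researchers' tooling: the marker names in PYARMOR_MARKERS are an assumption based on identifiers that older PyArmor releases are known to leave in protected files, and other versions may differ.

```python
import re

# Assumption: identifiers commonly left behind by older PyArmor releases
PYARMOR_MARKERS = (b"__pyarmor__", b"pytransform")

def extract_strings(data, min_len=4):
    """Return runs of printable ASCII at least min_len long, like the strings utility."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]

def looks_pyarmored(data):
    """Report whether any assumed PyArmor marker appears in the raw bytes."""
    return any(marker in data for marker in PYARMOR_MARKERS)

if __name__ == "__main__":
    # Synthetic bytes standing in for the contents of a protected .pyc file
    sample = b"\x00\x01__pyarmor__\x00\x02abc\x00pytransform\x00"
    print(extract_strings(sample), looks_pyarmored(sample))
```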

Quote: " We created our own file named psutil.py in the same directory as the protected file (main.pyc), so that the malware would import it instead of the real psutil library. The snippet in that file uses the inspect module, which provides runtime information about the code being executed: it iterates over the execution frames and extracts the names of the code blocks and the constants they reference. After running our snippet, it returned a list of strings that allowed us to discern the capabilities and origin of the malicious code. The most interesting strings were the URL of an injection module, pointing to the possible attacker’s repository, and references to anti-VM functionalities in the code.
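The frame-walking idea can be illustrated as follows. This is a minimal sketch of the technique and not the researchers' actual snippet, which is not reproduced in this post; the helper names string_constants and walk_stack are hypothetical. Such a file is picked up ahead of the real library because Python searches the script's own directory first when resolving imports.

```python
import inspect
import types

def string_constants(code):
    """Collect string constants from a code object, including nested code objects."""
    found = []
    for const in code.co_consts:
        if isinstance(const, str):
            found.append(const)
        elif isinstance(const, types.CodeType):
            found.extend(string_constants(const))
    return found

def walk_stack():
    """Walk from the current frame up to the outermost caller, recording names and constants."""
    rows = []
    frame = inspect.currentframe()
    while frame is not None:
        code = frame.f_code
        rows.append({
            "name": code.co_name,
            "names": list(code.co_names),
            "constants": string_constants(code),
        })
        frame = frame.f_back
    return rows

if __name__ == "__main__":
    for row in walk_stack():
        print(row["name"], row["constants"])
```

How much of this information survives in a protected file depends on the obfuscator and its version, so the output should be treated as a starting point for further analysis.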

The Syntheticc GitHub profile mentioned in the strings was still available at the time of writing. The profile’s repositories contain a number of open-source hacking tools. Among them was a repository called “Advanced Anti Debug”, containing methods that could be used to prevent analysis of the malware. We can split the dynamic methods the malware used into two categories: Anti-Debug and Anti-VM. The Anti-Debug checks look for suspicious system activity related to debuggers or disassemblers and include the following functions: check_processes checks whether a debugger process is running on the system by comparing the active process list against a list of over 50 known tools, including:

“idau64.exe” (IDA Pro Disassembler)
“x64dbg.exe” (x64dbg Debugger)
“Windbg.exe” (WinDbg Debugger)
“Devenv.exe” (Visual Studio IDE)
“Processhacker.exe” (Process Hacker)

Code:

import psutil

PROCNAMES = [
    "ProcessHacker.exe",
    "httpdebuggerui.exe",
    "wireshark.exe",
    "fiddler.exe",
    "regedit.exe",
...
]
 
for proc in psutil.process_iter():
    if proc.name() in PROCNAMES:
        proc.kill()

check_research_tools has almost the same functionality, comparing substrings of process names to a humble list of five traffic analysis tools:

“wireshark” (Wireshark network protocol analyzer)
“fiddler” (Fiddler proxy)
“http” (HTTP Debugger and possibly more tools)
“traffic” (generic term)
“packet” (generic term)
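To show how broad this substring comparison is, here is a small sketch of the matching logic alone, written as a pure function over a process name. It is an illustration and not the malware's code; the function name is hypothetical, and the lowercasing step is an assumption.

```python
# The five substrings listed above
RESEARCH_TOOL_SUBSTRINGS = ("wireshark", "fiddler", "http", "traffic", "packet")

def matches_research_tool(process_name):
    """Return True if the name contains any listed substring (case-insensitive by assumption)."""
    lowered = process_name.lower()
    return any(fragment in lowered for fragment in RESEARCH_TOOL_SUBSTRINGS)

if __name__ == "__main__":
    for name in ("Wireshark.exe", "httpd.exe", "notepad.exe"):
        print(name, matches_research_tool(name))
```

Generic fragments such as “http” also match unrelated software, for example a local web server named httpd.exe, so a check of this kind can hit processes that have nothing to do with analysis.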

If any of these processes are found to be running, the Anti-Debug code tries to kill the process via psutil.Process.kill – not a very subtle approach. Malware that is more stealth-conscious would just stop running without any indication, instead of interacting with external processes. The remaining checks fall into the Anti-VM category and try to make sure the malware is not running inside a virtual machine. "

Quote: " All of the checks mentioned above are relatively simple, but combined with the respectable protection against static analysis the malware already employed, they offer adequate protection against novice researchers – especially ones who only use automated analysis tools, which wouldn’t be able to breach the defenses of this specific malware. The payload is disappointingly simple compared to the number of defenses used by the malware, but it is still harmful. The payload is a password grabber, which gathers “autocomplete” passwords saved in the data caches of popular browsers and sends them to the C2 server (in this case, a Discord hook). From the strings extracted from the malware, we can deduce that in addition to the “industry standard” Discord token leaker functionality, the payload also hunts for passwords of several financial services, as can be seen from the strings used by the send_info function.

Code:

Name: send_info
Filename:
Argument count: 0
...
Constants:
0: None
1: 'USERPROFILE'
...
5: 'coinbase'
...
7: 'binance'
...
9: 'paypal'
...

Summary

Just a couple of years ago, the only tools that PyPI malware authors used were simple payload encoders. Today we see that malware uploaded to OSS repositories is becoming more complex, has several levels of static and dynamic protection, and utilizes combinations of commercial and homebrew tools. This mirrors the path taken by their “colleagues” in the world of native malware, and as such we expect OSS-repo malware to continue to evolve, perhaps with advanced techniques such as custom polymorphic encoding and deeper anti-debug methods. "

Full source:

Python Malware Starting to Employ Anti-Debug Techniques

First time anti-debug techniques are discovered in PyPI malware. Read how these techniques are implemented, including analysis and tips from JFrog Security Research.

jfrog.com
