Hacking — Best OF Reverse Engineering — Part 8

Malware Reverse Engineering

Information systems form the critical backbone of our everyday lives, so we need to protect them from all sorts of attack vectors.

Protecting them requires understanding how attacks work; without that understanding, our efforts would be futile. Understanding the modus operandi of sophisticated attacks such as malware means dissecting malware code piece by piece through processes such as reverse engineering. This article introduces reverse engineering, malware analysis, understanding attack vectors from reversed code, and the tools and utilities used for reverse engineering.


Reverse engineering is a vital skill for security professionals. Everything from reverse engineering malware to discovering vulnerabilities in binaries is required to properly secure information systems against today’s ever-evolving threats.

Wikipedia (http://en.wikipedia.org/wiki/Reverse_engineering) defines reverse engineering as follows: “Reverse engineering is the process of discovering the technological principles of a device, object or system through analysis of its structure, function and operation. It often involves taking something (e.g., a mechanical device, electronic component, biological, chemical or organic matter or software program) apart and analyzing its workings in detail to be used in maintenance, or to try to make a new device or program that does the same thing without using or simply duplicating (without understanding) the original. Reverse engineering has its origins in the analysis of hardware for commercial or military advantage. The purpose is to deduce design decisions from end products with little or no additional knowledge about the procedures involved in the original production. The same techniques are subsequently being researched for application to legacy software systems, not for industrial or defense ends, but rather to replace incorrect, incomplete, or otherwise unavailable documentation.”

Assembly language is a low-level programming language used to interface with computer hardware. It substitutes symbolic commands (mnemonics) for numeric machine codes, which makes code easier for humans to read than raw binary. Even so, assembly language is difficult, and knowing it is a valuable skill for effective reverse engineering. For this purpose, we will go over the basics of assembly language.

A register is a small amount of storage on a processor that provides the fastest way to access data. Registers can be categorized as follows:

* User-accessible registers — The most common division of user-accessible registers is into data registers and address registers.

* Data registers can hold numeric values such as integer and floating-point values, as well as characters, small bit arrays and other data. In some older and low end CPUs, a special data register, known as the accumulator, is used implicitly for many operations.

* Address registers hold addresses and are used by instructions that indirectly access primary memory. Some processors contain registers that may only be used to hold an address or only to hold numeric values (in some cases used as an index register whose value is added as an offset from some address); others allow registers to hold either kind of quantity. A wide variety of possible addressing modes, used to specify the effective address of an operand, exist. The stack pointer is used to manage the run-time stack. Rarely, other data stacks are addressed by dedicated address registers, see stack machine.

* Conditional registers hold truth values often used to determine whether some instruction should or should not be executed.

* General purpose registers (GPRs) can store both data and addresses, i.e., they are combined data/address registers. In rare cases the register file is unified to include floating point as well.

* Floating point registers (FPRs) store floating point numbers in many architectures.

* Constant registers hold read-only values such as zero, one, or pi.

* Vector registers hold data for vector processing done by SIMD instructions (Single Instruction, Multiple Data).

* Special purpose registers (SPRs) hold program state; they usually include the program counter (aka instruction pointer) and status register (aka processor status word). The aforementioned stack pointer is sometimes also included in this group. Embedded microprocessors can also have registers corresponding to specialized hardware elements.

* Instruction registers store the instruction currently being executed.

* Model-specific registers (also called machine-specific registers), found in some architectures, store data and settings related to the processor itself. Because their meanings are attached to the design of a specific processor, they cannot be expected to remain standard between processor generations.

* Control and status registers come in three types: the program counter, instruction registers, and the program status word (PSW).

Finally, some registers are related to fetching information from RAM, a collection of storage registers located on separate chips from the CPU; unlike most of the above, these are generally not architectural registers.

In assembly language, a function prologue is a few lines of code at the beginning of a function that prepare the stack and registers for use within the function. Similarly, the function epilogue appears at the end of the function and restores the stack and registers to the state they were in before the function was called.
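As a rough illustration of the setup and teardown code at the start and end of a function (commonly called the prologue and epilogue), the following Python sketch models an x86-style stack frame. The register names (esp, ebp), the instructions shown in the comments, the 4-byte word size, and the starting addresses are assumptions chosen for illustration; real compilers and calling conventions vary.

```python
class ToyCpu:
    """A toy model of a stack pointer, a base pointer, and stack memory."""

    WORD = 4  # assumed word size in bytes (32-bit)

    def __init__(self, initial_esp=0x1000):
        self.esp = initial_esp  # stack pointer
        self.ebp = 0            # base (frame) pointer
        self.memory = {}        # address -> stored value

    def push(self, value):
        self.esp -= self.WORD   # the stack grows toward lower addresses
        self.memory[self.esp] = value

    def pop(self):
        value = self.memory.pop(self.esp)
        self.esp += self.WORD
        return value

    def prologue(self, local_bytes):
        self.push(self.ebp)      # push ebp       ; save the caller's frame pointer
        self.ebp = self.esp      # mov  ebp, esp  ; start a new frame
        self.esp -= local_bytes  # sub  esp, N    ; reserve space for locals

    def epilogue(self):
        self.esp = self.ebp      # mov esp, ebp   ; discard the locals
        self.ebp = self.pop()    # pop ebp        ; restore the caller's frame pointer


cpu = ToyCpu()
cpu.ebp = 0x1040                 # pretend the caller already has a frame
before = (cpu.esp, cpu.ebp)
cpu.prologue(local_bytes=16)
inside = (cpu.esp, cpu.ebp)
cpu.epilogue()
after = (cpu.esp, cpu.ebp)
```

After the epilogue runs, both registers hold the values they had before the prologue, which is the property the paragraph above describes.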

Memory Stacks
There are 3 main sections of memory:

* Stack Section — Where the stack is located, stores local variables and function arguments.

* Data Section — Where the heap is located, stores static and dynamic variables.

* Code Section — Where the actual program instructions are located.

The stack section starts at the high memory addresses and grows downwards, towards the lower memory addresses; conversely, the data section (heap) starts at the lower memory addresses and grows upwards, towards the high memory addresses. Therefore, the stack and the heap grow towards each other as more variables are placed in each of those sections.
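The opposing growth directions can be modelled with two counters. This sketch assumes a flat, hypothetical address range and leaves out everything a real operating system does (page mapping, guard pages, multiple heaps); it only shows how the free gap between the heap and the stack shrinks as each one grows.

```python
class ToyAddressSpace:
    """Heap grows up from a low address; stack grows down from a high one."""

    def __init__(self, low=0x1000, high=0x2000):
        self.heap_end = low     # next free heap address
        self.stack_top = high   # lowest address currently used by the stack

    def free_bytes(self):
        return self.stack_top - self.heap_end

    def heap_alloc(self, size):
        if size > self.free_bytes():
            raise MemoryError("heap would collide with the stack")
        address = self.heap_end
        self.heap_end += size   # heap moves toward higher addresses
        return address

    def stack_push(self, size):
        if size > self.free_bytes():
            raise MemoryError("stack would collide with the heap")
        self.stack_top -= size  # stack moves toward lower addresses
        return self.stack_top


space = ToyAddressSpace()
first_heap = space.heap_alloc(0x100)
second_heap = space.heap_alloc(0x100)
first_stack = space.stack_push(0x40)
second_stack = space.stack_push(0x40)
remaining = space.free_bytes()
```

Each heap allocation returns a higher address than the last, each stack push returns a lower one, and the remaining gap gets smaller either way.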

Debuggers are computer programs used for locating and fixing or bypassing bugs (errors) in program code or in the engineering of a hardware device. They let you run a program step by step, stop at specified instructions, track the values of variables, and modify program state during execution. Some examples of debuggers are:

* GNU Debugger

* Intel Debugger


* Microsoft Visual Studio Debugger

* Valgrind

* WinDbg
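To make the idea of stepping through a program and tracking variables concrete, here is a minimal sketch built on Python's standard sys.settrace hook. It is not how the debuggers listed above are implemented (they work on native code); the function names run_with_trace and checksum are hypothetical and exist only for this example.

```python
import sys


def run_with_trace(func, *args):
    """Run func(*args) and snapshot its local variables before each line executes."""
    target = func.__code__
    snapshots = []

    def tracer(frame, event, arg):
        if frame.f_code is not target:
            return None                      # ignore every other function
        if event == "line":
            offset = frame.f_lineno - target.co_firstlineno
            snapshots.append((offset, dict(frame.f_locals)))
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)               # always restore the old tracer
    return result, snapshots


def checksum(data):
    total = 0
    for byte in data:
        total = (total + byte) % 256
    return total


result, snapshots = run_with_trace(checksum, b"\x01\x02\x03")
# A crude "breakpoint": keep only the snapshots taken at the return line
# (the fourth line after the def line of checksum).
at_return = [local_vars for offset, local_vars in snapshots if offset == 4]
```

Each snapshot pairs a line offset within the traced function with a copy of its local variables at that moment, which is roughly what a debugger's variable view shows while single-stepping.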

Hex Editors
Hex editors are tools used to view and edit binary files. A binary file contains data in machine-readable form, as opposed to a text file, which can be read by a human. Hex editors let you edit the raw contents of a file, unlike other programs, which attempt to interpret the data for you. Because hex editors are used to edit binary files, they are sometimes called binary editors or binary file editors.
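The view a hex editor presents can be approximated in a few lines. The sketch below is an illustration only, not the code of any particular hex editor: it formats a bytes object as offset, hexadecimal bytes, and printable ASCII. The function name hex_dump and the sample bytes are made up for the example.

```python
def hex_dump(data, width=16):
    """Return a classic offset / hex bytes / ASCII view of a bytes object."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Show printable ASCII as-is and everything else as a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  {text_part}")
    return "\n".join(lines)


sample = b"MZ\x90\x00Hello, world!\n\x00\xff"
dump = hex_dump(sample)
print(dump)
```

The right-hand column is why readable strings tend to stand out even in otherwise opaque binary data.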

Disassemblers are computer programs that translate machine language into assembly language; the opposite operation is performed by an assembler. The output of a disassembler is in human-readable format. Some examples are:


* OllyDbg
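At its core, disassembly is a lookup from byte patterns to mnemonics. The toy sketch below covers only a handful of fixed 32-bit x86 encodings and is nowhere near a real disassembler, which must handle prefixes, operands, and variable-length instructions. The table contents, the function name disassemble, and the base address are assumptions for illustration.

```python
# Deliberately tiny opcode table: a few fixed-length 32-bit x86 instructions.
OPCODES = {
    b"\x55": "push ebp",
    b"\x89\xe5": "mov ebp, esp",
    b"\x5d": "pop ebp",
    b"\x90": "nop",
    b"\xc9": "leave",
    b"\xc3": "ret",
    b"\xcc": "int3",
}
# Try longer byte patterns first so multi-byte encodings win.
PATTERNS = sorted(OPCODES.items(), key=lambda item: len(item[0]), reverse=True)


def disassemble(code, base=0):
    """Translate bytes into (address, mnemonic) pairs using the table above."""
    listing = []
    i = 0
    while i < len(code):
        for pattern, mnemonic in PATTERNS:
            if code.startswith(pattern, i):
                length = len(pattern)
                break
        else:
            # Unknown byte: emit it as raw data, as many tools do.
            mnemonic, length = f"db 0x{code[i]:02x}", 1
        listing.append((base + i, mnemonic))
        i += length
    return listing


listing = disassemble(b"\x55\x89\xe5\x90\x5d\xc3", base=0x401000)
for address, mnemonic in listing:
    print(f"{address:08x}  {mnemonic}")
```

The sample bytes decode to a push of the frame pointer, a frame setup, a nop, a pop, and a return, which is a very small function skeleton.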

Malware is the Swiss Army knife used by cybercriminals and other adversaries against the information systems of corporations and other organizations.

In these evolving times, detecting and removing malware artifacts is not enough: it is vitally important to understand how they work and what they would do (or did) on your systems when deployed, and to understand the context, motivations, and goals of a breach.

Malware analysis is accomplished using specific tools that are categorized as hex editors, disassemblers/debuggers, decompilers, and monitoring tools.

Disassemblers and debuggers occupy an important position in the list of reverse engineering tools. A disassembler converts binary code into assembly code. Disassemblers also extract strings, used libraries, and imported and exported functions. Debuggers expand the functionality of disassemblers by supporting the viewing of the stack, the CPU registers, and the hex dumping of the program as it executes. Debuggers allow breakpoints to be set and the assembly code to be edited at runtime.
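String extraction, one of the disassembler features mentioned above, is simple enough to sketch directly. The example below scans a byte buffer for runs of printable ASCII, similar in spirit to the Unix strings utility; it ignores UTF-16 strings, which real tools also look for. The function name extract_strings and the sample buffer are hypothetical.

```python
import re


def extract_strings(data, min_length=4):
    """Return runs of printable ASCII characters at least min_length long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]


sample = b"\x00\x01kernel32.dll\x00\x02ab\x00CreateFileA\x00\x03LoadLibraryA\xff"
strings_found = extract_strings(sample)
print(strings_found)
```

Library and function names recovered this way often give an early hint about what a binary does before any code is read.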

Zeus is a malware toolkit that allows a cybercriminal to build his own Trojan horse for the sole purpose of stealing financial details.

Once the Zeus Trojan infects a machine, it remains idle until the user visits a Web page with a form to fill out. It allows criminals to add fields to forms at the browser level. This means that instead of being directed to a counterfeit website, the end user sees the legitimate website but might be asked to fill in an additional field with specific information for “security reasons.”

The malware can be customized to gather credentials from banks in specific geographic areas and can be distributed in many different ways, including email attachments and malicious Web links. Once infected, a PC can be recruited to become part of a botnet.

Reverse engineering malware calls for a controlled environment that keeps malicious content from spreading, such as a virtual network completely enclosed within the host machine so that nothing can communicate with the outside world. Tools such as PE (Portable Executable) utilities, disassemblers, and debuggers are also required to reverse malware effectively.
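As an example of what a PE utility does first, the sketch below checks whether a byte buffer looks like a Portable Executable image and reads three fields from its COFF file header. It only reads bytes and never executes anything, which suits a controlled analysis environment. The function name parse_pe_header is hypothetical, and the sample buffer is synthetic rather than taken from a real binary.

```python
import struct


def parse_pe_header(data):
    """Return basic COFF header fields, or None if data is not a PE image."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None
    # e_lfanew: little-endian 32-bit offset of the PE signature, stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        return None
    if len(data) < pe_offset + 24:
        return None
    # The 20-byte COFF file header follows the 4-byte signature.
    machine, sections, timestamp = struct.unpack_from("<HHI", data, pe_offset + 4)
    return {"machine": machine, "sections": sections, "timestamp": timestamp}


# Build a synthetic, minimal header purely for demonstration.
sample = bytearray(0x80 + 24)
sample[0:2] = b"MZ"
struct.pack_into("<I", sample, 0x3C, 0x80)
sample[0x80:0x84] = b"PE\x00\x00"
struct.pack_into("<HHI", sample, 0x84, 0x014C, 3, 0)  # 0x014C = Intel 386
info = parse_pe_header(bytes(sample))
```

A full PE tool would go on to parse the optional header, section table, and import and export directories.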

Zeus Crimeware Toolkit
This is a set of programs designed to set up a botnet over networked infrastructure. It aims to turn machines into agents whose mission is to steal financial records. Zeus can log input entered by the user as well as capture and manipulate data displayed on web forms.

The Zeus crimeware toolkit is made up of five components:

* A control panel, which contains a set of PHP scripts that are used to monitor the botnet, collect the stolen information into a MySQL database, and display it to the botmaster. It also allows the botmaster to monitor, control, and manage bots that are registered within the botnet.

* Configuration files that are used to customize the botnet parameters. There are two files: the configuration file config.txt, which lists the basic information, and the web injects file webinjects.txt, which identifies the targeted websites and defines the content injection rules.

* A generated encrypted configuration file config.bin, which holds an encrypted version of the configuration parameters of the botnet.

* A generated malware binary file bot.exe, which is the bot binary that infects the victims’ machines.

* A builder program that generates two files: the encrypted configuration file config.bin and the malware (actual bot) binary file bot.exe.

On the Command&Control side, the crimeware toolkit provides an easy way to set up the Command&Control server through an installation script that configures the database and the control panel. The database is used to store information about the botnet and any updated reports from the bots. These updates contain the stolen information that the bots gather from the infected machines. The control panel provides a user-friendly interface to display the content of the database as well as to communicate with the rest of the botnet using PHP scripts.

The botnet configuration information is composed of two parts: a static part and a dynamic part. In addition, each Zeus instance keeps a set of targeted URLs that are fed by the web injects file webinjects.txt. Zeus targets these URLs to steal information and to modify the content of specific web pages before they are displayed on the user’s screen. The attacker can define rules that are used to harvest data from a web form. When a victim visits a targeted site, the bot steals the credentials that the victim enters. Afterward, it posts the encrypted information to a drop location that is meant to store the bot update reports. This server decrypts the stolen information and stores it in a database.

Code Analysis
The builder is the component of the crimeware toolkit that takes the configuration files as input and produces the obfuscated configuration and the bot binary file.

The configuration file: the builder converts the clear text of the configuration files to a pre-defined format and encrypts it with the RC4 encryption algorithm using the configured encryption key.
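RC4 itself is a short, publicly documented stream cipher, and an analyst who recovers the key can decrypt with the same routine that encrypts. The sketch below is a textbook RC4 implementation, not code taken from the toolkit; how the builder formats the configuration before encrypting it is not shown here, and the key and plaintext are a widely published RC4 test vector rather than real Zeus data.

```python
def rc4(key, data):
    """Encrypt or decrypt data with RC4 (the operation is symmetric)."""
    # Key-scheduling algorithm (KSA): permute S under control of the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]

    # Pseudo-random generation algorithm (PRGA): XOR the keystream into the data.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


ciphertext = rc4(b"Key", b"Plaintext")
roundtrip = rc4(b"Key", ciphertext)
print(ciphertext.hex())  # bbf316e8d940af0ad3
```

Because encryption and decryption are the same operation, applying rc4 to the ciphertext with the same key returns the original bytes.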

The Zeus configuration file includes entries such as:

* url_loader: Update location of the bot

* url_server: Command and control server location

* AdvancedConfigs: Alternate URL locations for updated configuration files

* Webfilters: Web filters specify a list of URLs (with masks) that should be monitored. Any data sent to these URLs, such as online banking credentials, is then sent to the command and control server. This data is captured on the client before SSL encryption. In addition, one can specify that a screenshot be taken when the left mouse button is clicked, which is useful for recording PINs selected on virtual keyboards.

* WebDataFilters: Web data filters specify a list of URLs (with masks) and also string patterns in the data that must be matched. Any data that is sent to these URLs and matches the specified string patterns, such as ‘password’ or ‘login’, is then sent to the command and control server. This data is also captured on the client before SSL encryption.

* WebFakes: Redirect the specified URL to a different URL, which will host a potentially fake version of the page.

* TANGrabber: The TAN (Transaction Authentication Number) grabber routine is a specialized routine that allows the attacker to configure match patterns to search for transaction numbers in data posted to online banks. The match patterns include values such as the variable name and length of the TAN.

* DNSMap: Entries to be added to the HOSTS file, often used to prevent access to security sites or to redirect users to fake Web sites.

* file_webinjects: The name of the configuration file that specifies HTML to inject into online banking pages, which defeats enhanced security implemented by online banks and is used to gather information not normally requested by the banks. This functionality is discussed more in-depth in the section “Web Page Injection”.
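For a defender, the DNSMap behaviour in the list above suggests a simple check: look for HOSTS entries that map security or banking domains to unexpected addresses. The sketch below parses HOSTS-file text and flags entries for watched hostnames. The watchlist, the sample content, and the function name suspicious_hosts_entries are all hypothetical; a real check would read the system's HOSTS file and use a curated domain list.

```python
WATCHED_DOMAINS = {"update.example-av.test", "www.example-bank.test"}  # hypothetical


def suspicious_hosts_entries(hosts_text, watched=WATCHED_DOMAINS):
    """Return (ip, hostname) pairs from HOSTS text whose hostname is on the watchlist."""
    findings = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue
        ip, *hostnames = line.split()
        for hostname in hostnames:
            if hostname.lower() in watched:
                findings.append((ip, hostname.lower()))
    return findings


sample_hosts = """\
# sample HOSTS content
127.0.0.1   localhost
127.0.0.1   update.example-av.test    # blocks antivirus updates
203.0.113.7 www.example-bank.test
"""
findings = suspicious_hosts_entries(sample_hosts)
```

An entry that points an antivirus update server at the loopback address, or a bank's hostname at an unrelated address, is a common sign of tampering worth investigating.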

The Zeus trojan captures your keystrokes and implements ‘form grabbing’ (taking the contents of a form before submission and uploading them to the attacker) in an effort to steal sensitive information (passwords, credit card numbers, Social Security numbers, etc.). It can infect Windows and several mobile platforms, and a more recent variant based on Zeus’s leaked source code, reportedly distributed through the Blackhole exploit kit, can infect Macs as well.

Zeus is predominantly financially motivated malware; however, once infected, your machine will be recruited into one of the largest botnets ever. The botmaster could then use your computer, along with the other infected machines in that botnet, for any number of nefarious tasks (launching DDoS attacks, sending spam, acting as a relay, etc.).

Originally published at https://learncybersec.blogspot.com.

Cyber Security Analyst & researcher