Dissect Packed Malware 101

Understand how packers compress or encrypt malicious code to conceal its true functionality and bypass detection.



Layers of Evasion and Defense

First of all, what is a packer? A packer is a tool that compresses or encrypts executable files to hide their true code. Packers are used lawfully to protect software from piracy and reverse engineering in order to safeguard intellectual property. In cyber threats, attackers use packers to conceal malware, making it difficult for EDRs and analysts to detect or analyze it. Packed malware unpacks itself only during execution, which evades static detection methods, increases stealth, complicates reverse engineering and allows the malware to bypass security defenses. Packers can be standard or custom-built, and are often combined with other evasion tactics.

image


Packed Malware Anatomy and Unpacking Process

Shown below is an overview of the same sample: on the left in its original form, on the right in its packed form. The executable’s entry point points to a different location: in the packed version it directs to the unpacking stub rather than the original code. Non-packed executables are loaded directly by the operating system. In contrast, for packed programs, the OS loads a small unpacking stub first, which then loads the original program. The stub is usually minimal and its sole purpose is to unpack the original program. Static analysis of a packed file only analyzes the stub, not the original code.

image

The unpacking stub performs three main tasks:

  • unpack (load) the original executable into memory
  • resolve its imports
  • transfer execution to the original entry point

Loading in Memory

The process of loading into memory is basically the same, and differs in a single thing: who (or what) loads the needed code into memory. When non-packed executables are loaded, the loader reads the PE header from disk and allocates memory for each section, then copies the sections into the allocated memory areas. For packed executables, the PE header is also structured to ensure the loader allocates space for the sections, which may either come from the original program or be generated by the unpacking stub. The unpacking stub decompresses/decrypts the code for each section and places it into the designated memory locations. The key difference is that, for packed executables, the loader initially loads a small unpacking stub instead of the original code. This stub is responsible for unpacking the actual executable code into memory, resolving imports, and then transferring control to the original program’s entry point.

Resolve Imports

The Windows loader cannot interpret the original import information when the executable is packed, so the unpacking stub is responsible for resolving imports. Typically, the stub imports only the essential functions LoadLibrary and GetProcAddress. After unpacking the original program, it reads the original import table, calls LoadLibrary to load each required DLL, and then uses GetProcAddress to locate each imported function.
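The loop the stub runs can be sketched in a few lines of Python. This is only a toy model under stated assumptions: `rebuild_iat`, `FAKE_EXPORTS` and the two `fake_*` helpers are hypothetical names that stand in for the real LoadLibrary and GetProcAddress APIs, and the addresses are made up.

```python
# Toy model of import resolution: no real Windows API is called here.
# FAKE_EXPORTS plays the role of the DLLs' export tables (addresses are invented).
FAKE_EXPORTS = {
    "kernel32.dll": {"CreateFileA": 0x70001000, "WriteFile": 0x70002000},
}


def fake_load_library(dll_name):
    """Stand-in for LoadLibrary: returns a 'module handle' (here, its export dict)."""
    return FAKE_EXPORTS[dll_name.lower()]


def fake_get_proc_address(module, function_name):
    """Stand-in for GetProcAddress: looks the function up in the module's exports."""
    return module[function_name]


def rebuild_iat(import_table, load_library, get_proc_address):
    """Walk the original import table and resolve every function, as the stub does."""
    iat = {}
    for dll_name, function_names in import_table.items():
        module = load_library(dll_name)  # one LoadLibrary call per required DLL
        for function_name in function_names:
            # one GetProcAddress call per imported function
            iat[(dll_name, function_name)] = get_proc_address(module, function_name)
    return iat


original_imports = {"kernel32.dll": ["CreateFileA", "WriteFile"]}
resolved = rebuild_iat(original_imports, fake_load_library, fake_get_proc_address)
```

The point of the sketch is the shape of the work: one library load per DLL, one lookup per function, which is why an unpacking stub can get away with importing only those two functions.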

Restore the OEP

Another crucial task performed during unpacking is restoring the Original Entry Point (OEP). To do so, the stub must transfer execution to the OEP, and the most common instruction used to transfer execution to another portion of code is a jump. This instruction is the so-called “tail jump”, and it is the most important thing an analyst must recognize and intercept in order to unpack packed malware.

This pseudocode highlights how the packer hides the malicious code until it is unpacked in memory during execution. The crucial steps include the stub that loads the original code into memory, resolves its imports, and finally transfers control to the Original Entry Point (OEP) through a “tail jump”.

// Phase 1: Operating System Loading of the Packed Executable
FUNCTION LoadAndExecuteProgram(ExecutableFilePath)
    IF ExecutableFile IS packed THEN
        // The executable's entry point directs to the unpacking stub rather than the original code
        LoadUnpackingStubIntoMemory()
        ExecuteUnpackingStub()
    ELSE
        // Non-packed executables are loaded directly by the operating system
        LoadOriginalExecutableIntoMemory()
        TransferExecutionToOriginalEntryPoint()
    END IF
END FUNCTION

// Phase 2: Execution of the Unpacking Stub
FUNCTION ExecuteUnpackingStub()
    // The unpacking stub performs:
    // 1. Unpack (load) the original executable into memory
    // 2. Resolve its imports
    // 3. Transfer execution to the original entry point

    // Task 1: Unpack (load) the original executable into memory
    1. ReadPEHeader() // The PE header is structured to ensure the loader allocates space for sections
    2. FOR EACH Section IN OriginalExecutable
        DecompressOrDecryptCode(Section) // The unpacking stub decompresses/decrypts the code for each section
        PlaceCodeIntoAllocatedMemory(Section) // and places it into the designated memory locations 
    END FOR

    // Task 2: Resolve Imports
    // The Windows loader cannot interpret import information when the executable file is packed
    // The unpacking stub is responsible for resolving imports
    3. ImportEssentialFunctions(LoadLibrary, GetProcAddress) // Typically, the stub imports only these essential functions
    4. ReadOriginalImportTable() // After unpacking the original program, it reads the original import table
    5. FOR EACH RequiredDLL IN OriginalImportTable
        Call(LoadLibrary, DLLName) // To load each required DLL
        FOR EACH ImportedFunction IN DLL
            Call(GetProcAddress, DLLAddress, FunctionName) // To locate each imported function
        END FOR
    END FOR

    // Task 3: Transfer execution to the Original Entry Point (OEP)
    // This is an important and crucial task performed during sample unpacking 
    6. IdentifyOriginalEntryPoint(OEP) // Identify the Original Entry Point
    7. PerformJump(OEP) // This is the instruction so-called "tail jump"
                            // The one that we will look for
                            // in order to unpack a packed malware

END FUNCTION

// Phase 3: Execution of the Original Program
// At this point, control has been transferred to the unpacked original program.
// The original program can now be analyzed 

How to Identify Packed Samples

When malware is packed, analysts usually only have access to the packed file and cannot directly examine the original unpacked program or the tool that packed it. To unpack the executable, analysts must reverse the packing process, which requires a thorough understanding of how the specific packer functions. This knowledge is essential to effectively restore the malware to its original form for analysis.

image

Packed Software Indicators

If some of the following indicators show up when looking at a suspicious sample, it may be a packed one:

  • Binary sections with strange names;
  • A huge gap between the raw size and the virtual size of the binary sections;
  • Debuggers and tools such as PEiD flag the sample as packed;
  • Few readable strings, sometimes none;
  • Few imports, sometimes only LoadLibrary and GetProcAddress.
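Some of these indicators can be checked mechanically. Below is a minimal Python sketch, not a real detector: the `Section` type, the `COMMON_NAMES` set and the thresholds (4x size gap, entropy above 7.0) are assumptions chosen for illustration, and in practice the section data would come from a PE parser.

```python
import math
from collections import Counter
from dataclasses import dataclass

# Section names commonly emitted by mainstream compilers (illustrative, not exhaustive).
COMMON_NAMES = {".text", ".data", ".rdata", ".rsrc", ".reloc", ".idata", ".bss"}


@dataclass
class Section:
    name: str
    raw_size: int      # bytes stored on disk
    virtual_size: int  # bytes occupied once mapped in memory
    data: bytes        # raw section contents


def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte: 0.0 for constant data, 8.0 for uniformly spread bytes."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def section_hints(section: Section) -> list:
    """Return the packing indicators (from the list above) that this section triggers."""
    hints = []
    if section.name not in COMMON_NAMES:
        hints.append("unusual section name")
    # Assumed threshold: virtual size more than 4x the raw size.
    if section.virtual_size > 4 * max(section.raw_size, 1):
        hints.append("virtual size much larger than raw size")
    # Assumed threshold: compressed or encrypted data tends to sit close to 8 bits/byte.
    if shannon_entropy(section.data) > 7.0:
        hints.append("high entropy")
    return hints


# A UPX-like empty section that only exists in memory triggers two hints.
print(section_hints(Section("UPX0", 0, 0x8000, b"")))
```

None of these hints is conclusive on its own; they only tell us where to look first.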

Automated Unpacking

Unpacking packed software can be done with dedicated unpackers. Many are open source and community maintained. de4dot, for example, is a tool used for this purpose, deobfuscating/unpacking .NET samples. Unpac.me is an online service that can be used to automate the unpacking process. Simply give the packed sample to the unpacker and voilà, the unpacking process is done. As simple as it is boring, so no juicy example here. But tools and open source unpackers don’t always work; skip to the next section to understand why.


Manual Unpacking

image

Sometimes (more often than we think), unpacking malware is not as simple and automatic as we hope, so it becomes necessary to unpack it manually. Malware creators have a wide range of packers to choose from. Some packers are legitimate commercial tools designed for regular software developers, while others are specifically developed to aid malicious software. Some malware is protected by custom packers; for this reason it’s important to understand how a packer works in order to reverse what it did to the malware and extract the evil inside of it.

Common procedure:

  1. find the tail jump or juicy instructions and reach the “real” OEP;
  2. dump the evil within;
  3. repair the IAT;
  4. restore the OEP.

Let’s look at different ways to manually unpack packed malware, starting from something simple and moving to something more complex.


Specimen n1

🦠🔍 Last jump before a lot of zer0s

image

Let’s start by manually unpacking malware that is packed with UPX, which is free, common and relatively simple to unpack. UPX includes its own unpacking functionality, so if malware is compressed with a standard version of UPX, we can use the UPX utility to extract the original file. However, malware developers sometimes modify UPX or alter the header of the packed executable to hinder unpacking. Unlike UPX, most other packers lack integrated unpacking features, making it crucial to know how to handle various types of packed software.

image

By loading the sample into SpeakEasy we can see that a (modified) UPX packer is identified. If we tried to use UPX to unpack it, we would fail. Manual unpacking is therefore necessary.

The UPX stub is known for handing control to the unpacked code through a jump instruction placed right before a run of null bytes:

image
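The “jump followed by zeros” pattern can also be searched for programmatically. The sketch below is a naive byte scan written for illustration: `find_tail_jump_candidates` is a hypothetical helper, it only handles the 5-byte `jmp rel32` encoding (opcode 0xE9), and since it does not disassemble anything it can report false positives when 0xE9 is just part of another instruction.

```python
import struct


def find_tail_jump_candidates(code: bytes, base: int, min_zeros: int = 16):
    """Return (jmp_address, target_address) pairs for `jmp rel32` followed by null bytes.

    `base` is the virtual address at which `code` is mapped.
    """
    hits = []
    padding = bytes(min_zeros)
    for i in range(len(code) - 4):
        if code[i] != 0xE9:  # opcode of jmp rel32
            continue
        # The bytes right after the 5-byte instruction must all be zero.
        if code[i + 5 : i + 5 + min_zeros] != padding:
            continue
        (rel32,) = struct.unpack_from("<i", code, i + 1)  # signed little-endian offset
        # The offset is relative to the address of the next instruction.
        target = (base + i + 5 + rel32) & 0xFFFFFFFF
        hits.append((base + i, target))
    return hits


# Hypothetical stub tail: four NOPs, a backward jump, then zero padding.
stub_tail = b"\x90\x90\x90\x90" + b"\xE9\x00\xF0\xFF\xFF" + bytes(32)
for jmp_address, target in find_tail_jump_candidates(stub_tail, base=0x409000):
    print(hex(jmp_address), "->", hex(target))
```

Each candidate still has to be confirmed in the debugger by placing a breakpoint on the jump and checking what the target region contains once it is reached.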

Viewing the contents of the memory segment before the execution of the jmp instruction (our tail jump), we notice that it is not populated.

image

Upon hitting the breakpoint we set, we notice that the memory segment to which our tail jump points is now full of instructions: probably (it is) our binary, now correctly loaded (unpacked) into memory.

image

We indeed observe a function prologue (highlighted), which saves the caller’s frame pointer and sets up a new stack frame.

image

Inspecting the function calls present in the memory region we are in, we find several pieces of information that lead us to think we are in the now-unpacked portion of the binary code. Otherwise we would not have so many calls, especially not such clearly readable ones.

image

Now that we have the unpacked sample loaded into memory, we need to dump it and restore the IAT and OEP. How? With the OllyDumpEx plugin we can dump the binary and restore the OEP by matching it to the instruction pointer.

image

Then, with another tool named Scylla, we can restore the IAT and fix the dumped process.

image

Once the 3 steps are done, unpacking the binary (1) and restoring the OEP (2) and the IAT (3), we can start to analyze the sample as we want. Comparing the two samples, the packed version and the unpacked one, we can notice visible differences, with much clearer information in the unpacked version.

image

The morphology of the binary sections is an indicator (in this case) of a packed sample: in addition to the fact that we have a section called UPX2, we can see the large gap between the raw size and the virtual size of the sections, also an indicator of a packed sample. In the version of the sample following the unpacking process we can see how this discrepancy is diminished.

image

As indicated at the beginning, the unpacking of this sample was not complicated. We continue by experimenting with a sample packed by another type of packer.


Specimen n2

🦠🔍 Let’s jump out of section boundaries

image

This could be classified as “easy,” but since you need a particular tool to make it “super easy” (which I do not have), the process explained below will instead follow a “less easy” path.

image

As we can see, there is a huge discrepancy between the raw size and the virtual size of the sections; the second section also has a high entropy value, another indicator to watch out for.

What are section boundaries?

Section boundaries refer to the in-memory start and end addresses of a specific section within a Portable Executable (PE) file, such as .text, .data or .rsrc. Each section of a PE file contains code, data or resources and has a defined location and size when the executable is loaded into memory. This unpacking process can be completed automatically and quickly by using an OllyDump plugin, or by using Trace over/into with a break condition that fires when the instruction pointer lands outside the boundaries of the section that EIP is currently in.

Let’s break down the concept:

Executables are divided into sections (e.g., .text, .data), each with defined memory boundaries. These boundaries are calculated as:

  • Start = Section Virtual Address (VA) + Image Base
  • End = Start + Section Virtual Size

image

If you google how to find the OEP by section hop, you may encounter this forum thread (https://forum.exetools.com/showthread.php?t=18603) that talks about this particular thing. Summarizing what is discussed there, we can assume that out0 and out1 are dynamically determined based on the section where the instruction pointer currently resides. Here’s how to interpret them:

  • out0: The start address of the current PE section in memory.
  • out1: The end address of the current PE section in memory.

These values are derived from the Portable Executable (PE) structure of the analyzed binary. For example, if the current section is .text, out0 and out1 define its memory boundaries.

image

Based on the image above, let’s work through a calculation:

  1. Section Start (out0):
    out0 = Image Base + Section Virtual Address (VA)
    • The Image Base is the address where the executable is loaded in memory (e.g., 0x0000).
    • The Section VA is the relative address of the section within the PE file (e.g., .SECTION Y might have a VA of 0x0201 ).
  2. Section End (out1):
    out1 = out0 + Section Virtual Size
    • The Virtual Size specifies how much memory the section occupies after being loaded (e.g., 0x0099 bytes).

So, if .SECTION A has:

  • Image Base: 0x0000
  • Section VA: 0x0201
  • Virtual Size: 0x0099

Then:

  • out0 = 0x0000 + 0x0201 = 0x0201
  • out1 = 0x0201 + 0x0099 = 0x029A
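The same calculation as a tiny Python helper, handy for double-checking the hexadecimal arithmetic. `section_bounds` is a hypothetical name, and the inputs are the demonstration values from the list above.

```python
def section_bounds(image_base: int, section_va: int, virtual_size: int):
    """Return (out0, out1): the in-memory start and end address of a section."""
    out0 = image_base + section_va  # Section Start
    out1 = out0 + virtual_size      # Section End (first address past the section)
    return out0, out1


out0, out1 = section_bounds(image_base=0x0000, section_va=0x0201, virtual_size=0x0099)
print(f"out0 = {out0:#06x}, out1 = {out1:#06x}")
```

Note that out1 is an exclusive bound: the last byte that still belongs to the section sits at out1 - 1.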

image

The example and explanation were only for demonstration purposes and are not related to our sample, but this is how section boundaries are computed. Looking at our sample in the debugger, we then notice that EIP points to the address 0x405000.

image

By switching under the memory map we can see the section boundaries of the memory section where our EIP resides (which is currently also the OEP)

image

Based on our sample, we can say:

  • Image Base: 0x405000
  • Section VA: 0x000000
  • Virtual Size: 0x001000

Then:

  • out0 = 0x405000 + 0x000000 = 0x405000
  • out1 = 0x405000 + 0x001000 = 0x406000

In unpacking scenarios, malware decrypts its payload into a different memory section. By tracing until EIP leaves the current section (e.g., a packed .text section), we can identify the Original Entry Point (OEP) of the unpacked code. Based on this, we can assume that when EIP satisfies the condition eip < out0 || eip >= out1, we may have encountered our tail jump.

Without the OllyDump function that automatically finds section hops, we can set a trace over with this particular condition and wait until EIP reaches another section outside our boundaries.
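What the trace condition does can be modelled offline in Python. This is only a simulation under stated assumptions: `left_section` and `first_section_hop` are hypothetical helpers, and the list of EIP values (including the 0x401000 landing address) is invented to show the logic, not taken from the sample.

```python
def left_section(eip: int, out0: int, out1: int) -> bool:
    """The break condition: true once EIP is outside [out0, out1)."""
    return eip < out0 or eip >= out1


def first_section_hop(trace, out0: int, out1: int):
    """Return (step, eip) for the first traced instruction outside the section, else None."""
    for step, eip in enumerate(trace):
        if left_section(eip, out0, out1):
            return step, eip
    return None


# Boundaries of the section the stub runs in, plus an invented sequence of EIP values.
OUT0, OUT1 = 0x405000, 0x406000
traced_eips = [0x405000, 0x405005, 0x405FFF, 0x401000, 0x401003]

hop = first_section_hop(traced_eips, OUT0, OUT1)
if hop is not None:
    print(f"section hop at step {hop[0]}: EIP = {hop[1]:#x}")
```

The debugger evaluates the same boolean on every traced instruction, which is why this kind of trace is slow but needs no prior knowledge of the packer.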

image

In this way we reached the (unpacked) binary loaded in memory.

image

Let’s take a look at the Memory Map and see where we are now. When the binary was first loaded into the debugger, EIP was between the section boundaries 0x405000-0x406000.

image

The tail jump pointed to this memory region, which was not populated before the code execution (see below). We can then add another unpacking technique to our deck: identify the “jump to another memory section.”

image

Comparing the two samples we can notice several differences, including the imported functions. In the packed sample (on the right, circled in red) we can notice the presence of functions such as LoadLibraryA and GetProcAddress, known to be used by the unpacking stub to unpack the actual code.

image

Once the unpacked code is dumped and the IAT and OEP are restored, we can see other functions within the imports, helping us to understand the capabilities of the sample under analysis.

With the correct tool, jumps to a memory section other than the one where the OEP (of the packer) is located can be identified with speed and simplicity. The tool is not always available, so it is necessary to intercept these “tail jumps” with trace conditions and the right breakpoint. Of course, the unpacking stub will not necessarily hand over to the unpacked code on the first jump to another section of memory “over the OEP boundaries,” but it is likely.


Specimen n3

🦠🔍 Follow Juicy-Assembly-Instruction down the white rabbit unpacking hole

image

In this case we do not have zeros after a jump, nor do we have jumps to sections of memory outside the limits of the section in which the OEP is located.

image

The sample looks like this, so it is not easy to identify the real OEP. What we can use to our advantage, however, are the pushfd and pushad instructions.

What do the pushfd and pushad instructions do?

  • pushad pushes all 8 general-purpose 32-bit registers onto the stack in a specific order: EAX, ECX, EDX, EBX, ESP, EBP, ESI, and EDI. This effectively saves the entire CPU register state at that moment;
  • pushfd pushes the EFLAGS register (which contains the CPU flags) onto the stack, preserving the processor’s flag state.

So, what will our next move be? Set a breakpoint somewhere? But where exactly? Let’s try to crack the unpacking process and figure out where to set our next breakpoint.

Most packers use pushad near the entry point of the unpacking stub to save the CPU state before starting the unpacking/decryption routine. This is because the unpacking process modifies registers heavily, but the original register values are needed later to restore the program’s state before jumping to the original, unpacked code (our real OEP). So, the unpacking process that this kind of packed malware follows is:

  1. pushad saves all registers on the stack and pushfd saves all the flags.
  2. The unpacking routine runs, modifying registers and memory.
  3. The flags and registers are restored later with the corresponding popfd and popad, returning the CPU to the state it had before unpacking.
  4. The program then jumps to the OEP.

It’s similar to what functions do with their prologue and epilogue, where the stack frame state is saved on entry and then restored before returning to the function caller.

It’s time to open our friendly debugger. What we can do is identify the memory region written by the two instructions we just explained, and then set a breakpoint on it so that we can observe when execution is potentially close to our OEP. The register we need to consider in this case is ESP (I’m not going to explain why, google it!): by following it in memory we can identify the region to which the stack points, so if we understood what was described earlier, this is where we are interested in observing potential state/value changes. More than changes, however, we care about accesses to this memory region: we know that these two instructions save the registers and flags on the stack, and that the saved values will be read back shortly after the unpacking process is completed.
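To see why watching that stack slot works, here is a toy Python model of the stack. Everything in it is an assumption made for illustration: the `Stack` class, the initial ESP value, and the single push/pop pair standing in for the whole unpacking routine.

```python
class Stack:
    """Toy 32-bit stack that records which instruction touches one watched address."""

    def __init__(self, esp: int):
        self.esp = esp
        self.slots = {}
        self.watch = None  # address covered by our memory-access breakpoint
        self.hits = []     # instructions that touched the watched address

    def _touch(self, address: int, who: str):
        if address == self.watch:
            self.hits.append(who)

    def push(self, value: int, who: str):
        self.esp -= 4
        self.slots[self.esp] = value
        self._touch(self.esp, who)

    def pop(self, who: str) -> int:
        self._touch(self.esp, who)
        value = self.slots.pop(self.esp)
        self.esp += 4
        return value


stack = Stack(esp=0x0019FF74)            # hypothetical ESP at the stub's entry point
for reg in ("eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"):
    stack.push(0, "pushad")              # pushad: eight dwords, 32 bytes
stack.push(0x202, "pushfd")              # pushfd: one more dword holding EFLAGS
stack.watch = stack.esp                  # breakpoint on the slot ESP points to now

stack.push(0xDEADBEEF, "unpack loop")    # the routine's own pushes land below the slot
stack.pop("unpack loop")
stack.pop("popfd")                       # restoring the saved state reads the slot

print(stack.hits)
```

In this model the routine’s own balanced pushes and pops never reach the watched slot, so the first access comes from the instruction that restores the saved state, right before the jump to the OEP. A real stub may touch the stack in less tidy ways, so expect some extra hits.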

image

Now we are right where we want to be. We have to set a breakpoint at the address in the memory region where we are now, but not just a simple breakpoint: a hardware breakpoint on memory access. What does this mean? Unlike the normal (software) breakpoint used when debugging malware, where an “INT 3” instruction is simply written into the code, the hardware breakpoint turns out to be much safer and more accurate because it uses dedicated debug registers and does not change the code “on the fly”. A software breakpoint, by contrast, loses its effectiveness the moment the memory region involved undergoes considerable changes.

image

The dedicated register takes the desired value. (Sound familiar? If not, go read how these registers are abused by malware to perform defense evasion and, more importantly, how they can be exploited to bypass AMSI.)
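For the curious, on x86 the address goes into one of DR0-DR3 while the control register DR7 describes how that address is watched. The Python sketch below computes the DR7 bits for a single breakpoint; `dr7_for_breakpoint` is a hypothetical helper and only covers the local-enable, R/W and LEN fields for the 32-bit case (1, 2 or 4 bytes).

```python
# R/W field: which kind of access triggers the breakpoint.
RW_BITS = {"execute": 0b00, "write": 0b01, "read_write": 0b11}
# LEN field: size of the watched area in bytes.
LEN_BITS = {1: 0b00, 2: 0b01, 4: 0b11}


def dr7_for_breakpoint(slot: int, access: str, length: int) -> int:
    """Return the DR7 bits enabling one hardware breakpoint in DR0..DR3 (slot 0..3)."""
    if slot not in range(4):
        raise ValueError("x86 offers four breakpoint slots: DR0..DR3")
    if access == "execute" and length != 1:
        raise ValueError("execution breakpoints must use a length of 1")
    local_enable = 1 << (slot * 2)                 # L0..L3 live in bits 0, 2, 4, 6
    rw = RW_BITS[access] << (16 + slot * 4)        # R/W0 starts at bit 16
    length_bits = LEN_BITS[length] << (18 + slot * 4)  # LEN0 starts at bit 18
    return local_enable | rw | length_bits


# Memory-access (read or write) breakpoint on a 4-byte stack slot, using DR0.
print(hex(dr7_for_breakpoint(slot=0, access="read_write", length=4)))
```

The debugger does this bookkeeping for us, but knowing the layout helps when reading the debug registers by hand, or when checking whether a sample is inspecting them to detect us.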

Following the malware run, we hit the HWBPX as the affected memory region is “touched”:

image

Returning to the function caller we have our malware unpacked and ready to be dumped, with the IAT and OEP still to be restored.

image

We note the obvious difference between the sections of the two samples (on the right, the one following manual unpacking).

image

The imports are also much more obvious and clear.

image


Specimen n4

🦠🔍 Eat, Sleep, Breakpoint Hit, Repeat

image

For this unpacking method we will simply let the malware do what it wants to do and observe its behavior to identify the exact moment when it is unpacking the malicious code in memory.

Identifying the exact moment is not simple at all. What we can do is rely on the unpacking process itself, closely observing API calls such as LoadLibrary and all its variations.

image

We then put a breakpoint (a software one is fine too) on all calls to LoadLibrary and its variations, and run the malware.

image

On the first hit we may have a situation similar to the one below:

image

We then set a HW BPX (yes, again on memory access) at the address held in ESP. This operation, as with the previous sample, is needed since we are interested in verifying exactly when the selected memory region is touched.

We then note that DR0 takes the value of the affected offset.

image

Hint: it corresponds to the popfd instruction, which restores the saved state as seen for the previous sample.

image

Let’s run the malware and let it execute without changing its behavior. We notice that our breakpoint is hit several times: each time, the addresses of the loaded library functions are obtained via LoadLibrary and GetProcAddress. The malware is rebuilding the IAT.

HW BPX hit #2 image

HW BPX hit #30 image

HW BPX hit #6843 image

HW BPX hit #8712348 image

Our sample loads 3 more DLLs into memory, which is why the breakpoint on the call to LoadLibrary (which you can delete) is still hit. After a series of hits (precisely 105), we see the light: our unpacked code.

image

image

Indeed, judging by the parameters commonly used to tell packed malware from unpacked malware, the two versions (packed and unpacked) are very obviously different:

image

One major challenge is pinpointing the exact moment when the original malicious code is fully unpacked in memory. Unpacking malware using dynamic monitoring techniques, such as setting memory access breakpoints on ESP after a LoadLibrary call, makes it possible to bypass packing and obfuscation layers that are specifically designed to evade simpler automated tools. It can be a painful and labor-intensive process: this dynamic approach is slow and prone to errors, requiring multiple breakpoints, careful tracking of memory allocations, and significant manual effort to follow the unpacking routine through its stages. Despite its complexity and the time involved, this hands-on process is often needed for thorough analysis.


THE END

till now

The process of malware unpacking is an essential preliminary step in malware analysis that enables security researchers to reveal the underlying malicious payload and plays a crucial role in reversing the obfuscation techniques employed by malware authors to evade detection. Malware often arrives packed or encrypted, hiding its true code and behavior from static analysis tools; this is why it is so important to know how to unpack it and reveal its real nature.

image

Combining manual expertise with automated techniques enhances precision and efficiency, helping to combat increasingly sophisticated packing and evasion methods used by malware authors.


RAM boost

Add some RAM banks to your workstation by listening to this Spotify playlist while reading / unpacking!


If you like this post, please follow me on Xitter and remember, 5haring is caring!
