Table of contents
- Layers of Evasion and Defense
- Packed Malware Anatomy and Unpacking Process
- How to Identify Packed Samples
- Automated Unpacking
- Manual Unpacking
- 🦠🔍 Specimen #1 - Last jump before a lot of zer0s
- 🦠🔍 Specimen #2 - Let’s jump out of section boundaries
- 🦠🔍 Specimen #3 - Follow Juicy-Assembly-Instruction down the white rabbit unpacking hole
- 🦠🔍 Specimen #4 - Eat, Sleep, Breakpoint Hit, Repeat
- Conclusions
- RAM boost for unpacking operations
Layers of Evasion and Defense
First of all, what is a packer? A packer is a tool that compresses and encrypts executable files to hide their true code. Legitimately, packers are used to protect software from piracy and reverse engineering in order to safeguard intellectual property. On the threat side, attackers use packers to conceal malware, making it difficult for EDRs and analysts to detect or analyze it. Packed malware unpacks itself only during execution, which evades static detection methods, increases stealth, complicates reverse engineering and allows the malware to bypass security defenses. Packers can be standard or custom-built, and are often combined with other evasion tactics.
Packed Malware Anatomy and Unpacking Process
Shown below is an overview of the same sample: on the left in its original form, on the right in its packed form. The executable’s entry point points to a different location: in the packed version it directs to the unpacking stub rather than the original code. Non-packed executables are loaded directly by the operating system. In contrast, for packed programs, the OS loads a small unpacking stub first, which then loads the original program. The stub is usually minimal and its sole purpose is to unpack the original program. Static analysis of a packed file only sees the stub, not the original code.
The unpacking stub performs three main tasks:
- unpack (load) the original executable into memory
- resolve its imports
- transfer execution to the original entry point
Loading in Memory
The process of loading into memory is basically the same in both cases and differs in a single thing: who (or what) loads the needed code into memory. When a non-packed executable is loaded, the loader reads the PE header from disk and allocates memory for each section, then it copies the sections into the allocated memory areas. For packed executables, the PE header is also structured so that the loader allocates space for the sections, which may either come from the original program or be generated by the unpacking stub. The unpacking stub decompresses/decrypts the code for each section and places it into the designated memory locations. The key difference is that, for packed executables, the loader initially loads a small unpacking stub instead of the original code. This stub is responsible for unpacking the actual executable code into memory, resolving imports, and then transferring control to the original program’s entry point.
Resolve Imports
The Windows loader cannot interpret the original import information while the executable is packed, so the unpacking stub is responsible for resolving imports.
Typically, the stub imports only the essential functions LoadLibrary and GetProcAddress. After unpacking the original program, it reads the original import table, calls LoadLibrary to load each required DLL, and then uses GetProcAddress to locate each imported function.
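The resolution loop described above can be sketched as a toy model in Python. Nothing here touches the real Windows API: `FAKE_SYSTEM`, the addresses in it, and the helper names `load_library` / `get_proc_address` are hypothetical stand-ins used only to show the shape of the loop.

```python
# Toy model of the stub's import resolution loop.
# FAKE_SYSTEM stands in for the DLLs available on the machine;
# the DLL contents and addresses are invented for illustration.
FAKE_SYSTEM = {
    "kernel32.dll": {"CreateFileA": 0x7C801A28, "WriteFile": 0x7C810E27},
    "ws2_32.dll": {"connect": 0x71AB4A07},
}


def load_library(name):
    """Stand-in for LoadLibrary: returns a 'module handle' (here, its export dict)."""
    return FAKE_SYSTEM[name.lower()]


def get_proc_address(module, func):
    """Stand-in for GetProcAddress: looks up one export in a loaded module."""
    return module[func]


def resolve_imports(import_table):
    """Walk the original import table and build the IAT: (dll, function) -> address."""
    iat = {}
    for dll, functions in import_table.items():
        module = load_library(dll)
        for func in functions:
            iat[(dll, func)] = get_proc_address(module, func)
    return iat


# Hypothetical import table recovered from the unpacked program.
original_imports = {
    "kernel32.dll": ["CreateFileA", "WriteFile"],
    "ws2_32.dll": ["connect"],
}
iat = resolve_imports(original_imports)
for (dll, func), address in iat.items():
    print(f"{dll}!{func} -> {address:#010x}")
```

The point of the sketch is the asymmetry: the packed file only needs two imports of its own, yet at run time it ends up with the full table of the original program.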
Restore the OEP
Another crucial task performed during sample unpacking is restoring the Original Entry Point (OEP). To do so, the stub must transfer execution to the OEP, and the usual instruction for transferring execution to another portion of code is jump (jmp). This instruction is the so-called “tail jump” and it is the most important thing that an analyst must recognize and intercept in order to unpack packed malware.
This pseudocode highlights how the packer hides the malicious code until it is unpacked in memory during execution. The crucial steps include the stub that loads the original code into memory, resolves its imports, and finally transfers control to the Original Entry Point (OEP) through a “tail jump”.
// Phase 1: Operating System Loading of the Packed Executable
FUNCTION LoadAndExecuteProgram(ExecutableFilePath)
    IF ExecutableFile IS packed THEN
        // The executable's entry point directs to the unpacking stub rather than the original code
        LoadUnpackingStubIntoMemory()
        ExecuteUnpackingStub()
    ELSE
        // Non-packed executables are loaded directly by the operating system
        LoadOriginalExecutableIntoMemory()
        TransferExecutionToOriginalEntryPoint()
    END IF
END FUNCTION

// Phase 2: Execution of the Unpacking Stub
FUNCTION ExecuteUnpackingStub()
    // The unpacking stub performs:
    // 1. Unpack (load) the original executable into memory
    // 2. Resolve its imports
    // 3. Transfer execution to the original entry point

    // Task 1: Unpack (load) the original executable into memory
    1. ReadPEHeader() // The PE header is structured to ensure the loader allocates space for sections
    2. FOR EACH Section IN OriginalExecutable
           DecompressOrDecryptCode(Section) // The unpacking stub decompresses/decrypts the code for each section
           PlaceCodeIntoAllocatedMemory(Section) // and places it into the designated memory locations
       END FOR

    // Task 2: Resolve Imports
    // The Windows loader cannot interpret import information when the executable file is packed
    // The unpacking stub is responsible for resolving imports
    3. ImportEssentialFunctions(LoadLibrary, GetProcAddress) // Typically, the stub imports only these essential functions
    4. ReadOriginalImportTable() // After unpacking the original program, it reads the original import table
    5. FOR EACH RequiredDLL IN OriginalImportTable
           Call(LoadLibrary, DLLName) // To load each required DLL
           FOR EACH ImportedFunction IN DLL
               Call(GetProcAddress, DLLAddress, FunctionName) // To locate each imported function
           END FOR
       END FOR

    // Task 3: Transfer execution to the Original Entry Point (OEP)
    // This is an important and crucial task performed during sample unpacking
    6. IdentifyOriginalEntryPoint(OEP) // Identify the Original Entry Point
    7. PerformJump(OEP) // This is the instruction so-called "tail jump"
                        // The one that we will look for
                        // in order to unpack a packed malware
END FUNCTION

// Phase 3: Execution of the Original Program
// At this point, control has been transferred to the unpacked original program.
// The original program can now be analyzed
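To make the effect on static analysis concrete, here is a minimal toy in Python. It is only a sketch: the single-byte XOR "packer", the key, and the payload bytes are all made up for illustration, and real packers use compression and far stronger schemes. It shows that a string visible in the original bytes is invisible in the packed blob and reappears only after the stub has run.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """Toy 'packing' primitive: XOR every byte with a one-byte key."""
    return bytes(b ^ key for b in data)


# Hypothetical stand-in for the original program's bytes.
ORIGINAL = b"connect to http://c2.example/gate; call CreateRemoteThread"
KEY = 0x5A  # illustrative key, not taken from any real packer

# What sits on disk: only the transformed payload (plus the stub, not modelled here).
packed_payload = xor_bytes(ORIGINAL, KEY)


def unpacking_stub(blob: bytes, key: int) -> bytes:
    """Reverses the packing in memory at run time."""
    return xor_bytes(blob, key)


marker = b"CreateRemoteThread"
seen_statically = marker in packed_payload
seen_after_unpack = marker in unpacking_stub(packed_payload, KEY)

print("visible in packed file:", seen_statically)
print("visible after the stub runs:", seen_after_unpack)
```

A strings-based scan of the packed file would therefore miss the marker entirely, which is exactly why the tail jump matters: it marks the moment the readable version exists in memory.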
How to Identify Packed Samples
When malware is packed, analysts usually only have access to the packed file and cannot directly examine the original unpacked program or the tool that packed it. To unpack the executable, analysts must reverse the packing process, which requires a thorough understanding of how the specific packer functions. This knowledge is essential to effectively restore the malware to its original form for analysis.
Packed Software Indicators
If some of the following show up while looking at a suspicious sample, it may be a packed one:
- Sections with strange names;
- A huge gap between the raw size and the virtual size of the sections;
- Debuggers and tools such as PEiD flag the sample as packed;
- Few readable strings, sometimes none;
- Few imports, sometimes only LoadLibrary and GetProcAddress.
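Some of these indicators lend themselves to a quick programmatic check. The sketch below is a rough heuristic, not a detector: the list of "standard" section names, the 4x size ratio and the 7.0 entropy threshold are assumptions chosen for illustration, and the sections are fed in by hand as dictionaries rather than parsed from a real PE file. Entropy is not in the list above, but it is commonly used alongside it.

```python
import math
from collections import Counter

# Assumed values for illustration only.
STANDARD_NAMES = {".text", ".data", ".rdata", ".rsrc", ".reloc", ".idata", ".bss", ".tls"}
LOADER_APIS = {"LoadLibraryA", "LoadLibraryW", "GetProcAddress"}


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 for empty input, at most 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in Counter(data).values())


def packed_indicators(sections, imports):
    """Return a list of human-readable findings that hint at packing."""
    findings = []
    for sec in sections:
        name, raw, virt = sec["name"], sec["raw_size"], sec["virtual_size"]
        if name not in STANDARD_NAMES:
            findings.append(f"{name}: unusual section name")
        if virt > 4 * max(raw, 1):
            findings.append(f"{name}: virtual size much larger than raw size")
        if shannon_entropy(sec["data"]) > 7.0:
            findings.append(f"{name}: high entropy")
    if imports and set(imports) <= LOADER_APIS:
        findings.append("imports limited to LoadLibrary/GetProcAddress")
    return findings


# Hand-written example shaped like a UPX-style layout (values are invented).
sample_sections = [
    {"name": "UPX0", "raw_size": 0, "virtual_size": 0x8000, "data": b""},
    {"name": "UPX1", "raw_size": 0x1000, "virtual_size": 0x1000,
     "data": bytes(range(256)) * 16},
]
findings = packed_indicators(sample_sections, ["LoadLibraryA", "GetProcAddress"])
for finding in findings:
    print(finding)
```

None of these checks is conclusive on its own; legitimate installers and protected commercial software trip them too, so treat the output as a reason to look closer.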
Automated Unpacking
Unpacking packed software can often be done with dedicated unpackers. Most are open source and community maintained. De4dot, for example, is a tool used for this purpose, deobfuscating/unpacking .NET samples. Unpac.me is an online tool that can be used to automate the unpacking process. Simply give the unpacker the packed sample and voilà, the unpacking process is done. As simple as it is boring, so no juicy example here. But tools and open source unpackers don’t always work; skip to the next section to understand why.
Manual Unpacking
Sometimes (more often than we think), unpacking malware is not as simple and automatic as we hope, so we have to try unpacking it manually. Malware creators have a wide range of packers to choose from. Some packers are legitimate commercial tools designed for regular software developers, while others are specifically developed to aid malicious software. Some malware is protected by custom packers, which is why it’s important to understand how a packer works in order to reverse what it did to the malware and extract the evil inside of it.
Common procedure:
- find the tail jump or juicy instructions and reach the “real” OEP;
- dump the evil within;
- repair IAT;
- restore OEP;
Let’s look at different ways to manually unpack packed malware, starting from something simple and moving to something more complex.
Specimen n1
🦠🔍 Last jump before a lot of zer0s
Let’s start by manually unpacking malware that is packed with UPX, which is free, common and relatively simple to unpack. UPX includes its own unpacking functionality, so if malware is compressed with a standard version of UPX, we can use the UPX utility to extract the original file. However, malware developers sometimes modify UPX or alter the header of the packed executable to hinder unpacking. Unlike UPX, most other packers lack integrated unpacking features, making it crucial to know how to handle various types of packed software.
Loading the sample into SpeakEasy, we can see that the UPX packer (a modified version) is identified. If we tried to use UPX to unpack it, we would fail. Manual unpacking is therefore necessary.
The UPX stub is known for transferring control to the unpacked code with a jump instruction placed right before a run of null values:
Viewing the contents of the memory segment before the execution of the jmp instruction (our tail jump), we notice that it is not populated.
Upon hitting the breakpoint we set, we notice that the memory segment our tail jump points to is now full of instructions: this is our binary, now correctly loaded (unpacked) into memory.
We do indeed observe a function prologue (highlighted), which saves the stack pointer.
Inspecting the function calls present in the memory region we are in, we have several pieces of information that lead us to think we are in the now unpacked portion of the binary code. Otherwise we would not have so many calls, especially not such clearly readable ones.
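The "jump followed by a lot of zeros" pattern can also be hunted for with a plain byte scan. The Python sketch below assumes the tail jump is encoded as a near jmp (opcode E9 followed by a 32-bit relative offset) and that at least 16 null bytes follow it; both are assumptions for illustration, and because this is a byte scan rather than a disassembler it can report E9 bytes that are really data or part of another instruction. The function name and the sample bytes are hypothetical.

```python
import struct


def find_tail_jump_candidates(code: bytes, base: int, min_zeros: int = 16):
    """Return (jmp_address, target) for every E9 rel32 followed by >= min_zeros nulls."""
    results = []
    i = 0
    while True:
        i = code.find(b"\xE9", i)
        if i == -1 or i + 5 > len(code):
            break
        padding = code[i + 5 : i + 5 + min_zeros]
        if len(padding) == min_zeros and not any(padding):
            rel = struct.unpack_from("<i", code, i + 1)[0]
            # A near jmp is relative to the address of the next instruction.
            target = (base + i + 5 + rel) & 0xFFFFFFFF
            results.append((base + i, target))
        i += 1
    return results


# Invented stub ending: popad, then jmp back 0x3000 bytes, then zero padding.
stub_end = b"\x61" + b"\xE9" + struct.pack("<i", -0x3000) + b"\x00" * 32
candidates = find_tail_jump_candidates(stub_end, base=0x407000)
for address, target in candidates:
    print(f"jmp at {address:#x} -> {target:#x}")
```

Each candidate is just a place to put a breakpoint; whether the target really is the OEP still has to be confirmed in the debugger as shown above.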
Now that we have the unpacked sample loaded into memory, we need to dump it and restore the IAT and OEP. How? With the OllyDumpEx plugin we can dump the binary and restore the OEP by matching it to the instruction pointer.
Then, with another tool named Scylla, we can restore the IAT and fix the dumped process.
Once the 3 steps are done, unpacking the binary (1) and restoring the OEP (2) and the IAT (3), we can start to analyze the sample as we want. Comparing the packed and unpacked versions of the sample, we can notice visible differences, with much clearer information in the unpacked version.
The morphology of the sections is (in this case) an indicator of a packed sample: in addition to the fact that we have a section called UPX2, we can see the large gap between the raw size and the virtual size of the sections, which is also an indicator of a packed sample. In the version of the sample after the unpacking process we can see how this discrepancy has diminished.
As indicated at the beginning, unpacking this sample was not complicated. Let’s continue by experimenting with a sample packed with another type of packer.
Specimen n2
🦠🔍 Let’s jump out of section boundaries
This could be classified as “easy,” but since you need a particular tool to make it “super easy” (which I do not have), the process explained below will instead follow a “less easy” path.
As we can see, there is a huge discrepancy between the raw size and the virtual size of the sections. The second section also has a high entropy value, another indicator to watch out for.
What are section boundaries?
Section boundaries refer to the in-memory start and end addresses of a specific section within a Portable Executable (PE) file, such as .text, .data or .rsrc. Each section of a PE file contains code, data or resources and has a defined location and size when the executable is loaded into memory.
This unpacking process can be completed automatically and quickly by using an OllyDump plugin, or by using Trace over/into with a break condition that fires when execution reaches an instruction outside the boundaries of the section that EIP is currently in.
Let’s break down the concept:
Executables are divided into sections (e.g., .text, .data), each with defined memory boundaries. These boundaries are calculated as:
- Start: Section Virtual Address (VA) + Image Base
- End: Start + Section Virtual Size
If you google how to find the OEP by section hop, you may come across this forum thread (https://forum.exetools.com/showthread.php?t=18603) that talks about this particular thing. Summarizing what is discussed there, we can assume that out0 and out1 are dynamically determined based on the section where the instruction pointer currently resides. Here’s how to interpret them:
- out0: the start address of the current PE section in memory.
- out1: the end address of the current PE section in memory.
These values are derived from the Portable Executable (PE) structure of the analyzed binary. For example, if the current section is .text, out0 and out1 define its memory boundaries.
Based on the image above, let’s run through the calculation:
- Section Start (out0): out0 = Image Base + Section Virtual Address (VA)
  - The Image Base is the address where the executable is loaded in memory (e.g., 0x0000).
  - The Section VA is the relative address of the section within the PE file (e.g., .SECTION Y might have a VA of 0x0201).
- Section End (out1): out1 = out0 + Section Virtual Size
  - The Virtual Size specifies how much memory the section occupies after being loaded (e.g., 0x0099 bytes).
So, if the .SECTION A section has:
- Image Base: 0x0000
- Section VA: 0x0201
- Virtual Size: 0x0099
Then:
out0 = 0x0000 + 0x0201 = 0x0201
out1 = 0x0201 + 0x0099 = 0x029A
The example and explanation above are for demonstration purposes only and are not related to our sample, but this is how section boundaries are worked out.
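The same arithmetic, written as a small Python helper. This is only a sketch of the two formulas above; the function name `section_bounds` is made up, and the input values are the demonstration numbers from the example.

```python
def section_bounds(image_base: int, section_va: int, virtual_size: int):
    """Return (out0, out1): in-memory start and end (exclusive) of a section."""
    out0 = image_base + section_va
    out1 = out0 + virtual_size
    return out0, out1


# Demonstration values from the example above, not from a real sample.
out0, out1 = section_bounds(image_base=0x0000, section_va=0x0201, virtual_size=0x0099)
print(f"out0 = {out0:#06x}")
print(f"out1 = {out1:#06x}")
```

Treating out1 as exclusive means an address belongs to the section exactly when out0 <= address < out1, which lines up with the break condition used later in the trace.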
Looking at our sample in the debugger, we notice that EIP points to the address 0x405000.
Switching to the memory map, we can see the boundaries of the memory section where our EIP resides (which is currently also the entry point of the packer).
Based on our sample, we can say:
- Image Base: 0x405000
- Section VA: 0x000000
- Virtual Size: 0x001000
Then:
out0 = 0x405000 + 0x000000 = 0x405000
out1 = 0x405000 + 0x001000 = 0x406000
In unpacking scenarios, malware decrypts its payload into a different memory section. By tracing until EIP leaves the current section (e.g., a packed .text section), we can identify the Original Entry Point (OEP) of the unpacked code.
Based on this, we can assume that when EIP satisfies the condition eip < out0 || eip >= out1, we may have hit our tail jump.
Without the OllyDump function that automatically finds section hops, we can set a trace over with this particular condition and wait until EIP reaches another section outside our boundaries.
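The break condition can be illustrated offline with a few lines of Python. This sketch does not drive a debugger: it simply applies eip < out0 || eip >= out1 to a recorded list of EIP values. The trace addresses, including the 0x401000 destination, are invented for the example; only the 0x405000 to 0x406000 boundaries come from the sample.

```python
def first_section_hop(trace, out0, out1):
    """Return (step, eip) for the first EIP outside [out0, out1), or None."""
    for step, eip in enumerate(trace):
        if eip < out0 or eip >= out1:
            return step, eip
    return None


# Hypothetical EIP values as they might be logged during a trace.
trace = [0x405000, 0x405002, 0x405007, 0x405A10, 0x401000, 0x401003]
hop = first_section_hop(trace, out0=0x405000, out1=0x406000)
if hop is not None:
    step, eip = hop
    print(f"left the section at step {step}, EIP = {eip:#x}")
```

In a real session the debugger evaluates the condition after every traced instruction, which is why this approach is slower than a breakpoint but needs no prior knowledge of where the tail jump lives.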
In this way we reach the (unpacked) binary loaded in memory.
Let’s take a look at the Memory Map and see where we are now. When the binary was first loaded into the disassembler, EIP was within the section boundaries 0x405000-0x406000.
The tail jump pointed to this memory region, which was not populated before the code was executed (see below). We can then add another unpacking technique to our deck: identify the “jump to another memory section.”
Comparing the two samples we can notice several differences, including the imported functions. In the packed sample (on the right, circled in red) we can notice the presence of functions such as LoadLibraryA and GetProcAddress, known to be used by the unpacking stub to unpack the actual code.
Once the unpacked code has been dumped and the IAT and OEP restored, we can see other functions within the imports, helping us to understand the capabilities of the sample under analysis.
With the correct tool, jumps to a memory section other than the one where the entry point (of the packer) is located can be identified quickly and simply. This is not always the case, so it is necessary to intercept these “tail jumps” with trace conditions and the right breakpoint. Of course, the unpacking stub will not necessarily hand over to the unpacked code on the first jump to another section of memory outside the entry point’s boundaries, but it is quite likely.
Specimen n3
🦠🔍 Follow Juicy-Assembly-Instruction down the white rabbit unpacking hole
In this case we do not have zeros after a jump, nor do we have jumps to sections of memory outside the limits of the section in which the entry point is located.
The sample looks like this, so it is not easy to identify the real OEP. What we can use to our advantage, however, are the pushfd and pushad instructions.
What do the pushfd and pushad instructions do?
- pushad pushes all 8 general-purpose 32-bit registers onto the stack in a specific order: EAX, ECX, EDX, EBX, ESP, EBP, ESI, and EDI. This effectively saves the entire CPU register state at that moment;
- pushfd pushes the EFLAGS register (which contains the CPU flags) onto the stack, preserving the processor’s flag state.
So, what will our next move be? Set a breakpoint somewhere? But where exactly? Let’s try to crack the unpacking process and figure out where to set our next breakpoint.
Most packers use pushad near the entry point of the unpacking stub to save the CPU state before starting the unpacking/decryption routine. This is because the unpacking process modifies registers heavily, but the original register values are needed later to restore the program’s state before jumping to the original, unpacked code (our real OEP).
So, the unpacking process that this kind of packed malware follows is:
- pushad saves all registers on the stack and pushfd saves all the flags.
- The unpacking routine runs, modifying registers and memory.
- The registers are restored later with the corresponding popad and popfd, returning the CPU to the state it had before unpacking.
- The program then jumps to the OEP.
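The save, clobber and restore sequence above can be mimicked with a tiny simulated CPU in Python. This is a toy model under stated assumptions: the `ToyCPU` class, the register values and the starting ESP are invented, and only the push order and the "saved ESP is discarded on popad" behaviour follow the x86 instructions described above.

```python
PUSHAD_ORDER = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]


class ToyCPU:
    """Minimal 32-bit register/stack model, just enough for pushad/popad."""

    def __init__(self, regs, eflags):
        self.regs = dict(regs)
        self.eflags = eflags
        self.stack = {}  # address -> 32-bit value

    def push(self, value):
        self.regs["ESP"] -= 4
        self.stack[self.regs["ESP"]] = value & 0xFFFFFFFF

    def pop(self):
        value = self.stack.pop(self.regs["ESP"])
        self.regs["ESP"] += 4
        return value

    def pushad(self):
        original_esp = self.regs["ESP"]  # the value of ESP before pushad is stored
        for name in PUSHAD_ORDER:
            self.push(original_esp if name == "ESP" else self.regs[name])

    def popad(self):
        for name in reversed(PUSHAD_ORDER):
            value = self.pop()
            if name != "ESP":  # the saved ESP slot is skipped, not reloaded
                self.regs[name] = value

    def pushfd(self):
        self.push(self.eflags)

    def popfd(self):
        self.eflags = self.pop()


# Invented starting state.
cpu = ToyCPU({"EAX": 1, "ECX": 2, "EDX": 3, "EBX": 4, "ESP": 0x0012FFC4,
              "EBP": 0x0012FFF0, "ESI": 5, "EDI": 6}, eflags=0x246)
before = (dict(cpu.regs), cpu.eflags)

cpu.pushad()
cpu.pushfd()
saved_state_address = cpu.regs["ESP"]  # the stack slot worth watching

# The "unpacking routine" trashes registers and flags.
for name in ("EAX", "ECX", "EDX", "EBX", "ESI", "EDI"):
    cpu.regs[name] = 0xDEADBEEF
cpu.eflags = 0

cpu.popfd()
cpu.popad()
after = (dict(cpu.regs), cpu.eflags)
print("state restored:", before == after)
print(f"saved state lived at {saved_state_address:#x}")
```

The address held in ESP right after the two pushes is the interesting one: the stub has to read it back to restore the state, and that read happens just before the jump to the OEP.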
It’s similar to what functions do with their prologue and epilogue, where the stack state is saved on entry and then restored before returning to the caller.
It’s time to open our friendly debugger. What we can do is identify the memory region written by the two instructions we just explained, and then set a breakpoint on it so that we can observe when it is hit, potentially near our OEP. The register we need to consider in this case is ESP (I’m not going to explain why, google it!). By following it in memory we can identify the memory region the stack points to, so, if we understood what was described earlier, this is where we are interested in observing potential state/value changes. More than changes, however, we care about accesses to this memory region: we know that these two instructions save the registers and flags on the stack, to be restored shortly after the unpacking process is completed.
Now we are right where we want to be. We have to set a breakpoint at the address in the memory region where we are now, but not just a simple breakpoint: a hardware breakpoint on memory access. What does this mean? Unlike the normal software breakpoint used when debugging malware, where an “INT 3” instruction is simply written into the code, the hardware breakpoint turns out to be much safer and more accurate, because dedicated registers are used for HWBPXs and the code is not modified “on the fly”. A software breakpoint, by contrast, loses its effectiveness the moment the memory region involved undergoes considerable changes.
The dedicated register takes the desired value. (Sound familiar? If not, go read how these registers are exploited by malware to perform defense evasion and, more importantly, how they can be exploited to bypass AMSI.)
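For the curious, this is roughly what the debugger sets up behind the scenes. On x86 the watched address goes into one of DR0 to DR3, and DR7 holds the enable bit plus the access type and length for each slot. The Python sketch below only computes the DR7 value; it does not set anything on a live process, and the helper name `dr7_for` is made up for illustration.

```python
# R/W field values (2 bits per slot): what kind of access triggers the breakpoint.
RW_EXECUTE, RW_WRITE, RW_READWRITE = 0b00, 0b01, 0b11
# LEN field values (2 bits per slot): how many bytes are watched.
LEN_1, LEN_2, LEN_4 = 0b00, 0b01, 0b11


def dr7_for(slot: int, rw: int = RW_READWRITE, length: int = LEN_4, dr7: int = 0) -> int:
    """Return a DR7 value with debug register `slot` (0-3) locally enabled."""
    if slot not in range(4):
        raise ValueError("x86 has four address debug registers: DR0..DR3")
    dr7 |= 1 << (slot * 2)                 # L0..L3: local enable bit for the slot
    shift = 16 + slot * 4                  # R/W and LEN fields start at bit 16
    dr7 &= ~(0b1111 << shift)              # clear any previous setting for the slot
    dr7 |= (rw | (length << 2)) << shift   # R/W in the low 2 bits, LEN in the high 2
    return dr7


# Watch 4 bytes at the address stored in DR0 for reads and writes.
print(f"DR7 = {dr7_for(0):#x}")
```

Because only four address registers exist, at most four hardware breakpoints can be active at once, which is another reason to place them deliberately.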
Letting the malware run, we hit the HWBPX as soon as the watched memory region is “touched”:
Returning to the function caller, we have our malware unpacked and ready to be dumped, with the IAT and OEP to be restored.
We note the obvious difference between the sections (on the right, after manual unpacking).
The imports are much more obvious and clear.
Specimen n4
🦠🔍 Eat, Sleep, Breakpoint Hit, Repeat
For this unpacking method we will simply let the malware do what it wants to do and observe its behavior to identify the exact moment when it unpacks the malicious code in memory.
Identifying the exact moment is not simple at all. What we can do is rely on the unpacking process itself, closely observing API calls such as LoadLibrary and all its variations.
We then put a breakpoint (a software one is fine too) on all calls to LoadLibrary and its variations, and run the malware.
On the first hit we may have a situation similar to the one below:
We then set a HW BPX (yes, again on memory access) at the address held in ESP. As with the previous sample, this is needed because we want to see exactly when the selected memory region is touched.
We then note that DR0 takes the value of the affected address.
Hint: it corresponds to the popfd instruction, which restores the saved state as seen for the previous sample.
Let’s run the malware and let it execute without changing its behavior. We notice that our breakpoint is hit several times: each time, the addresses of the loaded library functions are obtained via LoadLibrary and GetProcAddress. The malware is rebuilding the IAT.
HW BPX hit #2
HW BPX hit #30
HW BPX hit #6843
HW BPX hit #8712348
Our sample loads 3 more DLLs into memory, which is why the breakpoint on the call to LoadLibrary (which you can delete) is still being hit.
After a series of hits (105, to be precise), we see the light: our unpacked code.
Indeed, looking at the parameters that make it easy to tell packed malware from unpacked malware, the differences between the two versions (packed and unpacked) are very obvious:
One major challenge is pinpointing the exact moment when the original malicious code is fully unpacked in memory. Unpacking malware with dynamic monitoring techniques, such as setting memory access breakpoints on ESP after a LoadLibrary call, makes it possible to get past packing and obfuscation layers that are specifically designed to evade simpler automated tools. It can be a painful and labor-intensive process, but it remains necessary for thorough analysis.
This dynamic approach is slow and can be prone to errors, requiring multiple breakpoints, careful tracking of memory allocations, and significant manual effort to follow the unpacking routine through its stages. However, despite its complexity and the time it consumes, this hands-on process is often needed.
THE END
till now
Malware unpacking is an essential preliminary step in malware analysis: it enables security researchers to reveal the underlying malicious payload and plays a crucial role in reversing the obfuscation techniques employed by malware authors to evade detection. Malware often arrives packed or encrypted, hiding its true code and behavior from static analysis tools, which is why it is so important to know how to unpack it and expose its real nature.
Combining manual expertise with automated techniques enhances precision and efficiency, helping to combat the increasingly sophisticated packing and evasion methods used by malware authors.
RAM boost
Add some RAM banks to your workstation by listening to this Spotify playlist while reading / unpacking!
If you like this post, please follow me on Xitter and remember, 5haring is caring!