Methods for Memory Scanning

December 2nd, 2024: Tamir Vered

Introduction

Memory Scanning is a key method in both static and dynamic analysis of a system's state. It is widely used in a variety of forensic research cases, including malware analysis.

In this article, we describe methods for performing memory scanning on a given system, in both static (memory dump) and dynamic (debugging/monitoring) contexts, through Physical Memory Scanning and Virtual Memory Scanning. We then compare the two methods in terms of Performance, False Alarms and Misdetections, and the Limitations of each.

Motivation

Consider the following situation, illustrated in a Windows environment. A symbol called nt!PsActiveProcessHead points to the head of a doubly-linked list of the processes in the system. Each process is represented by an EPROCESS struct, which holds a LIST_ENTRY struct member named ActiveProcessLinks that makes it an element of the list:


A malicious process that manages to exploit a vulnerability and gain write access to kernel memory can detach itself from the nt!PsActiveProcessHead list, rendering itself invisible to the Task Manager and to any other utility that monitors that list:


It is important to note that removing the malicious process from this list does not, on its own, make the operating system's scheduler ignore that process, so it can keep running in the background unnoticed. Since the malicious process still resides in RAM, we can find it by means of memory scanning.
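The unlinking step can be illustrated with a toy model. The sketch below is not kernel code: ListEntry, insert_tail, unlink and walk are hypothetical names that only mimic the Flink/Blink pointers of a LIST_ENTRY, and the process names are made up. It shows that a monitoring tool that walks the list no longer sees the detached entry, even though the object itself still exists in memory.

```python
class ListEntry:
    """Toy stand-in for LIST_ENTRY: a node of a circular doubly-linked list."""

    def __init__(self, name=None):
        self.name = name
        self.flink = self  # forward link, initially pointing at itself
        self.blink = self  # backward link, initially pointing at itself


def insert_tail(head, entry):
    """Link `entry` in just before `head`, i.e. at the end of the list."""
    last = head.blink
    entry.flink, entry.blink = head, last
    last.flink = entry
    head.blink = entry


def unlink(entry):
    """Detach `entry` by making its neighbours point at each other."""
    entry.blink.flink = entry.flink
    entry.flink.blink = entry.blink
    entry.flink = entry.blink = entry


def walk(head):
    """Return the names a list-walking monitoring tool would see."""
    names = []
    node = head.flink
    while node is not head:
        names.append(node.name)
        node = node.flink
    return names


head = ListEntry()  # plays the role of nt!PsActiveProcessHead
processes = {name: ListEntry(name) for name in ("System", "explorer.exe", "evil.exe")}
for entry in processes.values():
    insert_tail(head, entry)

unlink(processes["evil.exe"])
visible = walk(head)
print(visible)
```

The detached entry is still held in the processes dictionary, which stands in for the fact that the EPROCESS object stays in RAM after being unlinked.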

Virtual Memory

Before we dive into comparing methods for physical and virtual memory scanning, it is important to understand the concept of virtual memory and how it is implemented in modern operating systems. Virtual memory is a memory management technique that abstracts away the physical resources by managing a separate address "map" for each virtual address space. Common virtual address spaces are the Kernel Virtual Address Space and a separate address space for each process.

As can be seen in the figure below:

  1. Different virtual addresses (from the same, or from different virtual address spaces) can point to the same physical resource.
  2. Consecutive virtual addresses can point to non-consecutive physical resources.

This abstraction enables Process Encapsulation, Physical Resource Independence, and Paging/Swapping (that is, temporarily offloading the contents of volatile memory to a persistent storage device).


Translation of a virtual address to a physical one is a feature built into the CPU. It is implemented by "crawling" a tree of structures called the Page Tables (note: the address resolution process described hereafter applies to the standard 64-bit 4-level paging scheme found in x86-64 CPUs).

The Page Tables


Let us follow the normal case. The CR3 control register contains a Page Frame Number (PFN), that is, a physical address from which the lower 12 bits are trimmed. The PFN stored in CR3 points to the Page Table of the first hierarchy (out of the 4-level paging scheme), called the PML4. Each process, as well as the kernel, has its own PML4. Next, bits 47-39 of the 64-bit virtual address select which of the PML4 table's 512 entries contains the physical pointer to the table of the next hierarchy (called the Page Directory Pointer table, or PDP table). The pointer is stored, once again, as a PFN.


The next 9 bits (bits 38-30) of the original virtual address select the PDP entry that points to the next hierarchy. The process continues in much the same way for two more hierarchies, the Page Directory (PD, selected by bits 29-21) and the Page Table (PT, selected by bits 20-12). This leaves us with bits 11-0 of the virtual address, which specify the offset of the referenced byte inside the final 4KiB Page that was resolved from the PT entry.


The entire process is depicted in the figure below for the virtual address 0xFFFFB91DC907F6DF:
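The index split for this address can also be reproduced with a few shifts and masks. The following is a minimal sketch, assuming 4-level paging with 4KiB pages as described above; split_virtual_address is a hypothetical helper name, not part of any library.

```python
def split_virtual_address(va):
    """Split a 64-bit virtual address into its 4-level paging components."""
    return {
        "pml4": (va >> 39) & 0x1FF,   # bits 47-39: index into the PML4
        "pdp": (va >> 30) & 0x1FF,    # bits 38-30: index into the PDP table
        "pd": (va >> 21) & 0x1FF,     # bits 29-21: index into the Page Directory
        "pt": (va >> 12) & 0x1FF,     # bits 20-12: index into the Page Table
        "offset": va & 0xFFF,         # bits 11-0: offset inside the 4KiB page
    }


parts = split_virtual_address(0xFFFFB91DC907F6DF)
for name, value in parts.items():
    print(f"{name:>6}: {value:#x}")
```

Each index is 9 bits wide because every table holds 512 entries, and the upper 16 bits of the address take no part in the lookup.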


There are two common flows that can deviate from the normal case:

  1. Page Faults: Table entries also contain flags. In any hierarchy, the entry being crawled can be marked as swapped-out (in which case the operating system should handle the resulting Page Fault and retrieve the swapped page from the storage device), or as not present at all.
  2. Large Pages: An entry can also be flagged as a Large Page. In this case, the entry does not point to the page table of the next hierarchy, but rather to a large physical range, and the offset within that range is specified by all the remaining bits of the virtual address. For example, if a PD entry is flagged as a Large Page, then it points to a 2MiB physical page, and the offset in that page is specified by bits 20-0 of the virtual address.
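Both the normal case and the two deviations can be sketched as a software page walk. The model below is an assumption made for illustration: page tables are plain Python lists stored in a dictionary keyed by PFN, and translate and PageFault are hypothetical names. Only the bit positions (Present in bit 0, the large-page flag in bit 7, address bits 51-12) follow the x86-64 entry format; swapped-out entries are not modeled separately and simply appear as not present.

```python
PRESENT = 1 << 0                  # bit 0: the entry is valid
LARGE_PAGE = 1 << 7               # bit 7: the entry maps a large page directly
ADDR_MASK = 0x000FFFFFFFFFF000    # bits 51-12: physical address bits of the entry


class PageFault(Exception):
    """Raised where the CPU would throw a Page Fault for the OS to handle."""


def translate(tables, cr3_pfn, va):
    """Resolve `va` using `tables`, a dict of PFN -> list of 512 entries."""
    table_pfn = cr3_pfn
    for shift in (39, 30, 21, 12):          # PML4, PDP, PD, PT
        entry = tables[table_pfn][(va >> shift) & 0x1FF]
        if not entry & PRESENT:
            raise PageFault(f"entry not present at shift {shift}")
        large = shift in (30, 21) and bool(entry & LARGE_PAGE)
        if shift == 12 or large:
            offset_mask = (1 << shift) - 1  # all remaining low bits are the offset
            return (entry & ADDR_MASK & ~offset_mask) | (va & offset_mask)
        table_pfn = (entry & ADDR_MASK) >> 12


va = 0xFFFFB91DC907F6DF
tables = {pfn: [0] * 512 for pfn in (1, 2, 3, 4)}
tables[1][0x172] = (2 << 12) | PRESENT        # PML4 entry -> PDP table in frame 2
tables[2][0x077] = (3 << 12) | PRESENT        # PDP entry  -> PD in frame 3
tables[3][0x048] = (4 << 12) | PRESENT        # PD entry   -> PT in frame 4
tables[4][0x07F] = (0x1234 << 12) | PRESENT   # PT entry   -> 4KiB page, frame 0x1234

small = translate(tables, 1, va)              # normal 4KiB resolution

tables[3][0x048] = 0x40000000 | PRESENT | LARGE_PAGE
large = translate(tables, 1, va)              # PD entry now maps a 2MiB page

tables[3][0x048] = 0                          # entry no longer present
try:
    translate(tables, 1, va)
    faulted = False
except PageFault:
    faulted = True

print(hex(small), hex(large), faulted)
```

In the large-page case the walk stops at the PD, and bits 20-0 of the virtual address are used as the offset, exactly as in the example above.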

Physical Memory Scanning


In the case of physical memory scanning, we iterate over the entire physical memory (either statically, on a given memory dump, or dynamically, from driver code or through a debugger) and look for patterns that match the objects we are looking for.

For example, in the hidden process scenario from before, we would like to scan for EPROCESS objects. This structure is allocated with a header containing a Pool Tag - a marker that can be used for debugging. The pool tag with which EPROCESS objects are allocated is "Proc" in newer versions of Windows, and "Pro\xE3" in older versions. The EPROCESS structure also contains an ImageFileName field, containing the executable name. Together, the pool tag and a plausible ImageFileName form a pattern that we can look for in physical memory in order to find EPROCESS objects.
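The first stage of such a scan, locating pool tag candidates, can be sketched as a simple byte search. This is a minimal sketch under stated assumptions: find_pool_tags is a hypothetical helper, the "dump" is a synthetic buffer rather than a real memory image, and the layout of the pool header and of EPROCESS (which differs between Windows builds) is not modeled. Every hit would still need to be validated, for example by checking the ImageFileName field at the offset that matches the examined build.

```python
def find_pool_tags(memory, tags=(b"Proc", b"Pro\xe3")):
    """Return sorted (offset, tag) pairs for every pool tag occurrence."""
    hits = []
    for tag in tags:
        start = 0
        while True:
            pos = memory.find(tag, start)
            if pos == -1:
                break
            hits.append((pos, tag))
            start = pos + 1  # keep searching right after the previous hit
    return sorted(hits)


# Synthetic stand-in for a physical memory dump: three zeroed 4KiB pages
# with one tag of each flavour planted at made-up offsets.
dump = bytearray(3 * 0x1000)
dump[0x1004:0x1008] = b"Proc"
dump[0x2204:0x2208] = b"Pro\xe3"

hits = find_pool_tags(dump)
for offset, tag in hits:
    print(f"{offset:#x}: {tag!r}")
```

For a real dump file, the same function could presumably be applied to a memory-mapped view of the file instead of an in-memory buffer.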

In the figure below we can see a hex view of a full physical memory dump, in which a pattern that matches an EPROCESS can be found:


Doing such a scan statically is one thing, but accessing "live" physical memory from code is more challenging, mainly because pointers in code are virtual by nature (the translation is, as mentioned before, performed by the CPU). So, how can we access physical memory? One way is to perform a read operation on the physical memory device through an API provided by the OS. Another is to map a virtual address to a physical one and access the memory through the virtual address (again, via an OS-provided API). Both ways can be quite slow when iterating over large memory ranges. Moreover, the second method can have a significant memory footprint, since it alters the Page Tables intensively when mapping and unmapping memory ranges, and we would like to avoid heavily affecting the examined system's memory during a forensic analysis.

Manual Page Remapping


A method used by the popular physical memory acquisition tool WinPmem [1] consists of manipulating the PTE of a virtual pointer to a self-owned dummy page so that it points to a desired target physical address (this is sometimes referred to as a "rogue page" or "rogue PTE"). This way, accessing the virtual pointer to the self-owned page makes the CPU resolve the target physical address instead of the original one. This method works as follows:

  1. Allocate a non-paged 4KiB buffer that is page aligned.
  2. Find the PTE of the allocated buffer, as described in the Page Tables section above.
  3. Change the PFN that is pointed-to by that PTE to the target physical page.
  4. Accessing the buffer through its pointer now reaches the target physical page instead of the original buffer page.
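Step 3 boils down to replacing the PFN bits of a 64-bit PTE while leaving its flag bits untouched. The sketch below only shows that bit manipulation on plain integers. It is not WinPmem's code: retarget_pte and both example values are hypothetical, and a real driver would write the new value into the live PTE and, presumably, also invalidate the stale TLB entry for the buffer's address before reading through it.

```python
PFN_MASK = 0x000FFFFFFFFFF000  # bits 51-12 of an x86-64 PTE hold the PFN


def retarget_pte(pte, target_physical_address):
    """Return `pte` with its PFN swapped for the page containing the target."""
    target_pfn = target_physical_address >> 12
    return (pte & ~PFN_MASK) | ((target_pfn << 12) & PFN_MASK)


# Hypothetical PTE of the dummy buffer: PFN 0x12345, flag bits 0x863 and
# the top (no-execute) bit set.
original_pte = 0x8000000012345863
rogue_pte = retarget_pte(original_pte, 0x7FFE0ABC)

print(f"{original_pte:#018x} -> {rogue_pte:#018x}")
```

Because only a single PTE is modified, scanning the next physical page requires just one more PFN swap, without allocating or mapping additional memory.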

Consecutive Memory


In this section we reviewed the concept of physical memory scanning and how it can be used to find target objects that "float" around within the system memory. One issue with this approach is that objects are allocated by software using virtual memory. This means that an object we are looking for is not necessarily contiguous in physical memory (as depicted in the figure below).


We can resolve this issue by taking a different approach: Virtual Memory Scanning.

Virtual Memory Scanning


The virtual memory address space is potentially huge; for example, a 64-bit address space could contain up to 2^64 different addresses. This poses a challenge to our endeavor of performing the memory scan on virtual memory. In reality, not all of the address space is available, and more specifically, not all of it belongs to the kernel, so some assumptions can be made to narrow the range to be scanned. That said, since the kernel can use wide ranges of the address space, such assumptions still leave us with a portion of the virtual address space that is too large to reasonably scan in its entirety.


The solution to this problem is "crawling" the page tables, so that only virtual memory that is actually mapped to physical memory (or to paged-out memory, for that matter) is scanned. One question remains: once we find, during the recursive iteration over the page-table hierarchies, a page that is available for scanning, how can we prevent it from "moving around" (e.g. due to swapping or memory compaction) during the scan? There are two options:

  • Mapping a locked virtual page to the given one, via API provided by the OS.
  • Applying the page remapping trick once again.

Note: there is still a small chance that the scanned memory is swapped out and another chunk of memory is swapped in to the same physical address between the iteration time and the time at which the locking/remapping is performed. Still, these methods let us avoid page faults during the scan and avoid interfering with the scanned system.
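The recursive crawl itself can be sketched as a generator that only descends into present entries. As before, this is a model and not driver code: page tables are Python lists in a dictionary keyed by PFN, mapped_ranges and canonical are hypothetical names, and swapped-out pages as well as the locking or remapping step are left out. Only the entry bit positions follow the x86-64 format.

```python
PRESENT = 1 << 0                  # bit 0: the entry is valid
LARGE_PAGE = 1 << 7               # bit 7: the entry maps a large page directly
ADDR_MASK = 0x000FFFFFFFFFF000    # bits 51-12: physical address bits of the entry
LEVEL_SHIFTS = (39, 30, 21, 12)   # PML4, PDP, PD, PT


def canonical(va):
    """Sign-extend bit 47 so that kernel addresses get their 0xFFFF prefix."""
    return va | 0xFFFF000000000000 if va & (1 << 47) else va


def mapped_ranges(tables, table_pfn, level=0, va_prefix=0):
    """Yield (virtual address, physical address, size) for each mapped page."""
    shift = LEVEL_SHIFTS[level]
    for index, entry in enumerate(tables[table_pfn]):
        if not entry & PRESENT:
            continue                       # nothing mapped below this entry
        va = va_prefix | (index << shift)
        large = shift in (30, 21) and bool(entry & LARGE_PAGE)
        if shift == 12 or large:
            size = 1 << shift
            yield canonical(va), entry & ADDR_MASK & ~(size - 1), size
        else:
            next_pfn = (entry & ADDR_MASK) >> 12
            yield from mapped_ranges(tables, next_pfn, level + 1, va)


tables = {pfn: [0] * 512 for pfn in (1, 2, 3, 4)}
tables[1][0x172] = (2 << 12) | PRESENT                # PML4 -> PDP table
tables[2][0x077] = (3 << 12) | PRESENT                # PDP  -> PD
tables[3][0x048] = (4 << 12) | PRESENT                # PD   -> PT
tables[3][0x049] = 0x40000000 | PRESENT | LARGE_PAGE  # PD   -> 2MiB large page
tables[4][0x07F] = (0x1234 << 12) | PRESENT           # PT   -> 4KiB page
tables[4][0x080] = (0x0042 << 12) | PRESENT           # PT   -> 4KiB page

ranges = list(mapped_ranges(tables, 1))
for va, pa, size in ranges:
    print(f"{va:#018x} -> {pa:#x} ({size:#x} bytes)")
```

In this toy layout only three ranges, a little over 2MiB in total, come out of the crawl, which is the point of the technique: the scan cost follows what is mapped rather than the size of the address space.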

Method Comparison and Implications

Finding Cross-Page Structures

As we have seen, physical memory scanning does not guarantee that the scanned memory is contiguous in virtual terms, and therefore physical scanning may miss structures that span more than one page.
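This effect is easy to reproduce in a small simulation. In the sketch below, all names and values (page_map, the pattern, the frame numbers) are made up for illustration: two adjacent virtual pages are backed by non-adjacent physical frames, and a pattern is written across the virtual page boundary.

```python
PAGE = 0x1000
physical = bytearray(4 * PAGE)   # four physical frames, all zeroed
page_map = {0: 2, 1: 0}          # virtual page number -> physical frame number


def physical_offset(va):
    """Translate a virtual address of the toy address space to an offset."""
    return page_map[va // PAGE] * PAGE + va % PAGE


def write_virtual(va, data):
    for i, byte in enumerate(data):
        physical[physical_offset(va + i)] = byte


def read_virtual(va, size):
    return bytes(physical[physical_offset(va + i)] for i in range(size))


pattern = b"ProcHIDDEN.EXE"
write_virtual(PAGE - 4, pattern)  # 4 bytes land in virtual page 0, the rest in page 1

found_physically = pattern in physical                  # linear scan of the frames
found_virtually = pattern in read_virtual(0, 2 * PAGE)  # scan in virtual order
tag_found_physically = b"Proc" in physical

print(found_physically, found_virtually, tag_found_physically)
```

The physical scan still sees the four tag bytes at the end of frame 2, but the bytes that follow them belong to an unrelated frame, so any check that spans the boundary fails.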

Finding Already-Freed Memory

Since virtual memory scanning only scans memory that is mapped at the moment of the scan, once memory is freed (e.g. a process was closed and its associated EPROCESS structure was freed by the kernel) it cannot be found via virtual memory scanning, even if the structure in question still resides in physical memory. This is in contrast to physical memory scanning, by which such structures could potentially be found.

Scan Time

Physical memory scan time does not, in general, depend on current memory usage, since the scan has to iterate over a constant-sized physical memory space (this analysis ignores swap-file scanning, if supported). Virtual memory scan time, on the other hand, grows with the amount of kernel virtual memory mapped in the system at the time of the scan.

Data Analysis and Heuristics

In a physical memory scan, heuristics cannot rely on the virtual address of the scanned data without some physical-to-virtual mapping. Moreover, one physical address can be mapped to multiple virtual addresses, so even such a mapping will not necessarily provide a unique virtual address for the heuristics to rely on. For example, the TypeIndex field of the Windows OBJECT_HEADER structure can be used to determine the type of the object that the OBJECT_HEADER represents. In newer versions of Windows, however, this field is masked using the object's virtual address [2].
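To make the dependency concrete, here is a sketch of the unmasking step, assuming the scheme described in [2]: the stored TypeIndex is XORed with the second-lowest byte of the OBJECT_HEADER's virtual address and with a per-boot cookie (nt!ObHeaderCookie). The function name and all the values below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def decode_type_index(stored_type_index, header_va, ob_header_cookie):
    """Undo the TypeIndex masking, assuming the XOR scheme described in [2]."""
    address_byte = (header_va >> 8) & 0xFF  # second-lowest byte of the address
    return stored_type_index ^ address_byte ^ ob_header_cookie


cookie = 0x5A                       # hypothetical per-boot cookie value
header_va = 0xFFFFB91DC907F6B0      # hypothetical OBJECT_HEADER address
stored = 7 ^ 0xF6 ^ cookie          # how a real index of 7 would be stored

decoded = decode_type_index(stored, header_va, cookie)

# The same physical bytes seen through a different (wrong) virtual address
# decode to a different, meaningless index.
alias_va = 0xFFFF800012345AB0
decoded_via_alias = decode_type_index(stored, alias_va, cookie)

print(decoded, decoded_via_alias)
```

A physical scan that hits these bytes has no virtual address to plug into the formula, which is why type-based heuristics of this kind are far more natural in a virtual memory scan.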


The table hereunder summarizes the comparison between the two suggested memory scanning methods:

                                      | Physical Memory Scanning                      | Virtual Memory Scanning
--------------------------------------+-----------------------------------------------+------------------------------------------
Finding Cross-Page Structures         | Unsupported                                   | Supported
Finding Already-Freed Memory          | Supported                                     | Unsupported
Scan Time                             | Not affected by memory usage                  | Faster when memory usage is low
Heuristics Can Rely on Virtual Memory | Less likely                                   | Yes
Scanning Swapped-Out Memory           | Requires extra work (analyzing the pagefile)  | Requires extra work (by analyzing the pagefile or, on a live system, by triggering a page fault)


The table hereunder shows the number of objects per type, found in a test scan of each method:

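                     | Physical Memory Scanning | Virtual Memory Scanning
---------------------+--------------------------+------------------------
CMHIVE               |                      110 |                      95
DRIVER_OBJECT        |                      172 |                     132
EPROCESS             |                      176 |                     172
FILE_OBJECT          |                     5836 |                    5343
KMUTANT              |                      738 |                     711
LDR_DATA_TABLE_ENTRY |                      273 |                     174
OBJECT_SYMBOLIC_LINK |                       49 |                     285
RTL_ATOM_TABLE       |                       41 |                      24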

For some of the types, such as OBJECT_SYMBOLIC_LINK, results are missing in the physical memory scan due to virtual-address dependent heuristics. For other types, such as FILE_OBJECT, the benefits of physical memory scanning are more prominent (there can be, of course, false-positive results, but in the specific test scan shown, a manual review of the results verified that the false-positive rate for FILE_OBJECTs is similar in the two methods).

Conclusions

In this article we have reviewed two methods for scanning system memory for objects of interest, for the purpose of debugging or forensic analysis, and discussed the differences between them.

It is important to keep in mind the pros and cons of each method in order to choose the one that better fits the scenario in question, and to adapt the heuristics that determine a positive search result accordingly.

References

[1] https://github.com/Velocidex/WinPmem

[2] https://medium.com/@ashabdalhalim/a-light-on-windows-10s-object-header-typeindex-value-e8f907e7073a