.pdata Stack Frame Reconstruction Engine
Kagura (神楽): sacred dance of the Shinto gods.
The stack walk is a dance the EDR performs frame by frame.
Kagura teaches you to replicate that choreography perfectly.
Live reference tool for malware developers and red teamers. Loads Windows system DLLs, parses .pdata sections, decodes every UNWIND_INFO, and reconstructs the exact stack frame layout for any function. So you know precisely what your fake stack needs to look like.
I built Kagura because I needed to see what the stacks I was trying to reconstruct actually looked like. While developing indirect syscall and callstack spoofing techniques, I spent hours in WinDbg running .fnent, parsing dumpbin /unwindinfo output, and doing manual hex arithmetic to figure out frame sizes and register offsets. Every mistake meant a dead process with no clear error message from CrowdStrike.
The problem is simple: when you forge a fake stack, you need to know the exact layout that the EDR's stack walker expects. CrowdStrike Falcon, Microsoft Defender for Endpoint, SentinelOne: they all replay the .pdata unwind codes frame by frame to validate every return address, every saved register, every RSP alignment. Get a single slot wrong and your stack gets flagged.
Kagura makes that layout visible. It loads the target modules, decodes every RUNTIME_FUNCTION and its UNWIND_INFO, and shows you the precise slot-by-slot layout with forge instructions. For every function. Across 7 system DLLs. In a searchable interactive TUI.
No more guessing. No more manual parsing. No more trial and error.
KAGURA-STACKWALKER v1.0.0 [7 modules | 20718 funcs]
kernelbase.dll!CreateFileA (0x000045AB0)
Size: 0x80 (128B) | Prolog: 31B | Unwind codes: 12 | Chain depth: 0
============================================================================
RSP+0x0098 | Saved R12 (MOV @0x98)
RSP+0x0090 | Saved RDI (MOV @0x90)
RSP+0x0088 | Saved RSI (MOV @0x88)
RSP+0x0080 | Saved RBX (MOV @0x80)
RSP+0x0078 | Return address (@0x78) <- caller RIP
RSP+0x0070 | Saved RBP (PUSH)
RSP+0x0068 | Saved R14 (PUSH)
RSP+0x0060 | Saved R15 (PUSH)
RSP+0x0000 | Local/Shadow (0x60)
RSP+0x0000 | [current RSP]
+-- Registers saved ----------------------+
| R12 @ RSP+0x0098 (save) |
| RDI @ RSP+0x0090 (save) |
| RSI @ RSP+0x0088 (save) |
| RBX @ RSP+0x0080 (save) |
| R15 @ RSP+0x0060 (push) |
| R14 @ RSP+0x0068 (push) |
| RBP @ RSP+0x0070 (push) |
+-----------------------------------------+
Exception handler: No
Each slot shows:
- Offset from RSP: exact position in the frame
- Content type: return address, saved register, local/shadow space, XMM save
- Save method: `PUSH` vs `MOV` (critical for accurate fake stack reconstruction)
- Windows 10 1809+ or Windows 11 (x64 only)
- Visual Studio 2022 with the "Desktop development with C++" workload
- CMake 3.20+ (included with Visual Studio 2022)
- Git with submodule support
git clone --recursive https://github.com/En3nr4/Kagura-StackWalker.git
cd Kagura-StackWalker

If you already cloned without `--recursive`, initialize submodules manually:

git submodule update --init --recursive
Open "Developer PowerShell for VS 2022" from the Start menu. This ensures cmake and cl.exe are in your PATH.
Alternatively, add CMake to your current PowerShell session:
$env:PATH += ";C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin"

Release build (recommended):

cmake --preset release
cmake --build build/release --config Release

The binary is at `build\release\Release\kagura.exe`.
Debug build:
cmake --preset debug
cmake --build build/debug --config Debug

.\build\release\Release\kagura.exe

You can copy `kagura.exe` anywhere: it's a single standalone binary with no external dependencies (CRT is statically linked).

kagura.exe

Launches the interactive terminal UI. Navigate modules, search functions, inspect frame layouts.
# Dump all frames from ntdll to console
kagura.exe --dump --module ntdll.dll
# Dump a specific module
kagura.exe --dump --module kernel32.dll
# Export a specific function to JSON
kagura.exe --export json --module ntdll.dll --func NtCreateFile -o ntcreate.json
# Export entire module to JSON
kagura.exe --export json --module ntdll.dll -o ntdll_frames.json
# Export all modules to JSON
kagura.exe --export json -o all_frames.json
# Show help
kagura.exe --help

| Key | Action |
|---|---|
| `UP` / `DOWN` | Navigate list |
| `Enter` | Select module / Open function detail |
| `/` | Search (incremental, case-insensitive) |
| `Esc` | Back / Cancel search |
| `Backspace` | Back to previous view |
| `PgUp` / `PgDn` | Scroll fast (20 items) |
| `e` | Export current frame to JSON file |
| `q` | Quit |
- Type `/NtCreate` to find all NtCreate* functions
- Type `/0x` followed by an RVA to search by address
- Search is case-insensitive and matches anywhere in the function name
- Press `Esc` to clear the search filter
- Press `Enter` to keep the filter active and navigate results
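The name matching described above (case-insensitive, anywhere in the function name) can be sketched in portable C. This is an illustration only: the README does not show Kagura's own matcher, and `name_matches` is a hypothetical name.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true if `needle` occurs anywhere in
 * `haystack`, ignoring ASCII case. An empty needle matches every name. */
bool name_matches(const char *haystack, const char *needle)
{
    assert(haystack != NULL && needle != NULL);

    if (*needle == '\0')
        return true;

    for (const char *h = haystack; *h != '\0'; ++h) {
        size_t i = 0;
        /* Compare character by character until a mismatch or either end. */
        while (needle[i] != '\0' && h[i] != '\0' &&
               tolower((unsigned char)h[i]) == tolower((unsigned char)needle[i]))
            ++i;
        if (needle[i] == '\0')
            return true;    /* consumed the whole needle: match found */
    }
    return false;
}
```

With this sketch, `name_matches("NtCreateFile", "ntcreate")` is true and `name_matches("NtClose", "create")` is false.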
Kagura automatically loads and analyzes these system DLLs at startup:
| Module | Typical Use in Maldev |
|---|---|
| `ntdll.dll` | Syscall stubs, Nt/Zw functions, Rtl* utilities |
| `kernel32.dll` | Win32 API layer, BaseThreadInitThunk (terminal frame) |
| `kernelbase.dll` | Actual implementations behind kernel32 |
| `user32.dll` | Window/message functions |
| `win32u.dll` | Win32k syscall stubs |
| `advapi32.dll` | Security, registry, service functions |
| `msvcrt.dll` | C runtime (common in legitimate stacks) |
The frame layout shows the exact memory layout of a function's stack frame as seen by RtlVirtualUnwind. This is what the EDR validates during a stack walk.
RSP+0x0078 | Return address (@0x78) <- Where the caller's RIP is stored
RSP+0x0070 | Saved RBP (PUSH) <- push rbp instruction
RSP+0x0068 | Saved R14 (PUSH) <- push r14 instruction
RSP+0x0060 | Saved R15 (PUSH) <- push r15 instruction
RSP+0x0000 | Local/Shadow (0x60) <- sub rsp, 0x60
RSP+0x0000 | [current RSP] <- Stack pointer after prologue
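The offsets in this layout follow from simple arithmetic: three pushes of 8 bytes each plus the `0x60` allocation put the return address at `0x78`, and counting the return address slot itself gives the reported size of `0x80`. A minimal sketch, assuming every push happens before the allocation and that the reported size includes the return address slot (as the sample output suggests); both function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Offset of the return address from RSP once the prologue has run:
 * each push moves RSP down by 8 bytes, `sub rsp, N` moves it down by N. */
uint32_t return_addr_offset(uint32_t push_count, uint32_t alloc_bytes)
{
    assert(alloc_bytes % 8 == 0);   /* x64 stack allocations are multiples of 8 */
    return alloc_bytes + 8u * push_count;
}

/* Frame size including the 8-byte return address slot (assumption:
 * this is how the "Size" field in the sample output is counted). */
uint32_t frame_total_size(uint32_t push_count, uint32_t alloc_bytes)
{
    return return_addr_offset(push_count, alloc_bytes) + 8u;
}
```

For the `CreateFileA` sample, `return_addr_offset(3, 0x60)` yields `0x78` and `frame_total_size(3, 0x60)` yields `0x80`.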
The distinction between PUSH and MOV saves is critical for fake stack construction:
- PUSH: The register was saved via `push reg`. The offset is sequential (each push adds 8 bytes to the frame).
- MOV (save): The register was saved via `mov [rsp+offset], reg`. The offset is explicit and can be anywhere in the frame.
When forging a stack, PUSH-saved registers must appear in the correct sequential order, while MOV-saved registers just need to be at the right offset.
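How the two save methods translate into slot offsets can be sketched as follows. This is not Kagura's code: the type and function names are hypothetical, and the sketch assumes a conventional prologue in which all pushes run before a single `sub rsp, N`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { SAVE_PUSH, SAVE_MOV } save_method_t;   /* hypothetical names */

typedef struct {
    const char   *reg;
    save_method_t method;
    uint32_t      offset;   /* given for SAVE_MOV, computed for SAVE_PUSH */
} saved_reg_t;

/* Fills in RSP-relative offsets for PUSH-saved registers.
 * `regs` is in prologue (execution) order. The first push lands directly
 * below the return address and each later push sits 8 bytes lower.
 * MOV-saved entries keep the explicit offset they already carry.
 * Returns the offset of the return address. */
uint32_t assign_push_offsets(saved_reg_t *regs, size_t count, uint32_t alloc_bytes)
{
    assert(regs != NULL || count == 0);

    uint32_t pushes = 0;
    for (size_t i = 0; i < count; ++i)
        if (regs[i].method == SAVE_PUSH)
            ++pushes;

    uint32_t ret_off = alloc_bytes + 8u * pushes;
    uint32_t next = ret_off;
    for (size_t i = 0; i < count; ++i) {
        if (regs[i].method == SAVE_PUSH) {
            next -= 8u;
            regs[i].offset = next;
        }
    }
    return ret_off;
}
```

Fed `push rbp; push r14; push r15` with a `0x60` allocation, the sketch places RBP at `0x70`, R14 at `0x68`, and R15 at `0x60`, matching the sample layout, while a MOV-saved RBX keeps its explicit `0x80`.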
Functions such as `NtCreateFile` and `NtAllocateVirtualMemory` in ntdll are syscall stubs with minimal or zero unwind codes. Their frame is typically just 8 bytes (return address only). This is expected: the stubs execute `mov r10, rcx; mov eax, SSN; syscall; ret` with no stack frame setup.
{
"module": "kernelbase.dll",
"base_address": "0x7FFE7F200000",
"function": {
"name": "CreateFileA",
"display_name": "CreateFileA (0x000045AB0)",
"rva": "0x000045AB0",
"size": 128,
"prolog_size": 31,
"frame": {
"total_size": 128,
"return_addr_offset": "0x78",
"frame_register": null,
"frame_reg_offset": 0,
"slots": [
{"offset": "0x0", "size": 8, "type": "LOCAL_ALLOC", "label": "Local/Shadow (0x60)"},
{"offset": "0x60", "size": 8, "type": "SAVED_REG", "label": "Saved R15 (PUSH)"},
{"offset": "0x68", "size": 8, "type": "SAVED_REG", "label": "Saved R14 (PUSH)"},
{"offset": "0x70", "size": 8, "type": "SAVED_REG", "label": "Saved RBP (PUSH)"},
{"offset": "0x78", "size": 8, "type": "RETURN_ADDR", "label": "Return address (@0x78)"},
{"offset": "0x80", "size": 8, "type": "SAVED_REG", "label": "Saved RBX (MOV @0x80)"},
{"offset": "0x88", "size": 8, "type": "SAVED_REG", "label": "Saved RSI (MOV @0x88)"},
{"offset": "0x90", "size": 8, "type": "SAVED_REG", "label": "Saved RDI (MOV @0x90)"},
{"offset": "0x98", "size": 8, "type": "SAVED_REG", "label": "Saved R12 (MOV @0x98)"}
],
"saved_registers": [
{"reg": "R12", "offset": "0x98", "method": "MOV"},
{"reg": "RDI", "offset": "0x90", "method": "MOV"},
{"reg": "RSI", "offset": "0x88", "method": "MOV"},
{"reg": "RBX", "offset": "0x80", "method": "MOV"},
{"reg": "R15", "offset": "0x60", "method": "PUSH"},
{"reg": "R14", "offset": "0x68", "method": "PUSH"},
{"reg": "RBP", "offset": "0x70", "method": "PUSH"}
],
"unwind": {
"version": 1,
"flags": 0,
"code_count": 12,
"has_exception_handler": false,
"has_termination_handler": false,
"chain_depth": 0
}
}
}
}

- Module loading: Each DLL is loaded via `LoadLibraryExW` with `DONT_RESOLVE_DLL_REFERENCES` (no DllMain execution, no import resolution)
- Export parsing: The PE export table is walked to resolve function names. Exports are sorted by RVA for nearest-name binary search.
- .pdata parsing: The exception directory (`DataDirectory[3]`) contains the `RUNTIME_FUNCTION` array. Each entry maps a function's RVA range to its `UNWIND_INFO`.
- Frame reconstruction: Each `UNWIND_INFO`'s `UNWIND_CODE` array is replayed to determine the exact stack frame layout. All 10 unwind opcodes are supported.
- CHAININFO: Chained unwind info is followed recursively (up to 32 levels deep).
- Display: Results are indexed in memory and presented via an ANSI/VT100 TUI or exported as JSON.
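The .pdata step above comes down to a lookup in a sorted table, since the PE format requires `RUNTIME_FUNCTION` entries to be ordered by start address. A minimal sketch of that lookup in portable C: the struct mirrors the documented x64 layout of three 32-bit RVAs, but the type and function names are hypothetical and not taken from Kagura's source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the documented x64 RUNTIME_FUNCTION entry (12 bytes). */
typedef struct {
    uint32_t begin_address;   /* RVA of the first byte of the function */
    uint32_t end_address;     /* RVA one past the last byte */
    uint32_t unwind_info;     /* RVA of the associated UNWIND_INFO */
} runtime_function_t;

/* Binary search for the entry whose [begin, end) range contains `rva`.
 * Returns NULL when no entry covers it; leaf functions without unwind
 * data have no entry at all. */
const runtime_function_t *find_runtime_function(const runtime_function_t *table,
                                                size_t count, uint32_t rva)
{
    assert(table != NULL || count == 0);

    size_t lo = 0, hi = count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (rva < table[mid].begin_address)
            hi = mid;
        else if (rva >= table[mid].end_address)
            lo = mid + 1;
        else
            return &table[mid];
    }
    return NULL;
}
```

On a real image the table pointer and entry count would come from the exception data directory (directory size divided by 12); the sketch takes them as plain arguments so it stays independent of `windows.h`.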
| OpCode | Operation | Frame Effect |
|---|---|---|
| 0 | `UWOP_PUSH_NONVOL` | `push reg`: RSP -= 8 |
| 1 | `UWOP_ALLOC_LARGE` | `sub rsp, N`: large allocation |
| 2 | `UWOP_ALLOC_SMALL` | `sub rsp, (info*8)+8`: small allocation |
| 3 | `UWOP_SET_FPREG` | Set frame pointer register |
| 4 | `UWOP_SAVE_NONVOL` | `mov [rsp+off], reg`: save at explicit offset |
| 5 | `UWOP_SAVE_NONVOL_FAR` | Same, 32-bit offset |
| 8 | `UWOP_SAVE_XMM128` | Save XMM register (16 bytes) |
| 9 | `UWOP_SAVE_XMM128_FAR` | Same, 32-bit offset |
| 10 | `UWOP_PUSH_MACHFRAME` | CPU trap frame (interrupts) |
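To make the table concrete, here is a hedged sketch of replaying an `UNWIND_CODE` array to total the stack bytes a prologue consumes. It follows the documented encoding (16-bit slots: code offset in the low byte, opcode in bits 8 to 11, op info in bits 12 to 15) but is not Kagura's implementation. `UW_CODE` and `prologue_stack_bytes` are hypothetical names, and `UWOP_PUSH_MACHFRAME` is deliberately rejected because a machine frame restores RSP from the trap frame instead of adding a fixed size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    UWOP_PUSH_NONVOL = 0, UWOP_ALLOC_LARGE = 1, UWOP_ALLOC_SMALL = 2,
    UWOP_SET_FPREG = 3, UWOP_SAVE_NONVOL = 4, UWOP_SAVE_NONVOL_FAR = 5,
    UWOP_SAVE_XMM128 = 8, UWOP_SAVE_XMM128_FAR = 9, UWOP_PUSH_MACHFRAME = 10
};

/* Packs one unwind slot: code offset | opcode << 8 | op info << 12. */
#define UW_CODE(off, op, info) \
    ((uint16_t)(((off) & 0xFFu) | (((op) & 0xFu) << 8) | (((info) & 0xFu) << 12)))

/* Returns the bytes RSP moves during the prologue (pushes plus
 * allocations), or -1 for truncated, unknown, or unsupported codes. */
int64_t prologue_stack_bytes(const uint16_t *codes, size_t count)
{
    assert(codes != NULL || count == 0);

    int64_t total = 0;
    size_t i = 0;
    while (i < count) {
        unsigned op   = (codes[i] >> 8) & 0x0Fu;
        unsigned info = (codes[i] >> 12) & 0x0Fu;
        size_t slots = 1;       /* slots this operation occupies */
        int64_t add = 0;        /* bytes this operation moves RSP */

        switch (op) {
        case UWOP_PUSH_NONVOL:     add = 8; break;
        case UWOP_ALLOC_SMALL:     add = (int64_t)info * 8 + 8; break;
        case UWOP_ALLOC_LARGE:
            if (info > 1) return -1;
            slots = (info == 0) ? 2 : 3;
            break;
        case UWOP_SET_FPREG:       break;   /* no RSP change */
        case UWOP_SAVE_NONVOL:
        case UWOP_SAVE_XMM128:     slots = 2; break;
        case UWOP_SAVE_NONVOL_FAR:
        case UWOP_SAVE_XMM128_FAR: slots = 3; break;
        default:                   return -1;   /* includes PUSH_MACHFRAME */
        }
        if (slots > count - i)
            return -1;          /* operand slots run past the array */

        if (op == UWOP_ALLOC_LARGE)
            add = (info == 0)
                ? (int64_t)codes[i + 1] * 8     /* 16-bit size, scaled by 8 */
                : (int64_t)(codes[i + 1] | ((uint32_t)codes[i + 2] << 16));

        total += add;
        i += slots;
    }
    return total;
}
```

A 12-slot array shaped like the `CreateFileA` sample (four `UWOP_SAVE_NONVOL` pairs, one `UWOP_ALLOC_SMALL` with info 11, three `UWOP_PUSH_NONVOL`) totals `0x78`, which is the return address offset shown in the frame layout.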
kagura-stackwalker/
├── include/
│ └── kagura.h # Shared types and constants
├── src/
│ ├── main.c # Entry point, CLI parsing
│ ├── kagura_data.c # Global constant arrays
│ ├── pe_parser.c/.h # PE loading, export parsing, .pdata extraction
│ ├── frame_recon.c/.h # UNWIND_INFO decoding, frame reconstruction
│ ├── module_db.c/.h # Module database, search indexing
│ ├── tui.c/.h # ANSI/VT100 rendering engine
│ ├── tui_views.c/.h # Module list, function list, frame detail views
│ ├── output.c/.h # JSON and console export
│ └── utils.c/.h # String utilities
├── external/
│ └── zydis/ # Zydis disassembler (git submodule)
├── CMakeLists.txt
└── CMakePresets.json
- Gadget finder: score functions for stack spoofing compatibility
- Chain builder: automatically assemble multi-frame fake stacks
- Stack forge generator: output ready-to-use C structs / ASM
- Walk simulator: validate forged stacks against EDR-style checks
- Prologue disassembly via Zydis in frame detail view
- Export to C headers and MASM/NASM
- Custom module loading via CLI argument
- Zydis: Fast x86/x64 disassembler library (MIT license, included as git submodule)
ENENRA