Kagura-StackWalker

.pdata Stack Frame Reconstruction Engine

Kagura (神楽): sacred dance of the Shinto gods.
The stack walk is a dance the EDR performs frame by frame.
Kagura teaches you to replicate that choreography perfectly.

Live reference tool for malware developers and red teamers. Loads Windows system DLLs, parses .pdata sections, decodes every UNWIND_INFO, and reconstructs the exact stack frame layout for any function. So you know precisely what your fake stack needs to look like.

Why Kagura?

I built Kagura because I needed to see what the stacks I was trying to reconstruct actually looked like. While developing indirect syscall and callstack spoofing techniques, I spent hours in WinDbg running .fnent, parsing dumpbin /unwindinfo output, and doing manual hex arithmetic to figure out frame sizes and register offsets. Every mistake meant a dead process with no clear error message from CrowdStrike.

The problem is simple: when you forge a fake stack, you need to know the exact layout that the EDR's stack walker expects. CrowdStrike Falcon, Microsoft Defender for Endpoint, SentinelOne: they all replay the .pdata unwind codes frame by frame to validate every return address, every saved register, every RSP alignment. Get a single slot wrong and your stack gets flagged.

Kagura makes that layout visible. It loads the target modules, decodes every RUNTIME_FUNCTION and its UNWIND_INFO, and shows you the precise slot-by-slot layout with forge instructions. For every function. Across 7 system DLLs. In a searchable interactive TUI.

No more guessing. No more manual parsing. No more trial and error.

Screenshots

  KAGURA-STACKWALKER v1.0.0                          [7 modules | 20718 funcs]

  kernelbase.dll!CreateFileA (0x000045AB0)
  Size: 0x80 (128B) | Prolog: 31B | Unwind codes: 12 | Chain depth: 0
  ============================================================================

  RSP+0x0098 | Saved R12 (MOV @0x98)
  RSP+0x0090 | Saved RDI (MOV @0x90)
  RSP+0x0088 | Saved RSI (MOV @0x88)
  RSP+0x0080 | Saved RBX (MOV @0x80)
  RSP+0x0078 | Return address (@0x78)              <- caller RIP
  RSP+0x0070 | Saved RBP (PUSH)
  RSP+0x0068 | Saved R14 (PUSH)
  RSP+0x0060 | Saved R15 (PUSH)
  RSP+0x0000 | Local/Shadow (0x60)
  RSP+0x0000 | [current RSP]

  +-- Registers saved ----------------------+
  | R12    @ RSP+0x0098  (save)             |
  | RDI    @ RSP+0x0090  (save)             |
  | RSI    @ RSP+0x0088  (save)             |
  | RBX    @ RSP+0x0080  (save)             |
  | R15    @ RSP+0x0060  (push)             |
  | R14    @ RSP+0x0068  (push)             |
  | RBP    @ RSP+0x0070  (push)             |
  +-----------------------------------------+

  Exception handler: No

Each slot shows:

Offset from RSP: exact position in the frame
Content type: return address, saved register, local/shadow space, XMM save
Save method: PUSH vs MOV (critical for accurate fake stack reconstruction)

Installation

Prerequisites

Windows 10 1809+ or Windows 11 (x64 only)
Visual Studio 2022 with the "Desktop development with C++" workload
CMake 3.20+ (included with Visual Studio 2022)
Git with submodule support

Step 1: Clone the repository

git clone --recursive https://github.com/En3nr4/Kagura-StackWalker.git
cd Kagura-StackWalker

If you already cloned without --recursive, initialize submodules manually:
git submodule update --init --recursive

Step 2: Open a Developer PowerShell

Open "Developer PowerShell for VS 2022" from the Start menu. This ensures cmake and cl.exe are in your PATH.

Alternatively, add CMake to your current PowerShell session (the path below assumes the Community edition; adjust it for Professional or Enterprise):

$env:PATH += ";C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin"

Step 3: Build

Release build (recommended):

cmake --preset release
cmake --build build/release --config Release

The binary is at build\release\Release\kagura.exe.

Debug build:

cmake --preset debug
cmake --build build/debug --config Debug

Step 4: Run

.\build\release\Release\kagura.exe

You can copy kagura.exe anywhere: it's a single standalone binary with no external dependencies (CRT is statically linked).

Usage

Interactive TUI (default)

kagura.exe

Launches the interactive terminal UI. Navigate modules, search functions, inspect frame layouts.

CLI Batch Mode

# Dump all frames from ntdll to console
kagura.exe --dump --module ntdll.dll

# Dump a specific module
kagura.exe --dump --module kernel32.dll

# Export a specific function to JSON
kagura.exe --export json --module ntdll.dll --func NtCreateFile -o ntcreate.json

# Export entire module to JSON
kagura.exe --export json --module ntdll.dll -o ntdll_frames.json

# Export all modules to JSON
kagura.exe --export json -o all_frames.json

# Show help
kagura.exe --help

TUI Navigation

Key	Action
`UP` / `DOWN`	Navigate list
`Enter`	Select module / Open function detail
`/`	Search (incremental, case-insensitive)
`Esc`	Back / Cancel search
`Backspace`	Back to previous view
`PgUp` / `PgDn`	Scroll fast (20 items)
`e`	Export current frame to JSON file
`q`	Quit

Search Tips

Type /NtCreate to find all NtCreate* functions
Type /0x followed by an RVA to search by address
Search is case-insensitive and matches anywhere in the function name
Press Esc to clear the search filter
Press Enter to keep the filter active and navigate results
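
The matching rules above can be approximated in a few lines. This is only a sketch of the described behavior, not Kagura's actual code: the function name `matches` is hypothetical, and treating a `0x` query as an exact RVA comparison is an assumption (the tool may match prefixes instead).

```python
def matches(query: str, name: str, rva: int) -> bool:
    """Return True if a search query selects a function.

    Assumed rules (taken from the tips above, details guessed):
    - a query starting with "0x" is compared against the function RVA
    - any other query is a case-insensitive substring match on the name
    """
    q = query.lower()
    if q.startswith("0x"):
        try:
            return int(q, 16) == rva
        except ValueError:
            # "0x" followed by non-hex text matches nothing
            return False
    return q in name.lower()


if __name__ == "__main__":
    print(matches("NtCreate", "NtCreateFile", 0x1000))
    print(matches("0x45AB0", "CreateFileA", 0x45AB0))
```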

Modules Loaded

Kagura automatically loads and analyzes these system DLLs at startup:

Module	Typical Use in Maldev
`ntdll.dll`	Syscall stubs, Nt/Zw functions, Rtl* utilities
`kernel32.dll`	Win32 API layer, BaseThreadInitThunk (terminal frame)
`kernelbase.dll`	Actual implementations behind kernel32
`user32.dll`	Window/message functions
`win32u.dll`	Win32k syscall stubs
`advapi32.dll`	Security, registry, service functions
`msvcrt.dll`	C runtime (common in legitimate stacks)

Understanding the Output

Frame Layout

The frame layout shows the exact memory layout of a function's stack frame as seen by RtlVirtualUnwind. This is what the EDR validates during a stack walk.

RSP+0x0078 | Return address (@0x78)        <- Where the caller's RIP is stored
RSP+0x0070 | Saved RBP (PUSH)              <- push rbp instruction
RSP+0x0068 | Saved R14 (PUSH)              <- push r14 instruction  
RSP+0x0060 | Saved R15 (PUSH)              <- push r15 instruction
RSP+0x0000 | Local/Shadow (0x60)           <- sub rsp, 0x60
RSP+0x0000 | [current RSP]                 <- Stack pointer after prologue

PUSH vs MOV

This distinction is critical for fake stack construction:

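PUSH: The register was saved via push reg. Its offset is implied by push order: each push moves RSP down by 8 bytes, so the last register pushed sits at the lowest address, directly above the local allocation.
MOV (save): The register was saved via mov [rsp+offset], reg. The offset is encoded explicitly in the unwind code and can be anywhere in the frame, including the caller's home space above the return address.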

When forging a stack, PUSH-saved registers must appear in the correct sequential order, while MOV-saved registers just need to be at the right offset.
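
The offset arithmetic behind the layout shown earlier can be reproduced by hand. The sketch below is a hypothetical helper, not part of Kagura: the name `layout` and its parameters are assumptions made for illustration. It takes the pushes in prologue order, the size of the local allocation, and any MOV saves, and computes where each item lands relative to the final RSP.

```python
def layout(pushes, alloc, mov_saves=()):
    """Compute slot offsets relative to RSP after the prologue.

    pushes:    register names in the order the prologue pushes them
    alloc:     bytes reserved by `sub rsp, N`
    mov_saves: (register, offset) pairs with explicit offsets
    """
    slots = {}
    # The last push sits right above the local allocation; each earlier
    # push sits 8 bytes higher.
    offset = alloc
    for reg in reversed(pushes):
        slots[offset] = f"Saved {reg} (PUSH)"
        offset += 8
    # The return address was pushed by `call`, before any prologue push.
    return_addr = offset
    slots[return_addr] = "Return address"
    # MOV saves carry their own offsets and do not affect the others.
    for reg, off in mov_saves:
        slots[off] = f"Saved {reg} (MOV)"
    return return_addr, slots


if __name__ == "__main__":
    ret, slots = layout(
        ["RBP", "R14", "R15"],
        0x60,
        [("RBX", 0x80), ("RSI", 0x88), ("RDI", 0x90), ("R12", 0x98)],
    )
    for off in sorted(slots, reverse=True):
        print(f"RSP+0x{off:04X} | {slots[off]}")
```

With three pushes and a 0x60 allocation, the return address lands at 0x60 + 3*8 = 0x78, and the frame size including the return address is 0x80, which agrees with the CreateFileA example.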

Syscall Stubs

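Functions such as NtCreateFile and NtAllocateVirtualMemory in ntdll are syscall stubs with minimal or zero unwind codes. Their frame is typically just 8 bytes (return address only). This is expected: the stub is essentially mov r10, rcx; mov eax, SSN; syscall; ret, with no stack frame setup. On x64, a leaf function like this may have no RUNTIME_FUNCTION entry at all, in which case the unwinder reads the return address directly from [RSP].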

JSON Output Format

{
  "module": "kernelbase.dll",
  "base_address": "0x7FFE7F200000",
  "function": {
    "name": "CreateFileA",
    "display_name": "CreateFileA (0x000045AB0)",
    "rva": "0x000045AB0",
    "size": 128,
    "prolog_size": 31,
    "frame": {
      "total_size": 128,
      "return_addr_offset": "0x78",
      "frame_register": null,
      "frame_reg_offset": 0,
      "slots": [
        {"offset": "0x0", "size": 8, "type": "LOCAL_ALLOC", "label": "Local/Shadow (0x60)"},
        {"offset": "0x60", "size": 8, "type": "SAVED_REG", "label": "Saved R15 (PUSH)"},
        {"offset": "0x68", "size": 8, "type": "SAVED_REG", "label": "Saved R14 (PUSH)"},
        {"offset": "0x70", "size": 8, "type": "SAVED_REG", "label": "Saved RBP (PUSH)"},
        {"offset": "0x78", "size": 8, "type": "RETURN_ADDR", "label": "Return address (@0x78)"},
        {"offset": "0x80", "size": 8, "type": "SAVED_REG", "label": "Saved RBX (MOV @0x80)"},
        {"offset": "0x88", "size": 8, "type": "SAVED_REG", "label": "Saved RSI (MOV @0x88)"},
        {"offset": "0x90", "size": 8, "type": "SAVED_REG", "label": "Saved RDI (MOV @0x90)"},
        {"offset": "0x98", "size": 8, "type": "SAVED_REG", "label": "Saved R12 (MOV @0x98)"}
      ],
      "saved_registers": [
        {"reg": "R12", "offset": "0x98", "method": "MOV"},
        {"reg": "RDI", "offset": "0x90", "method": "MOV"},
        {"reg": "RSI", "offset": "0x88", "method": "MOV"},
        {"reg": "RBX", "offset": "0x80", "method": "MOV"},
        {"reg": "R15", "offset": "0x60", "method": "PUSH"},
        {"reg": "R14", "offset": "0x68", "method": "PUSH"},
        {"reg": "RBP", "offset": "0x70", "method": "PUSH"}
      ],
      "unwind": {
        "version": 1,
        "flags": 0,
        "code_count": 12,
        "has_exception_handler": false,
        "has_termination_handler": false,
        "chain_depth": 0
      }
    }
  }
}
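
The exported JSON is straightforward to post-process. The following sketch reads a trimmed copy of the example above and lists the saved registers from lowest to highest offset. The helper name `summarize` is hypothetical; the field names come from the example. Note that offsets are hex strings while sizes are plain integers, so the offsets need an explicit base-16 conversion.

```python
import json

# Trimmed copy of the example export shown above.
SAMPLE = """
{
  "module": "kernelbase.dll",
  "function": {
    "name": "CreateFileA",
    "frame": {
      "total_size": 128,
      "return_addr_offset": "0x78",
      "saved_registers": [
        {"reg": "R12", "offset": "0x98", "method": "MOV"},
        {"reg": "RBX", "offset": "0x80", "method": "MOV"},
        {"reg": "R15", "offset": "0x60", "method": "PUSH"},
        {"reg": "RBP", "offset": "0x70", "method": "PUSH"}
      ]
    }
  }
}
"""


def summarize(doc):
    """Return the function name, return address offset, and saved
    registers as (offset, register, method) tuples sorted by offset."""
    func = doc["function"]
    frame = func["frame"]
    regs = sorted(
        (int(r["offset"], 16), r["reg"], r["method"])
        for r in frame["saved_registers"]
    )
    return {
        "name": f'{doc["module"]}!{func["name"]}',
        "return_addr_offset": int(frame["return_addr_offset"], 16),
        "registers": regs,
    }


if __name__ == "__main__":
    info = summarize(json.loads(SAMPLE))
    print(info["name"], hex(info["return_addr_offset"]))
    for off, reg, method in info["registers"]:
        print(f"  RSP+0x{off:04X}  {reg:<4} {method}")
```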

Technical Details

How It Works

Module loading: Each DLL is loaded via LoadLibraryExW with DONT_RESOLVE_DLL_REFERENCES (no DllMain execution, no import resolution)
Export parsing: The PE export table is walked to resolve function names. Exports are sorted by RVA for nearest-name binary search.
.pdata parsing: The exception directory (DataDirectory[3]) contains the RUNTIME_FUNCTION array. Each entry maps a function's RVA range to its UNWIND_INFO.
Frame reconstruction: Each UNWIND_INFO's UNWIND_CODE array is replayed to determine the exact stack frame layout. All nine documented version 1 unwind opcodes are supported (see Supported UNWIND_CODE Operations).
CHAININFO: Chained unwind info is followed recursively (up to 32 levels deep).
Display: Results are indexed in memory and presented via an ANSI/VT100 TUI or exported as JSON.
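
The .pdata lookup and nearest-name steps above can be sketched as follows. This is an illustration in Python rather than Kagura's C code, and the function names and sample values are hypothetical. It relies on two documented facts: on x64 each RUNTIME_FUNCTION is three little-endian 32-bit RVAs (BeginAddress, EndAddress, UnwindInfoAddress), and the array is sorted by BeginAddress, which makes binary search possible.

```python
import bisect
import struct

ENTRY_SIZE = 12  # three 32-bit RVAs per RUNTIME_FUNCTION


def parse_pdata(blob):
    """Split raw exception-directory bytes into (begin, end, unwind) tuples."""
    usable = len(blob) - len(blob) % ENTRY_SIZE
    return [struct.unpack_from("<III", blob, off)
            for off in range(0, usable, ENTRY_SIZE)]


def find_entry(entries, rva):
    """Binary-search the sorted entries for the one whose range covers rva."""
    starts = [e[0] for e in entries]
    i = bisect.bisect_right(starts, rva) - 1
    if i >= 0 and entries[i][0] <= rva < entries[i][1]:
        return entries[i]
    return None  # no entry: a leaf function or a gap between functions


def nearest_export(exports, rva):
    """exports: (rva, name) pairs sorted by rva. Returns name or name+0xN."""
    i = bisect.bisect_right([e[0] for e in exports], rva) - 1
    if i < 0:
        return None
    base, name = exports[i]
    return name if rva == base else f"{name}+0x{rva - base:X}"


if __name__ == "__main__":
    # Made-up sample data, not taken from any real DLL.
    blob = b"".join(struct.pack("<III", *e) for e in [
        (0x1000, 0x1080, 0x9000),
        (0x1080, 0x10C0, 0x9010),
        (0x2000, 0x2040, 0x9020),
    ])
    entries = parse_pdata(blob)
    exports = [(0x1000, "FuncA"), (0x2000, "FuncB")]
    print(find_entry(entries, 0x1090), nearest_export(exports, 0x1090))
```

Functions that are not exported still get a label this way (for example `FuncA+0x80`), which is why the nearest preceding export is used instead of requiring an exact match.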

Supported UNWIND_CODE Operations

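OpCode	Operation	Frame Effect
0	`UWOP_PUSH_NONVOL`	`push reg`: RSP -= 8
1	`UWOP_ALLOC_LARGE`	`sub rsp, N`: large allocation; N is in the next slot (scaled by 8) or the next two slots (unscaled)
2	`UWOP_ALLOC_SMALL`	`sub rsp, (info*8)+8`: small allocation (8 to 128 bytes)
3	`UWOP_SET_FPREG`	`lea fpreg, [rsp+off]`: set frame pointer register (off = FrameOffset*16)
4	`UWOP_SAVE_NONVOL`	`mov [rsp+off], reg`: save at explicit offset (16-bit slot, scaled by 8)
5	`UWOP_SAVE_NONVOL_FAR`	Same, unscaled 32-bit offset
8	`UWOP_SAVE_XMM128`	`movaps [rsp+off], xmm`: save XMM register (16 bytes; 16-bit slot, scaled by 16)
9	`UWOP_SAVE_XMM128_FAR`	Same, unscaled 32-bit offset
10	`UWOP_PUSH_MACHFRAME`	Machine frame pushed by the CPU on an interrupt or exception (with or without error code)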
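
To make the slot encoding concrete, here is a minimal decoder for the opcodes in the table. It is a sketch in Python rather than Kagura's C implementation, and `decode_unwind_info` and the sample bytes are hypothetical (hand-assembled for illustration, not dumped from a real DLL). It follows the documented x64 UNWIND_INFO layout: a 4-byte header followed by 2-byte UNWIND_CODE slots, with some opcodes consuming one or two extra slots for their operand.

```python
import struct

REGS = ["RAX", "RCX", "RDX", "RBX", "RSP", "RBP", "RSI", "RDI"] + \
       [f"R{i}" for i in range(8, 16)]


def decode_unwind_info(data):
    """Decode the header and unwind codes of a version 1 UNWIND_INFO."""
    version, flags = data[0] & 0x7, data[0] >> 3
    prolog_size, count = data[1], data[2]
    ops, i = [], 0
    while i < count:
        base = 4 + i * 2                       # codes start after the header
        code_off = data[base]                  # prologue offset after the instruction
        op, info = data[base + 1] & 0xF, data[base + 1] >> 4
        if op == 0:
            ops.append((code_off, "PUSH_NONVOL", REGS[info])); i += 1
        elif op == 1 and info == 0:            # size/8 in next slot
            size = struct.unpack_from("<H", data, base + 2)[0] * 8
            ops.append((code_off, "ALLOC_LARGE", size)); i += 2
        elif op == 1:                          # unscaled size in next two slots
            size = struct.unpack_from("<I", data, base + 2)[0]
            ops.append((code_off, "ALLOC_LARGE", size)); i += 3
        elif op == 2:
            ops.append((code_off, "ALLOC_SMALL", info * 8 + 8)); i += 1
        elif op == 3:
            ops.append((code_off, "SET_FPREG", None)); i += 1
        elif op in (4, 8):                     # scaled 16-bit offset
            scale = 8 if op == 4 else 16
            off = struct.unpack_from("<H", data, base + 2)[0] * scale
            reg = REGS[info] if op == 4 else f"XMM{info}"
            name = "SAVE_NONVOL" if op == 4 else "SAVE_XMM128"
            ops.append((code_off, name, (reg, off))); i += 2
        elif op in (5, 9):                     # unscaled 32-bit offset
            off = struct.unpack_from("<I", data, base + 2)[0]
            reg = REGS[info] if op == 5 else f"XMM{info}"
            name = "SAVE_NONVOL_FAR" if op == 5 else "SAVE_XMM128_FAR"
            ops.append((code_off, name, (reg, off))); i += 3
        elif op == 10:
            ops.append((code_off, "PUSH_MACHFRAME", info)); i += 1
        else:
            raise ValueError(f"unhandled unwind opcode {op}")
    return {"version": version, "flags": flags,
            "prolog_size": prolog_size, "ops": ops}


# Hand-built sample: push rbp; push r14; push r15; sub rsp, 0x60;
# mov [rsp+0x80], rbx. Codes are stored in reverse prologue order.
SAMPLE = bytes.fromhex("01110600" "1134" "1000" "09b2" "05f0" "03e0" "0150")

if __name__ == "__main__":
    for entry in decode_unwind_info(SAMPLE)["ops"]:
        print(entry)
```

The `i += 2` and `i += 3` steps are the part that is easy to get wrong: CountOfCodes counts slots, not operations, so operand slots must be skipped rather than decoded as opcodes.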

Architecture

kagura-stackwalker/
├── include/
│   └── kagura.h            # Shared types and constants
├── src/
│   ├── main.c              # Entry point, CLI parsing
│   ├── kagura_data.c       # Global constant arrays
│   ├── pe_parser.c/.h      # PE loading, export parsing, .pdata extraction
│   ├── frame_recon.c/.h    # UNWIND_INFO decoding, frame reconstruction
│   ├── module_db.c/.h      # Module database, search indexing
│   ├── tui.c/.h            # ANSI/VT100 rendering engine
│   ├── tui_views.c/.h      # Module list, function list, frame detail views
│   ├── output.c/.h         # JSON and console export
│   └── utils.c/.h          # String utilities
├── external/
│   └── zydis/              # Zydis disassembler (git submodule)
├── CMakeLists.txt
└── CMakePresets.json

Roadmap (v2)

Gadget finder: score functions for stack spoofing compatibility
Chain builder: automatically assemble multi-frame fake stacks
Stack forge generator: output ready-to-use C structs / ASM
Walk simulator: validate forged stacks against EDR-style checks
Prologue disassembly via Zydis in frame detail view
Export to C headers and MASM/NASM
Custom module loading via CLI argument

Dependencies

Zydis: Fast x86/x64 disassembler library (MIT license, included as git submodule)

License

MIT

Credits

ENENRA
