smda

A minimalist recursive x86 / x64 / AArch64 disassembler library, optimized for accurate Control Flow Graph (CFG) recovery from PE, ELF, and Mach-O binaries and arbitrary memory dumps.

The output is a collection of functions, basic blocks, and instructions with their respective edges (block-to-block, function-to-function). Optionally, references to the Windows API can be inferred via the ApiScout method.

smda-rs is a Rust port of danielplohmann/smda (Python). It powers capa-rs, the Rust port of Mandiant's capability extractor.

Features

Zero-copy disassembly. BinaryInfo<'a> borrows the input bytes directly. No mapped-image allocation, no per-instruction byte clone, no DisassemblyReport.buffer.
Modern Linux ELF coverage: added GCC / clang endbr64 (F3 0F 1E FA) plus the extended GCC AMD64 prologue family (48 89 5C 24 ??, 48 83 EC ??, 41 57 41 56)$
Linux exit-syscall recognition: mov eax, 60; syscall (and exit_group / int 0x80 equivalents) now end the containing function correctly.
PE exports as candidate seeds: the export RVA list, previously only surfaced in the public report, now seeds the function-candidate scanner. Free coverage win on s$
New report fields: report.oep (original entry point VA), function.is_exported (PE only), function.stringrefs (VAs of stack-string writes — wires up the exist$
New lookups: report.find_function_by_offset(addr) / find_block_by_offset(addr).
Timeout support: Disassembler::parse_with_timeout(..., Duration) + new Error::AnalysisTimeout for batch processors of untrusted samples.
Section-table abstraction. Byte access goes through binary_info.bytes_at(va, len) -> Result<&[u8]>, which looks up the VA in a small per-binary SectionMap tabl$
Instruction slimmed down. The 0.3.x per-instruction mnemonic: String, operands: Option<String>, and bytes: String (hex) fields are gone. Use the typed iced$
Decoders are pure-Rust — iced-x86 for x86 (no C/C++ build dep, ~2–3× faster than capstone) and disarm64 for AArch64 (table-generated from the ARM ISA JSON, MIT$
Same security guards. All the checked-arithmetic, allocation caps, and bounds checks added in 0.3.0 are preserved — the pe::map_binary and elf::map_binary rewr$
Input formats: PE (32 / 64-bit), ELF (32 / 64-bit), Mach-O (Intel + ARM64, thin and fat).
Architectures: x86, x86_64, AArch64 (0.6.0+).
Function discovery: prologue scan (MSVC + GCC / clang endbr64 family + ARM64 stp x29, x30, [sp, #-N]!), call-target propagation, PE exception-handler (.pdata) seeding, PE export-table seeding.
Per-function output: basic blocks, in / out references, API calls (ApiScout — embedded Win7 + WinXP DBs), stack-string refs, block-to-block edges, is_exported, PIC + opcode hashes, dominator tree + nesting depth.
Report-level: oep, find_function_by_offset / find_block_by_offset lookups, per-disassembly timeout via parse_with_timeout.

Architecture-aware decoding (0.6.0)

The decoder lives behind a small Decoder trait with two backends:

X86Decoder — wraps iced_x86. Variable-width, 32 / 64-bit modes. Same x86 path as 0.5.x; zero behavioural change.
Aarch64Decoder — wraps disarm64. Fixed 4-byte instructions, 64-bit only. Validated at 98%+ clean memory-operand extraction on real Apple-silicon ARM64 binaries (Rust release builds, /bin/ls) before integration.

Smda decides which decoder to use. The caller passes &[u8]; smda inspects the header and routes:

ELF e_machine == EM_AARCH64 (183) → AArch64.
PE coff_header.machine == 0xAA64 → AArch64.
Mach-O cputype == CPU_TYPE_ARM64 (0x100000C) → AArch64. For fat (universal) binaries, the slice preference is configurable via SmdaConfig::macho_arch_preference: default is HostNative (picks the slice matching the host machine — ARM64 on Apple-silicon, x86_64 on Intel/AMD Linux/Windows), with explicit Aarch64First / X86_64First / X86First overrides for analysts who want consistent slice selection regardless of host.
Everything else falls through to the existing x86 32/64-bit detection.

DecodedInsn is an enum (X86(IcedInsn) / Aarch64(ArmInsn)); the typed accessors on function::Instruction (mnemonic_enum, op_kind, memory_base, flow_control, is_call, is_jmp, is_ret, format_mnemonic, format_operands, length, bytes_in, get_printable_len) keep their 0.5.x signatures and dispatch internally.

ARM64 function-discovery depth (0.6.5). What started as minimum-viable in 0.6.0 has been built out across 0.6.1–0.6.5 to feature parity with the x86 path on the analyser surfaces that matter:

Function candidates: entry point, PE exports, ELF dynamic symbols, ARM64 stp x29, x30, [sp, #-N]! prologue scan, PE .pdata packed-unwind sweep (RUNTIME_FUNCTION 32-bit + xdata pointer forms), Mach-O export trie.
Edge resolution: direct b/bl propagation, conditional branches (b.cond, cbz/cbnz, tbz/tbnz), ret / br block ends.
Indirect-call register tracking: multi-block backtrack across adrp + ldr + blr patterns including GOT-loaded jump-table bases.
Jump-table heuristic: A (Clang i32-delta), B (u64-absolute), C (JT8 byte-offset), D (JT16 halfword-offset).
Tail-call recognition: bare b to function-boundary addresses promoted via the tail-call analyser.
Exit-syscall recognition: Linux svc #0; x8 = 93 / 94 (exit, exit_group); macOS svc #0x80; x16 = 1 / 472 (_exit, exit_with_payload). MOVN / MOVK syscall-number tracking included.
NOP detection: in next_gap_candidate — ARM64 1f 20 03 d5 and the BTI / PAC NOP-aliases.
Function::is_api_thunk: 3-instruction adrp + ldr + br thunks identified.
Stack-string detection: AArch64 store-immediate sequences walked into function.stringrefs.

Mach-O API resolution (0.6.4 + 0.6.5). Bind / lazy-bind opcode stream is walked for the __DATA,__got / __DATA,__la_symbol_ptr slot VAs (covers ADRP+LDR+BLR register-indirect patterns), and LC_DYSYMTAB.indirectsymoff is walked manually against the section-header table for __TEXT,__stubs (covers direct bl _stub calls, the most common ARM64 PIC call form). Both feed disassembly.apis and addr_to_api so consumers like capa-rs see API features on Mach-O input the same way they do on PE / ELF.

x86 / x86_64 binaries are unaffected by all of the above — same code, same output as 0.5.x.

Quick start

Add to your Cargo.toml:

[dependencies]
smda = "0.6"

Then disassemble a file:

use smda::{Disassembler, SmdaConfig};

fn main() -> smda::Result<()> {
    // Load the file yourself — the report borrows from this buffer
    // for the lifetime `'a`, so it must outlive the report.
    let buf = std::fs::read("Sample.exe")?;

    // 0.5.0: positional bool args were replaced by SmdaConfig so new
    // analysis knobs land without further API breaks. Every field has
    // a sensible default; chain only what you need.
    let cfg = SmdaConfig::new()
        .path("Sample.exe")
        .high_accuracy(false)        // slower, finds more functions
        .resolve_tailcalls(false);   // promote tail-call targets to functions

    let report = Disassembler::parse(&buf, &cfg)?;

    println!("format       : {:?}", report.format);
    println!("architecture : {:?}", report.architecture);
    println!("bitness      : {}", report.bitness);
    println!("base addr    : 0x{:x}", report.base_addr);
    println!("functions    : {}", report.functions.len());

    for (addr, func) in report.get_functions()?.iter().take(5) {
        let blocks = func.get_blocks()?;
        let insns  = func.get_num_instructions()?;
        println!("  0x{:08x}  {} blocks, {} insns", addr, blocks.len(), insns);
    }
    Ok(())
}

For raw memory dumps (shellcode, unpacked modules). parse_buffer is x86 / x64 only — the caller picks bitness (32 or 64) and the buffer is decoded with iced. ARM64 shellcode needs to be wrapped in an ELF / Mach-O / PE header so the existing file-format routing can pick the AArch64 decoder; a dedicated parse_buffer_aarch64 is intentionally not in 0.6.x (it'd be a 1:1 wrapper around the same disarm64 decoder, and the file-format wrap is trivial enough that the duplicated surface didn't justify itself):

use smda::{Disassembler, SmdaConfig};
use std::time::Duration;

let shellcode: &[u8] = &[/* … */];
let cfg = SmdaConfig::new().timeout(Duration::from_secs(10));
let report = Disassembler::parse_buffer(
    shellcode,
    0x1000,     // virtual base address
    64,         // bitness (32 or 64)
    &cfg,
)?;

Typed iced accessors

Each Instruction carries the fully-decoded iced_x86::Instruction (16 bytes, Copy) and exposes typed accessors. New code should prefer these over the on-demand string formatters — no allocation, no string parsing.

use smda::function::Instruction;
use smda::BinaryInfo;
use iced_x86::{FlowControl, Mnemonic, OpKind};

fn classify(ins: &Instruction, bi: &BinaryInfo<'_>) {
    // On-demand formatting (allocates a fresh String per call —
    // cache locally if you read it more than once per instruction).
    println!(
        "{:08x}  {:7} {}",
        ins.offset,
        ins.format_mnemonic(),
        ins.format_operands().unwrap_or_default(),
    );

    // Raw instruction bytes, borrowed from the input file (zero-copy).
    if let Ok(bytes) = ins.bytes_in(bi) {
        println!("  bytes: {}", hex::encode(bytes));
    }

    // Typed accessors — no string parsing, no allocation.
    if ins.is_call() {
        println!("  -> call");
    }
    if ins.is_conditional_jmp() {
        println!("  -> Jcc to 0x{:x}", ins.near_branch_target());
    }
    if ins.mnemonic_enum() == Mnemonic::Xor
        && ins.op_count() == 2
        && ins.op_kind(0) == OpKind::Register
        && ins.op_kind(1) == OpKind::Register
        && ins.op_register(0) == ins.op_register(1)
    {
        println!("  -> register clear ({:?})", ins.op_register(0));
    }
    if ins.flow_control() == FlowControl::Return {
        println!("  -> return");
    }
}

Requirements

Rust 1.95 or newer (2024 edition).

Why a Rust port?

smda-rs exists to give capa-rs and other Rust-side static-analysis tools a fast, dependency-light recursive disassembler without pulling in capstone, vivisect, or a Python runtime.

Used by

capa-rs — static capability extractor for PE / ELF / shellcode / .NET binaries.

License

Licensed under the MIT License.

Acknowledgements

danielplohmann/smda — original Python implementation by Daniel Plohmann and Steffen Enders.
iced-x86 — the Rust decoder powering the disassembler backend.

Name		Name	Last commit message	Last commit date
Latest commit History 69 Commits
.github/workflows		.github/workflows
assets		assets
examples		examples
src		src
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
Cargo.toml		Cargo.toml
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

smda

Features

Architecture-aware decoding (0.6.0)

Quick start

Typed iced accessors

Requirements

Why a Rust port?

Used by

License

Acknowledgements

About

Uh oh!

Releases 12

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

smda

Features

Architecture-aware decoding (0.6.0)

Quick start

Typed iced accessors

Requirements

Why a Rust port?

Used by

License

Acknowledgements

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 12

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages