hexcore-remill

N-API bindings for a HikariSystem fork of Remill — lifts machine code to LLVM IR bitcode, customized for the HexCore-Helix decompiler pipeline.

Part of HikariSystem HexCore.

Relationship to upstream Remill

This package is not a thin wrapper over upstream Remill. The N-API scaffold was originally inspired by the public Remill project (Trail of Bits), but DoLift has diverged substantially to support the needs of HexCore-Helix as an end-to-end decompiler. In the upstream Remill + Anvill pipeline, Remill lifts the binary (roughly 200–230k lines of verbose IR for a moderately sized function), and Anvill then reshapes that IR using an externally supplied specification (function boundaries, calling conventions, stack layout). We don't ship Anvill.

Instead, the scaffold in this fork has been extended to do part of what Anvill would normally do at the pre-lift stage. CFG discovery, basic-block leader collection, function-boundary detection, and jump-table target resolution are handled by Pathfinder, a sibling extension (hexcore-pathfinder) that runs before Remill and feeds the discovered leaders into LiftOptions.additionalLeaders. Everything downstream of Remill (variable recovery, stack-frame recovery, calling-convention recovery, type inference, struct-field reconstruction) is handled by Helix (HexCore-Helix), the MLIR-based decompiler engine that consumes the IR we produce here. Together, Pathfinder → this fork → Helix fill the role that Anvill + Remill fill in the Trail of Bits pipeline.

Because this is the specific configuration we target, the fork also implements desync-recovery, synthetic-NOP handling, and CFG-completeness fixes that aren't in upstream Remill. These are documented in CHANGELOG.md under the FIX-0NN tags.

Supported Architectures

Architecture	Variants
x86 (32-bit)	`x86`, `x86_avx`, `x86_avx512`
x86-64	`amd64`, `amd64_avx`, `amd64_avx512`
AArch64	`aarch64`, `aarch64_little_endian`
SPARC	`sparc32`, `sparc64`

Usage

const { RemillLifter, ARCH } = require('hexcore-remill');

const lifter = new RemillLifter(ARCH.AMD64);

// push rbp; mov rbp, rsp; pop rbp; ret
const code = Buffer.from([0x55, 0x48, 0x89, 0xe5, 0x5d, 0xc3]);
const result = lifter.liftBytes(code, 0x401000);

if (result.success) {
  console.log(result.ir);            // LLVM IR text
  console.log(result.bytesConsumed); // 6
}

lifter.close();
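
Since the lifter has to be closed explicitly, it can help to tie close() to a try/finally so it also runs when lifting throws. The following is a minimal sketch; withLifter is a hypothetical helper and not part of hexcore-remill, and it only relies on the close() method shown above.

```javascript
// Hypothetical helper (not part of hexcore-remill): runs `fn` with a lifter
// and always calls close(), even if `fn` throws.
// `createLifter` is any zero-argument factory, for example:
//   () => new RemillLifter(ARCH.AMD64)
function withLifter(createLifter, fn) {
  const lifter = createLifter();
  try {
    return fn(lifter);
  } finally {
    lifter.close();
  }
}
```

This version is synchronous only: if `fn` returns a Promise (for example from liftBytesAsync), close() would run before the lift settles, so an async variant would need to await the result inside the try block.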

Async (non-blocking)

const result = await lifter.liftBytesAsync(largeBuffer, 0x140001000);

Windows ABI context

const { RemillLifter, ARCH, OS } = require('hexcore-remill');
const lifter = new RemillLifter(ARCH.AMD64, OS.WINDOWS);

Pathfinder integration

const result = lifter.liftBytes(code, 0x3A20, {
  mode: 'ElfRelocatable',
  additionalLeaders: pathfinderLeaders,  // BB entry points from pre-lift CFG analysis
  knownFunctionEnds: pathfinderEnds,     // function boundaries (advisory)
  maxBytes: 65536,
  maxInstructions: 5000,
});

additionalLeaders is the main integration point with Pathfinder. Every address in this list is inserted into the Phase 1.5 leaders set before the lift loop runs, so basic blocks that Pathfinder discovers but purely sequential decoding would miss (jump-table targets, .pdata function entries, ELF symtab addresses) each get their own basic block in the output IR.
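
Before handing a leader list to liftBytes, it may be worth deduplicating it and dropping addresses that fall outside the buffer being lifted. The sketch below assumes that out-of-range leaders are not useful for a given call; the lifter may already tolerate them. normalizeLeaders is a hypothetical helper and belongs to neither hexcore-remill nor hexcore-pathfinder.

```javascript
// Hypothetical helper: keeps only unique, in-range leader addresses and
// returns them sorted in ascending order.
// The valid range is [baseAddress, baseAddress + byteLength).
function normalizeLeaders(leaders, baseAddress, byteLength) {
  const end = baseAddress + byteLength;
  const unique = new Set();
  for (const addr of leaders) {
    if (Number.isSafeInteger(addr) && addr >= baseAddress && addr < end) {
      unique.add(addr);
    }
  }
  // Numeric comparator: the default sort would compare addresses as strings.
  return [...unique].sort((a, b) => a - b);
}
```

The result could then be passed as `additionalLeaders: normalizeLeaders(pathfinderLeaders, 0x3A20, code.length)`.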

API

`new RemillLifter(arch, os?)`

Create a lifter for the given architecture. Loads the Remill semantics module.

arch — Architecture name (use ARCH constants)
os — OS name for ABI context (optional, defaults to 'linux')

`lifter.liftBytes(code, address, options?) → LiftResult`

Synchronous lift. Decodes and lifts instructions from the buffer, starting at address.

Pass an optional third options object to control lift limits and IR shape. Named semantic helper calls are preserved by default for downstream decompiler compatibility. Set inlineSemantics: true only when you explicitly want the semantic helper bodies inlined into the lifted function.

`lifter.liftBytesAsync(code, address, options?) → Promise<LiftResult>`

Async lift in a worker thread. Use for large buffers (>64 KB). Because maxBytes defaults to 32768, pass a larger maxBytes to lift such a buffer in one call.
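
One way to apply the size guidance above is a small wrapper that picks the sync or async entry point from the buffer length. This is only a sketch: liftAuto and the threshold constant are hypothetical names, and the 64 KB cut-off is taken from the note above rather than from any measured behavior.

```javascript
// Threshold taken from the ">64KB" guidance; adjust to taste.
const ASYNC_THRESHOLD_BYTES = 64 * 1024;

// Hypothetical wrapper (not part of hexcore-remill). Always returns a
// Promise so callers can await it regardless of which path was taken.
async function liftAuto(lifter, code, address, options) {
  if (code.length > ASYNC_THRESHOLD_BYTES) {
    return lifter.liftBytesAsync(code, address, options);
  }
  return lifter.liftBytes(code, address, options);
}
```

Callers would use it as `const result = await liftAuto(lifter, code, 0x140001000, { maxBytes: code.length });`.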

`LiftOptions`

{
  maxInstructions?: number;         // default 2000
  maxBasicBlocks?: number;          // default 500
  maxBytes?: number;                // default 32768 (32KB)
  splitAtCalls?: boolean;           // default true
  optimizeIR?: boolean;             // default true (SROA, mem2reg, InstCombine, SimplifyCFG, DCE, ADCE, DSE)
  inlineSemantics?: boolean;        // default false (preserve named semantic helper calls)
  mode?: 'Generic' | 'PE64' | 'ElfRelocatable';  // format-specific heuristics
  additionalLeaders?: number[];     // extra BB entry points from external analysis (Pathfinder)
  knownFunctionEnds?: number[];     // function end addresses (advisory, used for tail-call detection)
}
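
To see which limits apply to a given call, the documented defaults can be merged with the overrides you pass. The object below simply mirrors the comments in the LiftOptions listing; it is an illustration, not something the package exports, and effectiveOptions is a hypothetical helper. No default is listed for mode, so none is assumed here.

```javascript
// Defaults copied from the LiftOptions comments above (illustrative only).
const DOCUMENTED_DEFAULTS = Object.freeze({
  maxInstructions: 2000,
  maxBasicBlocks: 500,
  maxBytes: 32768,
  splitAtCalls: true,
  optimizeIR: true,
  inlineSemantics: false,
});

// Hypothetical helper: overrides win over the documented defaults.
function effectiveOptions(overrides = {}) {
  return { ...DOCUMENTED_DEFAULTS, ...overrides };
}
```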

`LiftResult`

{
  success: boolean;
  ir: string;              // LLVM IR text
  error: string;           // Error message if !success
  address: number;         // Start address
  bytesConsumed: number;   // Bytes consumed from input
  truncated: boolean;      // true if a limit was hit before all bytes consumed
  nextAddress: number;     // where to continue lifting (valid if truncated)
  callTargets: number[];   // discovered external CALL targets
  truncationReason: string; // "max_instructions" | "max_blocks" | "max_bytes"
  implicitParams: string[]; // registers read before written (function params)
}
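
The truncated and nextAddress fields suggest a resume loop for inputs that exceed a limit. The sketch below assumes that nextAddress falls inside the address range of the original buffer and that lifting can restart there on the remaining bytes; liftAll is a hypothetical helper, not part of the package.

```javascript
// Hypothetical helper (not part of hexcore-remill): lifts a buffer in
// chunks, resuming at result.nextAddress whenever a limit truncates the lift.
// Returns every LiftResult in order; the last one tells you why it stopped.
function liftAll(lifter, code, baseAddress, options = {}) {
  const results = [];
  let offset = 0;

  while (offset < code.length) {
    const address = baseAddress + offset;
    const result = lifter.liftBytes(code.subarray(offset), address, options);
    results.push(result);

    // Stop on failure or when the lift finished without hitting a limit.
    if (!result.success || !result.truncated) break;

    // Guard against a truncated lift that makes no forward progress.
    const nextOffset = result.nextAddress - baseAddress;
    if (nextOffset <= offset) break;
    offset = nextOffset;
  }
  return results;
}
```

Each chunk produces its own IR text, so combining the pieces is left to the caller.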

`RemillLifter.getSupportedArchs() → string[]`

Returns the list of supported architecture names.
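
The returned list can be used to validate user-supplied architecture names before constructing a lifter. This sketch assumes the names come back in the same form as the Variants column above (for example 'amd64'); requireSupportedArch is a hypothetical guard.

```javascript
// Hypothetical guard (not part of hexcore-remill). `supported` is expected
// to be the array returned by RemillLifter.getSupportedArchs().
function requireSupportedArch(arch, supported) {
  if (!supported.includes(arch)) {
    throw new RangeError(
      `Unsupported architecture "${arch}"; expected one of: ${supported.join(', ')}`
    );
  }
  return arch;
}
```

A call site might read `new RemillLifter(requireSupportedArch(name, RemillLifter.getSupportedArchs()))`.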

Pipeline Position

┌──────────────┐   ┌──────────────────┐   ┌──────────────┐   ┌──────────────┐
│ hexcore-     │   │                  │   │              │   │              │
│ disassembler │→→→│   Pathfinder     │→→→│  this fork   │→→→│    Helix     │
│ (ELF/PE)     │   │   (CFG leaders,  │   │  (Remill     │   │  (decompile  │
│              │   │    boundaries)   │   │   lifting)   │   │   to C)      │
└──────────────┘   └──────────────────┘   └──────────────┘   └──────────────┘
                             │                    │
                             └────── spec ────────┘
                          (analogous to Anvill's JSON spec in
                           the Trail of Bits reference pipeline)

Building from Source

# Prerequisites: LLVM 18 static libs, Node 18+, Windows 10 SDK + MSVC (Windows),
# clang-cl / MSVC (whichever is in your node-gyp toolchain)

# Build Remill + XED deps first (see deps/README.md or unpack remill-deps-win32-x64.zip)
npx node-gyp rebuild
npm test

A prebuilt binary is shipped at prebuilds/win32-x64/hexcore_remill.node. The loader in index.js prefers a local build/Release artifact when present (useful during development), falling back to the prebuilt otherwise.
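
The lookup order described above can be sketched as follows. This is not the actual index.js; it assumes the local build output is named hexcore_remill.node under build/Release, and resolveBindingPath is a hypothetical name.

```javascript
const fs = require('fs');
const path = require('path');

// Sketch of the lookup order: prefer a local build, else use the prebuilt.
// `exists` is injectable so the order can be checked without touching disk.
function resolveBindingPath(rootDir, exists = fs.existsSync) {
  const local = path.join(rootDir, 'build', 'Release', 'hexcore_remill.node');
  const prebuilt = path.join(rootDir, 'prebuilds', 'win32-x64', 'hexcore_remill.node');
  return exists(local) ? local : prebuilt;
}
```

A loader would then call `require(resolveBindingPath(__dirname))`.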

Dependencies

Remill — static library, patched fork (see CHANGELOG.md for the FIX-0NN series)
LLVM 18 — static libraries (Core, Support, BitReader, BitWriter, IRReader, etc.)
Intel XED — x86 instruction decoder (used by Remill and by FIX-024 desync recovery)

Important: Must use the same LLVM version as hexcore-llvm-mc (currently LLVM 18) to avoid symbol conflicts when both are loaded in the same process.

License

MIT — Copyright (c) HikariSystem.

Upstream Remill (Trail of Bits) is Apache-2.0. This fork preserves the upstream copyright notices in deps/remill/.
