Merged
71 commits
e4194f2
Initial changelog commit
malx-labs May 18, 2026
af20071
Add RVA Graph invariants for zero-size directories and raw-offset mis…
malx-labs May 18, 2026
87145e9
Ensures overlay detection is skipped when a section has no raw data: …
malx-labs May 18, 2026
e299875
Add initial PE load config parsing and validation subsystem: includes…
malx-labs May 18, 2026
050949d
Add load config directory config analysis to engine and supporting fi…
malx-labs May 19, 2026
3b05484
Edit the structural validation deterministic heuristic document
malx-labs May 19, 2026
1dfa50d
The crypto entropy payload binary does not define a valid Load Config…
malx-labs May 19, 2026
fb78589
The string obf tricks binary does not define a valid Load Config dir…
malx-labs May 19, 2026
6809b87
The malformed url binary does not define a valid Load Config director…
malx-labs May 19, 2026
e529b3f
The malformed domain binary does not define a valid Load Config direc…
malx-labs May 19, 2026
15d6aea
The malformed IP binary does not define a valid Load Config directory…
malx-labs May 19, 2026
029f3dd
The franken URL, domain, IP binary does not define a valid Load Conf…
malx-labs May 19, 2026
cc385d5
Add optional header to FakePE on one failing test
malx-labs May 19, 2026
387ca26
Now we have new anomalies firing, fixing integration tests
malx-labs May 19, 2026
41337a0
Add unit tests for complete coverage of load config dir validator
malx-labs May 19, 2026
896e04a
Add unit tests for additional parsers. Coverage now at 100%
malx-labs May 19, 2026
0a7d478
Update performance statistics
malx-labs May 19, 2026
b4577b4
Updated README collateral
malx-labs May 19, 2026
4e6c36a
Tighten pypi readme messaging
malx-labs May 19, 2026
06cb538
Add further load config dir adversarial fixtures and snapshots
malx-labs May 19, 2026
41de94b
Add remaining adversarial load config dir fixtures and add to contrac…
malx-labs May 19, 2026
aa90b17
Initial commit of fixture emitter
malx-labs May 20, 2026
6740792
EP mutations
malx-labs May 20, 2026
10f261b
Sections mutations
malx-labs May 20, 2026
30cdc10
Optional_header mutations
malx-labs May 20, 2026
b393a42
RVA_graph mutations
malx-labs May 20, 2026
913fd0e
TLS mutations
malx-labs May 20, 2026
db111ac
Signature mutations
malx-labs May 20, 2026
68bb5c2
Resources mutations
malx-labs May 20, 2026
f3ef777
Entropy mutations
malx-labs May 20, 2026
3bb63b5
Emitter main
malx-labs May 20, 2026
cc2a388
Fix emitter failure
malx-labs May 20, 2026
1850b30
Fixture checker and definitive fixture list
malx-labs May 20, 2026
f67b86a
Fix: Import missing PEFormatError in load_config parser. Discovered v…
malx-labs May 20, 2026
bbb692f
Add full PE fixture corpus (99 files) for structural validation. Incl…
malx-labs May 20, 2026
24f43f1
Reframe the aster Structural Adversarial Fixture Corpus document
malx-labs May 20, 2026
4c104ef
Fixture 000 contract: Even a trivial mutation (EP = 0) cascades into …
malx-labs May 21, 2026
4ef49d5
Fixture 001 contract: EP RVA is a wrapped negative value (0xFFFFFFFF)…
malx-labs May 21, 2026
ea2aa08
Fixture 002 contract: EP lies inside the PE headers (RVA < SizeOfHeaders
malx-labs May 21, 2026
f33cccc
Fixture 003 contract: Loader-confusion edge case: An EP that lies in …
malx-labs May 21, 2026
0bcb4ec
Fixture 004 contract: Loader-confusion edge case: The EP lies inside …
malx-labs May 21, 2026
888a7b2
Fixture 005 contract: Classic malformed-binary condition: EP lies ins…
malx-labs May 21, 2026
474f6d1
Fixture 006 contract: If a loader or tool respects , EP may point int…
malx-labs May 21, 2026
7085aaa
Fixture 007 contract: Classic edge case: The EP is inside a section t…
malx-labs May 21, 2026
0290342
Fixture 008 contract: EP is logically associated with the first secti…
malx-labs May 21, 2026
ef7a221
Fixture 009 contract: The EP is not inside any section and is beyond …
malx-labs May 21, 2026
9a53088
Fixture 010 contract
malx-labs May 21, 2026
a52e7d4
Fixture 011 contract: Classic malformed binary condition (packers, ob…
malx-labs May 21, 2026
b8b3e79
Fixture 012 contract: A section that looks like code (by name) but is…
malx-labs May 21, 2026
d834212
Fixture 013 contract: Classic fuzzing case - many other PE parsers wo…
malx-labs May 21, 2026
ac5895a
Fixture 014 contract: Same heuristics applied as the previous example.
malx-labs May 22, 2026
091f4cf
Fixture 015 contract: Discardable code that is also RWX is nonsensica…
malx-labs May 22, 2026
0a67fcf
Fixture 016 contract: This is a classic adversarial trick: some packe…
malx-labs May 22, 2026
52219e4
Fixture 017 contract: A section whose RawAddress overlaps the PE head…
malx-labs May 22, 2026
4e9705a
Remove stale emitter and checker
malx-labs May 23, 2026
c72cdba
Only perform contract tests for fixtures that have corresponding snap…
malx-labs May 23, 2026
4eb1bbf
Fixture 016 rebuild and contract test
malx-labs May 23, 2026
5dd3226
Rebuilt fixtures with corrected emitter
malx-labs May 23, 2026
0457645
Inject PEEXv1 branding into fixtures
malx-labs May 23, 2026
521273b
Revalidate heuristic output following emitter correction: empty secti…
malx-labs May 23, 2026
eb144ec
PE: Support declared NumberOfRvaAndSizes. Add explicit NumberOfRvaAnd…
malx-labs May 25, 2026
eeaa386
Feat(Fixtures): regenerate all 99 fixtures using PAAX-branded emitter…
malx-labs May 25, 2026
c110e8f
Contract tests that cover adversarial manipulations of the Data Direc…
malx-labs May 25, 2026
2c1bc94
Add load config layer 2 edge C source
malx-labs May 26, 2026
fed1ee1
Edit build commands
malx-labs May 26, 2026
6a4b6a6
Comprehensive Layer‑2 Load Config Edge‑Case Fixture
malx-labs May 26, 2026
6253262
Add v0.7.x series exec summary
malx-labs May 26, 2026
6df7209
Return project coverage to 100%
malx-labs May 26, 2026
b5eada0
Final changes to readme and performance updates
malx-labs May 26, 2026
2b475ab
Tighten up changelog for v0.7.4
malx-labs May 26, 2026
07cf492
Tighten changelog ahead of release
malx-labs May 26, 2026
159 changes: 159 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,162 @@
# **v0.7.4 — Advanced Directory Parsing & Metadata Expansion**

IOCX v0.7.4 significantly expands static PE coverage with advanced directory parsing, extended metadata extraction, and deterministic structural validation. This release improves correctness across modern compiler outputs while preserving IOCX’s static‑only, zero‑execution design.

---

## **Added**

### **New RVA‑Graph Invariants**
- **DATA_DIRECTORY_ZERO_SIZE_NONZERO_RVA**
Detects directories that simultaneously signal presence (non‑zero RVA) and absence (zero size).
Implemented with primary‑error semantics to suppress downstream mapping noise.

- **DATA_DIRECTORY_RAW_MISMATCH**
Flags directories whose RVA maps into a section’s virtual range but whose computed raw offset lies outside the section’s raw data.
Includes a dedicated reason code and validator‑level consistency check.

- **Raw‑mapping safety guard**
Prevents invalid raw‑offset calculations when sections contain no raw data.
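
As a rough illustration of the two invariants and the guard above, the checks can be sketched in Python. The `Section` type and the `check_directory` function are hypothetical names used only for this sketch and are not taken from the IOCX codebase; only the two reason-code strings come from this changelog.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Section:
    # Hypothetical minimal view of a PE section header.
    virtual_address: int
    virtual_size: int
    raw_offset: int  # PointerToRawData
    raw_size: int    # SizeOfRawData


def check_directory(rva: int, size: int, sections: list[Section]) -> Optional[str]:
    """Return a reason code for one data directory entry, or None."""
    # Non-zero RVA with zero size: report it first and stop, so later
    # mapping checks do not add noise for the same entry.
    if rva != 0 and size == 0:
        return "DATA_DIRECTORY_ZERO_SIZE_NONZERO_RVA"
    if rva == 0:
        return None  # not covered by these two invariants
    for sec in sections:
        if sec.virtual_address <= rva < sec.virtual_address + sec.virtual_size:
            # Guard: a section without raw data has no raw offset to compute.
            if sec.raw_size == 0:
                return None
            raw = sec.raw_offset + (rva - sec.virtual_address)
            if not (sec.raw_offset <= raw < sec.raw_offset + sec.raw_size):
                return "DATA_DIRECTORY_RAW_MISMATCH"
            return None
    return None  # unmapped RVAs are assumed to be reported by another check
```

Returning as soon as the zero-size case fires is one simple way to model the "primary-error" behaviour described above; the real validator may structure this differently.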

### **New Adversarial Fixtures for Directory Invariants**
- `directory_zero_size_nonzero_rva.full.exe`
- `directory_raw_mismatch.full.exe`

### **Full Load Config Directory Parsing**
- GuardCF metadata
- Security cookie
- SEH table
- Compiler‑specific layout hints
- Deterministic error handling for malformed structures

### **Load Config Adversarial Fixtures**
- `load_config_cookie_too_small.full.exe`
- `load_config_malformed_size_too_small.full.exe`
- `load_config_malformed_truncated.full.exe`
- `load_config_malformed_cookie_in_overlay.full.exe`
- `load_config_malformed_cookie_invalid.full.exe`
- `load_config_malformed_guard_cf_inconsistent.full.exe`
- `load_config_malformed_seh_invalid.full.exe`
- `load_config_malformed_size_exceeds_section.full.exe`

---

## **99 Adversarial PE Fixtures for Structural & Parser‑Behaviour Testing**

### **Entrypoint Fixtures (000–009)**
Covers malformed `AddressOfEntryPoint` conditions:
- Zero/negative EP
- EP inside headers
- EP outside `SizeOfImage`
- EP unmapped to any section
- EP in non‑executable section
- EP spanning boundaries
- EP in overlay

**Outcome:** Entrypoint validator stable and deterministic across all malformed cases.
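
The entrypoint conditions listed above can be pictured as an ordered series of range checks. The following is a minimal sketch under stated assumptions: the `Section` type, the `classify_entrypoint` function, and the returned labels are hypothetical and do not reflect IOCX's actual reason codes. `IMAGE_SCN_MEM_EXECUTE` is the standard PE section flag.

```python
from dataclasses import dataclass

IMAGE_SCN_MEM_EXECUTE = 0x20000000  # standard PE section characteristic


@dataclass(frozen=True)
class Section:
    # Hypothetical minimal view of a PE section header.
    virtual_address: int
    virtual_size: int
    characteristics: int


def classify_entrypoint(ep: int, size_of_headers: int, size_of_image: int,
                        sections: list[Section]) -> str:
    """Classify an AddressOfEntryPoint RVA (labels are illustrative only)."""
    if ep == 0:
        return "ep_zero"
    if ep >= size_of_image:
        # Also catches wrapped "negative" values such as 0xFFFFFFFF.
        return "ep_outside_image"
    if ep < size_of_headers:
        return "ep_in_headers"
    for sec in sections:
        if sec.virtual_address <= ep < sec.virtual_address + sec.virtual_size:
            if not sec.characteristics & IMAGE_SCN_MEM_EXECUTE:
                return "ep_in_non_executable_section"
            return "ok"
    return "ep_unmapped"
```

Checking in a fixed order keeps the result deterministic when one malformed value satisfies several conditions at once.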

---

### **Section Table Fixtures (010–021)**
Covers structural correctness of section headers and RVA/raw mappings:
- Out‑of‑bounds RVA
- Out‑of‑bounds raw offset
- Overlapping sections
- Unsorted sections
- `VirtualSize < RawSize`
- Misaligned boundaries
- Section extends past `SizeOfImage`
- Section mapped inside headers

**Outcome:** All anomalies correctly identified; no false positives on valid baselines.
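
Two of the section-table conditions above (unsorted and overlapping sections) reduce to comparing neighbouring headers. This is a simplified sketch, not the IOCX validator: the function name, the tuple layout, and the anomaly labels are assumptions, and it only compares adjacent entries in header order.

```python
def section_table_anomalies(sections: list[tuple[str, int, int]]) -> list[str]:
    """sections: (name, virtual_address, virtual_size) in header order."""
    anomalies = []
    for (_, va_a, size_a), (name_b, va_b, _) in zip(sections, sections[1:]):
        if va_b < va_a:
            # Header order does not follow ascending virtual addresses.
            anomalies.append(f"unsorted:{name_b}")
        elif va_b < va_a + size_a:
            # The next section starts inside the previous one's virtual range.
            anomalies.append(f"overlap:{name_b}")
    return anomalies
```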

---

### **Optional Header Fixtures (022–033)**
Covers correctness of Optional Header fields:
- Invalid `SizeOfImage` / `SizeOfHeaders`
- Invalid `FileAlignment` / `SectionAlignment`
- Magic mismatch (PE32 vs PE32+)
- Invalid subsystem / version fields
- ImageBase misalignment
- `NumberOfRvaAndSizes` too small

**Outcome:** Optional‑header validator behaves consistently; malformed fields reliably detected.

---

### **Data Directory Fixtures (034–045)**
Covers adversarial manipulations of the Data Directory Table:
- Negative RVA / size
- Zero/zero directory (valid)
- Zero RVA with non‑zero size
- Zero size with non‑zero RVA
- Directory inside headers
- Directory out of `SizeOfImage`
- Directory in overlay
- Unmapped directory
- Directory spanning sections
- Overlapping directories

**Outcome:**
All malformed cases correctly trigger the **primary structural anomaly**
`optional_header_invalid_number_of_rva_and_sizes`.
Fixture 036 (zero/zero) produces no anomalies, confirming non‑aggressive behaviour.

---

### **Overall Result for Fixtures 000–045**
- **All 46 fixtures validated**
- **No crashes or inconsistent behaviour**
- **All anomalies match intended design**
- Entrypoint, section, optional‑header, and directory validators confirmed stable

---

## **Comprehensive Layer‑2 Load Config Fixtures**

A full suite of Load Config edge‑case binaries validating compiler differences, malformed structures, and ambiguous layouts:

- **Minimal MinGW Load Config** (undersized structure detection)
- **Cookie‑Only (Valid)** (minimum‑size compliance, RVA mapping, section writability)
- **Cookie‑Only (Too Small)** (strict minimum‑size enforcement)
- **Full MSVC Load Config** (SEH, GuardCF, cookie, full‑path validation)
- **Full Clang/LLVM Load Config** (GuardCF without SEH)
- **Large Padded Load Config** (oversized, schema‑unknown layouts)
- **SEH‑Only Load Config** (partial‑structure handling)

**Outcome:**
Validates RVA/VA correctness, section‑mapping rules, minimum‑size enforcement, GuardCF consistency, SEH bounds checking, and compiler‑specific structural differences.
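
To make the cookie-related checks concrete: the Load Config directory stores the security cookie as a virtual address, so a validator has to subtract `ImageBase` before mapping it to a section. The sketch below assumes already-parsed values; `Section`, `check_security_cookie`, and the returned labels are hypothetical, while `IMAGE_SCN_MEM_WRITE` is the standard PE section flag.

```python
from dataclasses import dataclass

IMAGE_SCN_MEM_WRITE = 0x80000000  # standard PE section characteristic


@dataclass(frozen=True)
class Section:
    # Hypothetical minimal view of a PE section header.
    virtual_address: int
    virtual_size: int
    characteristics: int


def check_security_cookie(cookie_va: int, image_base: int,
                          sections: list[Section]) -> str:
    """Validate a Load Config SecurityCookie value (labels are illustrative)."""
    if cookie_va == 0:
        return "no_cookie"
    if cookie_va < image_base:
        return "cookie_invalid"  # VA below ImageBase cannot be a valid RVA
    rva = cookie_va - image_base
    for sec in sections:
        if sec.virtual_address <= rva < sec.virtual_address + sec.virtual_size:
            # The loader writes the cookie at startup, so the section
            # holding it is expected to be writable.
            if not sec.characteristics & IMAGE_SCN_MEM_WRITE:
                return "cookie_not_writable"
            return "ok"
    return "cookie_unmapped"
```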

---

## **Changed**

- Load Config validator surfaced new anomalies in contract tests:
  - Crypto Entropy Payload
  - Franken URL Domain IP
  - Malformed Domain / IP / URL
  - String Obfuscation Tricks
  - Invalid Optional Header (PE32 / PE32+)

- Internal schema now includes:
  - `number_of_rva_and_sizes`
  - `data_directories_raw`

  Supporting adversarial optional‑header edge cases.

- Optional‑header validator:
  - Now checks declared vs raw directory counts
  - FixtureSpec and emitter updated to support adversarial `NumberOfRvaAndSizes` mismatches
  - Raw vs declared count logic now fully enforced
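
One way to picture the declared-versus-raw comparison: read `NumberOfRvaAndSizes` from the Optional Header, then count how many 8-byte directory entries physically fit in the bytes that follow. This is a sketch under assumptions: `directory_counts` is a hypothetical helper, and treating "raw count" as "entries that fit in `SizeOfOptionalHeader`" is an interpretation rather than something the changelog states. The field offsets (92 for PE32, 108 for PE32+) follow the PE format.

```python
import struct

PE32_MAGIC = 0x10B
PE32_PLUS_MAGIC = 0x20B

# Offset of NumberOfRvaAndSizes inside the Optional Header. The data
# directory array starts right after it, at 8 bytes per entry.
_NUM_DIRS_OFFSET = {PE32_MAGIC: 92, PE32_PLUS_MAGIC: 108}


def directory_counts(optional_header: bytes) -> tuple[int, int]:
    """Return (declared, raw) directory counts for an Optional Header blob."""
    (magic,) = struct.unpack_from("<H", optional_header, 0)
    offset = _NUM_DIRS_OFFSET[magic]  # KeyError on an unknown magic value
    (declared,) = struct.unpack_from("<I", optional_header, offset)
    raw = (len(optional_header) - (offset + 4)) // 8
    return declared, raw


# Example: a PE32 header that declares 16 directories but only has room for 4.
header = struct.pack("<H", PE32_MAGIC) + bytes(90) + struct.pack("<I", 16) + bytes(8 * 4)
declared, raw = directory_counts(header)
mismatch = declared != raw
```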

---

## **Documentation**
- Updated RVA / Directory Anomalies table with new reason codes and behavioural notes
- Added **Design Decision: Why Only the Optional‑Header Validator Uses Raw Data Directories**

---

# v0.7.3 — Structural Correctness & Deterministic Heuristics
**Released: 2026‑05‑11**

21 changes: 11 additions & 10 deletions README-pypi.md
@@ -1,8 +1,10 @@
# **IOCX — Deterministic, Zero‑Risk IOC Extraction for Modern Security Pipelines**
### Official IOCX Project

**IOCX** is a high‑performance, deterministic static analysis engine for extracting Indicators of Compromise (IOCs) from binaries and text.
It exists for one reason: **to provide a fast, safe, predictable IOC extractor that DFIR teams and automation pipelines can trust.**
**IOCX** is a deterministic, high‑performance static analysis engine for extracting high-signal Indicators of Compromise (IOCs) from binaries, text, and logs.
It’s built for DFIR teams, SOC automation, CI/CD pipelines, and large‑scale threat‑intel ingestion.

**Why it matters:** IOCX guarantees snapshot‑stable output, zero‑risk static analysis, and predictable performance even under adversarial input — something regex‑only extractors simply can’t provide.

- **PyPI:** [https://pypi.org/project/iocx/](https://pypi.org/project/iocx/)
- **GitHub:** [https://github.com/iocx-dev/iocx](https://github.com/iocx-dev/iocx)
@@ -38,15 +40,14 @@ If you need predictable, automatable IOC extraction — IOCX is built for you.

---

## Version highlights (v0.7.3)
## Version highlights (v0.7.4)

- Major hardening of all PE structural validators
- Deterministic, snapshot‑stable output across malformed binaries
- Stronger section, entrypoint, RVA‑graph, TLS, and signature checks
- Corrected RVA→file‑offset mapping for overlay detection
- Improved entropy analysis with clearer, conservative signals
- Cleaner, consistent `ReasonCodes` across the engine
- Expanded structural + heuristic test coverage
- Full **Load Config Directory** parsing and validation
- Extended Optional Header metadata for downstream heuristics
- Structural anomaly heuristics (GuardCF, unmapped cookie, SEH issues)
- Faster, more resilient PE analysis
- Raw IOC extraction throughput unchanged (150–300 MB/s)
- No regressions beyond measurement noise across all benchmarked workloads

---

29 changes: 18 additions & 11 deletions README.md
@@ -9,7 +9,7 @@

<p align="center">
<a href="https://pypi.org/project/iocx/"><img src="https://img.shields.io/pypi/v/iocx?logo=pypi&logoColor=white"></a>
<img src="https://img.shields.io/badge/tests-851_passed-brightgreen">
<img src="https://img.shields.io/badge/tests-947_passed-brightgreen">
<img src="https://img.shields.io/badge/coverage-100%25-brightgreen">
<img src="https://img.shields.io/badge/python-3.12-blue">
<a href="https://github.com/iocx-dev/iocx/actions"><img src="https://img.shields.io/github/actions/workflow/status/iocx-dev/iocx/ci.yml?label=build"></a>
@@ -166,31 +166,31 @@ Predictable even under worst‑case adversarial load.
**150–300 MB/s** sustained throughput
Fast path — no PE parsing.

| Detector | 1 MB Time | Throughput |
|----------|-----------|------------|
| Crypto | 0.0037 s | ~270 MB/s |
| Filepaths | 0.0040 s | ~250 MB/s |
| IP | 0.0064 s | ~156 MB/s |
| Domains | 0.0033 s | ~300 MB/s |
| Detector | 1 MB Time | Throughput |
|-----------|-----------|------------|
| Crypto | 0.0037 s | ~270 MB/s |
| Filepaths | 0.0041 s | ~250 MB/s |
| IP | 0.0065 s | ~156 MB/s |
| Domains | 0.0035 s | ~300 MB/s |

---

### **2. Typical PE Files (~39 KB)**
- **0.0132 s** (typical)
- **0.0153 s** (with heuristics)
- **0.0122 s** (typical)
- **0.0145 s** (with heuristics)
- **6–15 MB/s** throughput

---

### **3. Adversarial Dense PE (1.5 MB)**
- **0.1977 s**
- **0.1921 s**
- **~7.6 MB/s** throughput
- Triggers TLS anomalies, structural anomalies, anti‑debug patterns

---

### **4. Full Engine (Non‑PE)**
- **1 MB:** 0.0411 s
- **1 MB:** 0.0387 s

---

@@ -200,6 +200,13 @@ Fast path — no PE parsing.
<summary><strong>Show Version History</strong></summary>
<br>

### **v0.7.4 — Advanced Directory Parsing**
- Full **Load Config Directory** parsing and validation
- Extended Optional Header metadata for downstream heuristics
- New GuardCF, security‑cookie, and SEH anomaly heuristics
- Faster PE analysis
- 99 PE fixtures in test suite; 46 (000–045) fully spec-validated

### **v0.7.3 — Structural Correctness & Deterministic Heuristics**
- Major hardening of all PE structural validators
- Deterministic, snapshot‑stable behaviour
96 changes: 96 additions & 0 deletions docs/performance.md
@@ -226,3 +226,99 @@ IOCX is designed to be:
- **Fast on malformed inputs**

Performance is a **core contract**, not an optimisation.

---

# **IOCX Performance Delta (v0.7.1 → Current Release)**

This table shows how performance has changed since **v0.7.1**, across all major workloads: raw IOC extraction, PE analysis, adversarial samples, and full‑engine processing.

### **Legend**
- **↑ Faster** (improvement)
- **→ Same** (no meaningful change)
- **↓ Slower** (regression)

---

## **1. Raw IOC Extraction (Text / Logs / Buffers)**

Throughput remains extremely high (150–300 MB/s). Variations are within noise.

| Detector | v0.7.1 | v0.7.4 | Delta | Verdict |
|-----------------|----------|----------|---------------|--------------|
| Crypto (1MB) | 0.0037 s | 0.0037 s | 0 | → Same |
| Domains (1MB) | 0.0033 s | 0.0035 s | +0.0002 s | → Same |
| Filepaths (1MB) | 0.0040 s | 0.0041 s | +0.0001 s | → Same |
| IP (1MB) | 0.0064 s | 0.0065 s | +0.0001 s | → Same |

**Summary:** Raw IOC extraction remains at peak speed with no regressions.

---

## **2. Typical PE Files (~39 KB)**

| Case | v0.7.1 | v0.7.4 | Delta | Verdict |
|-------------------------|----------|--------------|---------------|--------------|
| Typical PE | 0.0132 s | **0.0122 s** | **–0.0010 s** | **↑ Faster** |
| Typical PE + heuristics | 0.0153 s | **0.0145 s** | **–0.0008 s** | **↑ Faster** |

**Summary:** Clear improvements despite additional validators and structural checks.

---

## **3. Dense / Adversarial PE (1.5 MB)**

| Case | v0.7.1 | v0.7.4 | Delta | Verdict |
|----------|----------|--------------|---------------|--------------|
| Dense PE | 0.1977 s | **0.1921 s** | **–0.0056 s** | **↑ Faster** |

**Summary:** Dense PE analysis continues to get faster — a ~3% improvement.


---

## **4. Franken PE**

| Case | v0.7.1 | v0.7.4 | Delta | Verdict |
|------------|----------|--------------|---------------|--------------|
| Franken PE | 0.0020 s | **0.0014 s** | **–0.0006 s** | **↑ Faster** |

**Summary:** A substantial improvement (~30%). Franken PEs are now effectively “free.”

---

## **5. Full Engine (Non‑PE)**

| Case | v0.7.1 | v0.7.4 | Delta | Verdict |
|------------|----------|--------------|---------------|--------------|
| 1MB buffer | 0.0411 s | **0.0387 s** | **–0.0024 s** | **↑ Faster** |

**Summary:** End‑to‑end throughput improved by ~6%.
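
The percentage figures quoted in these summaries follow from the table values as `(after - before) / before`. A small sketch that reproduces the arithmetic; `percent_change` is a hypothetical helper, and the timings are copied from the tables above:

```python
def percent_change(before: float, after: float) -> float:
    """Relative change in percent; negative means faster."""
    return (after - before) / before * 100.0


# (case, v0.7.1 seconds, v0.7.4 seconds), taken from the tables above.
timings = [
    ("Dense PE", 0.1977, 0.1921),
    ("Franken PE", 0.0020, 0.0014),
    ("Full engine, 1MB buffer", 0.0411, 0.0387),
]

for name, before, after in timings:
    print(f"{name}: {percent_change(before, after):+.1f}%")
```

The results, roughly -2.8%, -30.0%, and -5.8%, agree with the "~3%", "~30%", and "~6%" figures in the summaries.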

---

## **6. Pathological / Adversarial Inputs**

| Case              | v0.7.1   | v0.7.4   | Delta     | Verdict |
|-------------------|----------|----------|-----------|---------|
| ETH‑like blob     | 0.0012 s | 0.0012 s | 0         | → Same  |
| Punycode blob     | 0.0126 s | 0.0125 s | –0.0001 s | → Same  |
| Deep UNIX path    | 0.0246 s | 0.0250 s | +0.0004 s | → Same  |
| IPv6 pathological | 0.0004 s | 0.0004 s | 0         | → Same  |

**Summary:** Performance is unchanged within measurement noise; the new validators do not affect non‑PE workloads.

---

# **Overall Summary**

- **Zero regressions across all workloads**
- **PE analysis is consistently faster**
- **Dense PEs show the largest absolute gain (–0.0056 s); Franken PEs show the largest relative gain (~30%)**
- **Full‑engine throughput improved**
- **Raw IOC extraction remains at peak speed**
- **Adversarial inputs remain unaffected by new validators**

IOCX continues to get **faster**, even as the engine becomes more robust, more defensive, and more standards‑compliant.

---