MetaGraph processes untrusted binary bundles and user-provided graph data, making it a critical security boundary. This document identifies attack vectors, assets, trust boundaries, and mitigations for the MetaGraph core library.
Security Goals: Confidentiality, Integrity, Availability
Primary Threats: Malicious bundles, memory corruption, denial of service
Trust Boundary: MetaGraph library ↔ Bundle files and user input
- Process Memory - Prevent corruption, information leakage
- System Resources - CPU, memory, file handles, disk space
- Data Integrity - Graph consistency, bundle authenticity
- Application Availability - Prevent crashes, infinite loops, resource exhaustion
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│  Host Process   │────│ MetaGraph Core  │────│  Bundle Files   │
│    (Trusted)    │    │ (Trust Boundary)│    │   (Untrusted)   │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │
                       ┌─────────────────┐
                       │   User Input    │
                       │   (Untrusted)   │
                       └─────────────────┘
Attacker Goal: Bypass validation, trigger buffer overflows
Attack Vector: Modified magic numbers, invalid sizes, corrupted checksums
Impact: Memory corruption, crashes, potential RCE
Mitigations:
- ✅ Comprehensive header validation before processing
- ✅ BLAKE3 cryptographic integrity verification
- ✅ Size bounds checking against available memory/disk
- ✅ Format UUID validation for version compatibility
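The validate-before-processing order can be sketched as follows. This is a minimal illustration, not MetaGraph's actual format: the header layout, `MG_MAGIC`, `MG_MAX_BUNDLE_SIZE`, and `mg_header_validate` are hypothetical names, the fields are assumed to be in host byte order, and the BLAKE3 and UUID checks are omitted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical values for illustration; not the real MetaGraph format. */
#define MG_MAGIC           UINT32_C(0x4D475248)
#define MG_MAX_BUNDLE_SIZE (UINT64_C(1) << 32)

typedef struct {
    uint32_t magic;        /* format identifier */
    uint32_t header_size;  /* bytes occupied by this header */
    uint64_t total_size;   /* declared size of the whole bundle */
} mg_bundle_header;

/* Returns true only if every header field is consistent with the number
 * of bytes actually available (file_size). Nothing is allocated here. */
bool mg_header_validate(const void *data, uint64_t file_size)
{
    mg_bundle_header h;
    if (data == NULL || file_size < sizeof h)
        return false;
    memcpy(&h, data, sizeof h);            /* copy out to avoid unaligned reads */
    if (h.magic != MG_MAGIC)
        return false;
    if (h.header_size < sizeof h || h.header_size > h.total_size)
        return false;
    if (h.total_size > MG_MAX_BUNDLE_SIZE)
        return false;
    return h.total_size == file_size;      /* declared size must match reality */
}
```

Because the declared size is compared against the real file size, a bundle that claims one size but carries another is rejected before any allocation happens.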
Attacker Goal: Forge valid checksums for malicious data
Attack Vector: Exploit hash algorithm weaknesses
Impact: Bypass integrity checks, corrupt graph data
Mitigations:
- ✅ BLAKE3 is not vulnerable to length-extension attacks (unlike SHA-1, SHA-256, and SHA-512)
- ✅ Separate header and content hashes prevent cross-contamination
- ✅ Hash verification before any data processing
Attacker Goal: Trigger integer wraparound in memory calculations
Attack Vector: Large size values causing allocation wraparound
Impact: Buffer overflows, memory corruption
Mitigations:
- ✅ Explicit overflow checking using C23 `ckd_add()` functions
- ✅ Maximum size limits enforced at bundle load time
- ✅ 64-bit size fields make wraparound impractical for legitimate sizes; attacker-chosen values are still checked explicitly
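A minimal sketch of overflow-checked size computation with the C23 `<stdckdint.h>` macros. The function name `mg_alloc_size` and the cap `MG_MAX_ALLOC` are assumptions made for illustration. On toolchains without the C23 header, the sketch falls back to the GCC/Clang overflow builtins, which have the same "returns true on overflow" contract.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__has_include)
#  if __has_include(<stdckdint.h>)
#    include <stdckdint.h>   /* C23 checked arithmetic */
#  endif
#endif
#ifndef ckd_add
/* Fallback for pre-C23 toolchains: GCC/Clang builtins, same contract. */
#  define ckd_add(r, a, b) __builtin_add_overflow((a), (b), (r))
#  define ckd_mul(r, a, b) __builtin_mul_overflow((a), (b), (r))
#endif

/* Hypothetical cap standing in for the load-time maximum size limit. */
#define MG_MAX_ALLOC (UINT64_C(1) << 30)

/* Computes header_bytes + count * elem_size. Returns false on wraparound
 * or if the result exceeds MG_MAX_ALLOC; *out is written only on success. */
bool mg_alloc_size(uint64_t header_bytes, uint64_t count,
                   uint64_t elem_size, uint64_t *out)
{
    uint64_t payload, total;
    if (ckd_mul(&payload, count, elem_size))
        return false;
    if (ckd_add(&total, header_bytes, payload))
        return false;
    if (total > MG_MAX_ALLOC)
        return false;
    *out = total;
    return true;
}
```

The multiplication is checked as well as the addition, since element counts taken from a bundle are as attacker-controlled as byte sizes.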
Attacker Goal: Access memory outside allocated regions
Attack Vector: Invalid section offsets pointing beyond bundle boundaries
Impact: Segmentation faults, information disclosure
Mitigations:
- ✅ Bounds checking for all section offsets against total bundle size
- ✅ Memory mapping with guard pages to catch offset errors
- ✅ Pointer validation before dereference in hot paths
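The section bounds check can be sketched like this. The function name is hypothetical. The detail that matters is that `offset + length` is never computed, so the check itself cannot wrap.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if [offset, offset + length) lies inside a bundle of bundle_size
 * bytes. Uses subtraction so the sum is never formed and cannot overflow. */
bool mg_section_in_bounds(uint64_t offset, uint64_t length,
                          uint64_t bundle_size)
{
    return offset <= bundle_size && length <= bundle_size - offset;
}
```

A naive `offset + length <= bundle_size` would accept a huge `offset` paired with a small `length` once the sum wraps past zero.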
Attacker Goal: Overwrite adjacent memory structures
Attack Vector: Asset content larger than declared size
Impact: Code execution, privilege escalation
Mitigations:
- ✅ Strict bounds checking in all copy operations
- ✅ AddressSanitizer validation in debug builds
- ✅ Safe string handling using `strncpy_s()` equivalents
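One possible shape for bounds-checked copy helpers is sketched below. `strncpy_s()` belongs to the optional Annex K of the C standard, which many C libraries do not provide, so the sketch uses only `memchr` and `memcpy`. The helper names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Copies src_len bytes into dst only if they fit; never truncates silently. */
bool mg_copy_checked(void *dst, size_t dst_cap, const void *src, size_t src_len)
{
    if (dst == NULL || src == NULL || src_len > dst_cap)
        return false;
    memcpy(dst, src, src_len);
    return true;
}

/* Copies a C string including its terminator. If the string does not fit
 * in dst_cap bytes, dst is set to the empty string and false is returned. */
bool mg_strcpy_checked(char *dst, size_t dst_cap, const char *src)
{
    if (dst == NULL || src == NULL || dst_cap == 0)
        return false;
    const char *end = memchr(src, '\0', dst_cap);  /* terminator within capacity? */
    if (end == NULL) {
        dst[0] = '\0';
        return false;
    }
    memcpy(dst, src, (size_t)(end - src) + 1);
    return true;
}
```

Failing instead of truncating is a deliberate choice in this sketch: a silently shortened asset name or path can itself become a logic bug.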
Attacker Goal: Access freed memory containing sensitive data
Attack Vector: Concurrent graph modifications during traversal
Impact: Information disclosure, corruption, crashes
Mitigations:
- ✅ Reference counting for shared graph nodes
- ✅ RCU-style memory reclamation for lock-free operations
- ✅ Memory poisoning in debug builds to catch UAF early
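A minimal sketch of reference counting for a shared node, using C11 atomics. The `mg_node` type and function names are hypothetical. It covers only the counting itself: callers must already hold a reference before calling retain, and RCU-style deferred reclamation is not shown.

```c
#include <stdatomic.h>
#include <stdlib.h>

typedef struct mg_node {
    atomic_uint refcount;
    int payload;
} mg_node;

/* Creates a node owned by the caller (refcount starts at 1). */
mg_node *mg_node_create(int payload)
{
    mg_node *n = calloc(1, sizeof *n);
    if (n != NULL) {
        atomic_init(&n->refcount, 1);
        n->payload = payload;
    }
    return n;
}

/* Caller must already hold a reference to n. */
void mg_node_retain(mg_node *n)
{
    atomic_fetch_add_explicit(&n->refcount, 1, memory_order_relaxed);
}

/* Drops one reference. Returns 1 if this call freed the node, else 0. */
int mg_node_release(mg_node *n)
{
    /* acq_rel: writes made by other owners are visible before the free. */
    if (atomic_fetch_sub_explicit(&n->refcount, 1, memory_order_acq_rel) == 1) {
        free(n);
        return 1;
    }
    return 0;
}
```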
Attacker Goal: Trigger memory allocator corruption
Attack Vector: Error conditions causing multiple cleanup attempts
Impact: Heap corruption, potential RCE
Mitigations:
- ✅ Consistent ownership patterns with RAII cleanup
- ✅ Memory debugging with mimalloc's double-free detection
- ✅ Automated static analysis with ownership tracking
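C has no built-in RAII, so one common way to get a consistent ownership pattern is a single cleanup path that frees through a pointer-to-pointer and clears it. The sketch below uses hypothetical names and makes cleanup idempotent: a repeated cleanup attempt on an error path sees `NULL`, and `free(NULL)` is defined to do nothing.

```c
#include <stdlib.h>

typedef struct {
    unsigned char *nodes;
    unsigned char *edges;
} mg_graph_buffers;

/* Frees *p and clears it so a second cleanup pass is harmless. */
static void mg_free_and_null(unsigned char **p)
{
    free(*p);
    *p = NULL;
}

/* Safe to call more than once on the same object. */
void mg_buffers_destroy(mg_graph_buffers *b)
{
    if (b == NULL)
        return;
    mg_free_and_null(&b->nodes);
    mg_free_and_null(&b->edges);
}

/* Every failure jumps to one cleanup site. Returns 0 on success, -1 on error. */
int mg_buffers_init(mg_graph_buffers *b, size_t node_bytes, size_t edge_bytes)
{
    b->nodes = NULL;
    b->edges = NULL;
    b->nodes = malloc(node_bytes);
    if (b->nodes == NULL)
        goto fail;
    b->edges = malloc(edge_bytes);
    if (b->edges == NULL)
        goto fail;
    return 0;
fail:
    mg_buffers_destroy(b);
    return -1;
}
```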
Attacker Goal: Exhaust system memory or CPU
Attack Vector: Bundles with millions of nodes/edges
Impact: System unresponsiveness, OOM crashes
Mitigations:
- ✅ Configurable memory limits enforced by memory pools
- ✅ Lazy loading of graph sections to limit initial memory usage
- ✅ Memory pressure callbacks for graceful degradation
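A configurable memory limit can be sketched as a budget that every allocation is charged against. The `mg_budget` type and its functions are hypothetical; a real pool would also need thread safety and the pressure callbacks mentioned above.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    size_t limit;  /* configured ceiling in bytes */
    size_t used;   /* bytes currently charged; invariant: used <= limit */
} mg_budget;

/* Returns zeroed memory, or NULL if the request would exceed the budget. */
void *mg_budget_alloc(mg_budget *b, size_t n)
{
    if (n == 0 || n > b->limit - b->used)  /* subtraction cannot wrap */
        return NULL;
    void *p = calloc(1, n);
    if (p != NULL)
        b->used += n;
    return p;
}

/* Releases p and returns its n bytes to the budget. */
void mg_budget_free(mg_budget *b, void *p, size_t n)
{
    if (p == NULL)
        return;
    free(p);
    b->used -= n;
}
```

An oversized request fails with `NULL` before the system allocator is touched, which gives the caller a chance to degrade gracefully instead of hitting an OOM kill.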
Attacker Goal: Trigger worst-case algorithm performance
Attack Vector: Carefully crafted graphs causing O(n²) behavior
Impact: CPU exhaustion, application timeouts
Mitigations:
- ✅ Hash table load factor monitoring to prevent O(n) lookups
- ✅ Timeout mechanisms for graph traversal operations
- ✅ Cycle detection to prevent infinite loops
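As an illustration of cycle detection, the sketch below applies Kahn's algorithm to a plain edge list. The `mg_edge` type and `mg_has_cycle` are hypothetical and this is not necessarily the algorithm MetaGraph uses. For brevity it rescans the edge list for each removed node, which costs O(V·E); a production version would use adjacency lists to reach O(V+E).

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct { size_t from, to; } mg_edge;

/* Kahn's algorithm: repeatedly remove nodes with in-degree 0. Nodes that
 * are never removed lie on or downstream of a cycle.
 * Returns 1 if cyclic, 0 if acyclic, -1 on invalid input or out of memory. */
int mg_has_cycle(size_t node_count, const mg_edge *edges, size_t edge_count)
{
    size_t head = 0, tail = 0, removed = 0;
    size_t *indeg, *queue;
    int result = -1;

    if (node_count == 0)
        return 0;
    indeg = calloc(node_count, sizeof *indeg);  /* calloc checks n * size */
    queue = calloc(node_count, sizeof *queue);
    if (indeg == NULL || queue == NULL)
        goto done;

    for (size_t i = 0; i < edge_count; i++) {
        if (edges[i].from >= node_count || edges[i].to >= node_count)
            goto done;                          /* reject out-of-range ids */
        indeg[edges[i].to]++;
    }
    for (size_t v = 0; v < node_count; v++)
        if (indeg[v] == 0)
            queue[tail++] = v;
    while (head < tail) {
        size_t u = queue[head++];
        removed++;
        for (size_t i = 0; i < edge_count; i++)
            if (edges[i].from == u && --indeg[edges[i].to] == 0)
                queue[tail++] = edges[i].to;    /* each node enqueued once */
    }
    result = (removed != node_count);
done:
    free(indeg);
    free(queue);
    return result;
}
```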
Attacker Goal: Hang application threads indefinitely
Attack Vector: Circular references despite DAG constraints
Impact: Thread exhaustion, application freeze
Mitigations:
- ✅ Visited node tracking in all traversal algorithms
- ✅ Maximum depth limits to bound recursion
- ✅ Cooperative cancellation tokens for long operations
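The three traversal safeguards above can be combined in one routine, sketched here as a depth-first walk over a compressed adjacency layout. All type and function names are hypothetical, and the cancellation token is modeled as a plain `atomic_bool` that another thread may set.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    size_t node_count;
    const size_t *adj_start;  /* node_count + 1 offsets into adj */
    const size_t *adj;        /* concatenated successor lists */
} mg_graph_view;

typedef enum { MG_OK, MG_TOO_DEEP, MG_CANCELLED } mg_status;

typedef struct {
    bool *visited;            /* node_count entries, zeroed by the caller */
    size_t max_depth;         /* bound on recursion depth */
    atomic_bool *cancel;      /* set to true to stop the walk early */
    size_t visited_count;
} mg_walk;

/* Depth-first walk that stops on cancellation or excessive depth. */
mg_status mg_walk_from(const mg_graph_view *g, mg_walk *w,
                       size_t node, size_t depth)
{
    if (atomic_load(w->cancel))
        return MG_CANCELLED;
    if (depth > w->max_depth)
        return MG_TOO_DEEP;
    if (w->visited[node])
        return MG_OK;         /* already seen: cycles terminate here */
    w->visited[node] = true;
    w->visited_count++;
    for (size_t i = g->adj_start[node]; i < g->adj_start[node + 1]; i++) {
        mg_status s = mg_walk_from(g, w, g->adj[i], depth + 1);
        if (s != MG_OK)
            return s;
    }
    return MG_OK;
}
```

Even if a bundle smuggles in a cycle despite the DAG constraint, the visited array ensures each node is expanded at most once, and the depth limit bounds stack use on long chains.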
Attacker Goal: Extract sensitive data from process memory
Attack Vector: Uninitialized memory or padding bytes in structures
Impact: Information disclosure, privacy violation
Mitigations:
- ✅ Explicit memory initialization of all allocated structures
- ✅ Memory scanning tools to detect uninitialized reads
- ✅ Structure padding explicitly zeroed in constructors
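A minimal sketch of zeroing a structure, padding included, before filling it in. The `mg_record` type is hypothetical. One caveat: the C standard leaves padding bytes unspecified after a member store, so code that writes structures to disk or over the wire may prefer to serialize field by field. In practice, mainstream compilers leave padding untouched after a `memset`.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    uint8_t  kind;  /* compilers typically insert padding after this field */
    uint64_t id;
} mg_record;

/* Clears every byte of the object, padding included, then sets the fields. */
void mg_record_init(mg_record *r, uint8_t kind, uint64_t id)
{
    memset(r, 0, sizeof *r);
    r->kind = kind;
    r->id = id;
}
```

A brace initializer such as `= {0}` is not a substitute here, because it is only guaranteed to zero the named members.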
Attacker Goal: Infer sensitive information from operation timing
Attack Vector: Measure hash table lookup times to deduce content
Impact: Asset fingerprinting, cache attacks
Mitigations:
- ✅ Constant-time comparison functions for cryptographic hashes
- ✅ Random delays in debug builds to detect timing dependencies
- ✅ Hash table design resistant to timing analysis
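A constant-time hash comparison can be sketched as follows. The function name is hypothetical. Unlike `memcmp`, the loop has no early exit, so its running time depends on the length but not on where the first mismatch occurs.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 0 if the buffers are equal, nonzero otherwise. Every byte is
 * examined regardless of earlier differences. */
int mg_ct_compare(const uint8_t *a, const uint8_t *b, size_t len)
{
    volatile uint8_t diff = 0;  /* volatile discourages early-exit optimization */
    for (size_t i = 0; i < len; i++)
        diff |= (uint8_t)(a[i] ^ b[i]);
    return diff;
}
```

For a 32-byte BLAKE3 digest, this would be called with `len` set to 32 when checking a computed hash against the expected value.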
- Setup: Attacker provides a bundle file with corrupted header
- Attack: Bundle header declares a 1 KB size, but the file contains 1 GB of data
- Expected Defense: Header validation rejects bundle before memory allocation
- Fallback: Memory mapping fails safely, error returned to caller
- Setup: Application reads graph while another thread modifies it
- Attack: Race condition causes use-after-free on graph node
- Expected Defense: RCU prevents memory reclamation during read
- Fallback: In sanitizer-enabled debug builds, AddressSanitizer detects the UAF and aborts the process
- Setup: Bundle contains 10M nodes in a single hyperedge
- Attack: Memory allocation for edge structure exceeds system limits
- Expected Defense: Memory pool limit triggers graceful failure
- Fallback: OOM handler provides diagnostic error message
- clang-tidy: Memory safety, undefined behavior detection
- PVS-Studio: Commercial static analysis for complex vulnerabilities
- Coverity: Integer overflow and buffer overflow detection
- AddressSanitizer: Memory corruption detection during execution
- ThreadSanitizer: Race condition and data race detection
- MemorySanitizer: Uninitialized memory access detection
- libFuzzer: Structure-aware fuzzing of bundle parsing code
- AFL++: Coverage-guided fuzzing with custom dictionaries
- Property Testing: Invariant checking during fuzz campaigns
- Bundle Corpus: Collection of malformed bundles for validation
- Stress Testing: Large graphs under memory pressure
- Concurrency Testing: Race condition detection under load
- Critical: Remote code execution, privilege escalation
- High: Memory corruption, denial of service
- Medium: Information disclosure, logic errors
- Low: Performance degradation, minor leaks
- Critical: 24-hour disclosure, immediate patch
- High: 72-hour disclosure, patch within 1 week
- Medium: 30-day disclosure, patch within 1 month
- Low: Next release cycle
- Security Reports: james@flyingrobots.dev (encrypted)
- Public Disclosure: GitHub Security Advisories
- User Notification: Release notes and security bulletins
- SLSA v1.1: Build provenance and supply chain security
- CVE: Common Vulnerabilities and Exposures tracking
- CWE: Common Weakness Enumeration reference
- MISRA C: Safety-critical coding standards (subset)
- SEI CERT C: Secure coding practices
- ISO/IEC 27001: Information security management
Document Version: 1.0
Last Updated: 2025-07-20
Review Schedule: Quarterly or after security incidents
Approved By: Development Team
This threat model is a living document and should be updated as new threats emerge or system architecture changes.