go-bindiff-struct

A powerful tool and library for semantic and structural diffing of binary files, going far beyond simple byte-by-byte comparison.

Features

  • Semantic Diffing: Understand meaningful changes in binaries by analyzing structure, layout, symbols, sections, and inferred data models
  • Multi-Format Support: Works with ELF (Linux), PE (Windows), Mach-O (macOS), and raw binaries
  • Intelligent Analysis: Detects function boundaries, structure layouts, compiler artifacts, and more
  • Symbol Tracking: Identifies added, removed, modified, and renamed symbols with similarity scoring
  • Multiple Output Formats: Text (colored), JSON, and Markdown output
  • CLI & Library: Use as a command-line tool or import as a Go library
  • Extensible Architecture: Pluggable parsers and diff engines

Installation

go install github.com/BaseMax/go-bindiff-struct/cmd/go-bindiff-struct@latest

Or build from source:

git clone https://github.com/BaseMax/go-bindiff-struct.git
cd go-bindiff-struct
go build -o bin/go-bindiff-struct ./cmd/go-bindiff-struct

CLI Usage

Basic Diff

go-bindiff-struct diff old.bin new.bin

Output Formats

# JSON output
go-bindiff-struct diff --format json old.bin new.bin

# Markdown report
go-bindiff-struct diff --format markdown old.bin new.bin > report.md

# Disable colors
go-bindiff-struct diff --no-color old.bin new.bin

Summary

go-bindiff-struct summary old.bin new.bin

Symbol Comparison

go-bindiff-struct symbols old.bin new.bin

Section Comparison

go-bindiff-struct sections old.bin new.bin

Advanced Options

# Verbose output
go-bindiff-struct diff --verbose old.bin new.bin

# Custom similarity threshold
go-bindiff-struct diff --threshold 0.8 old.bin new.bin

# Load configuration from file
go-bindiff-struct diff --config config.yaml old.bin new.bin

Library Usage

Basic Example

package main

import (
    "fmt"
    "log"
    
    "github.com/BaseMax/go-bindiff-struct/api"
    "github.com/BaseMax/go-bindiff-struct/config"
)

func main() {
    // Create a new client with default config
    client := api.NewClient(config.Default())
    
    // Diff two binaries
    result, err := client.DiffBinaries("old.bin", "new.bin")
    if err != nil {
        log.Fatal(err)
    }
    
    // Print summary
    fmt.Printf("Total Changes: %d\n", result.Summary.TotalChanges)
    fmt.Printf("Similarity: %.2f%%\n", result.Summary.SimilarityScore * 100)
    fmt.Printf("Symbols Added: %d\n", result.Summary.SymbolsAdded)
    fmt.Printf("Symbols Removed: %d\n", result.Summary.SymbolsRemoved)
}

Parse a Single Binary

package main

import (
    "fmt"
    "log"
    
    "github.com/BaseMax/go-bindiff-struct/api"
    "github.com/BaseMax/go-bindiff-struct/config"
)

func main() {
    client := api.NewClient(config.Default())
    
    // Parse a binary
    binary, err := client.ParseBinary("example.elf")
    if err != nil {
        log.Fatal(err)
    }
    
    // Print metadata
    fmt.Printf("Format: %s\n", binary.Metadata.Format)
    fmt.Printf("Architecture: %s\n", binary.Metadata.Architecture)
    fmt.Printf("Sections: %d\n", len(binary.Sections))
    fmt.Printf("Symbols: %d\n", len(binary.Symbols))
}

Custom Configuration

package main

import (
    "github.com/BaseMax/go-bindiff-struct/api"
    "github.com/BaseMax/go-bindiff-struct/config"
)

func main() {
    cfg := config.Default()
    
    // Customize settings
    cfg.Similarity.Threshold = 0.8
    cfg.Diff.IgnorePadding = true
    cfg.Output.Format = "json"
    
    client := api.NewClient(cfg)
    
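    // Use the client as in the examples above. The blank assignment
    // keeps this snippet compiling, since Go rejects unused variables.
    _ = client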
}

Architecture

The project follows a modular architecture:

go-bindiff-struct/
├── cmd/                 # CLI entrypoints
├── parser/              # Binary format parsers (ELF, PE, Mach-O, Raw)
├── model/               # Normalized semantic models
├── diff/                # Diff engines
├── analyzer/            # Heuristics & inference
├── symbols/             # Symbol extraction & matching
├── sections/            # Section-level analysis
├── similarity/          # Fuzzy matching & scoring
├── render/              # Output rendering (text, JSON, markdown)
├── config/              # Configuration handling
├── api/                 # Public Go interfaces
├── examples/            # Usage examples
└── docs/                # Technical documentation

Supported Binary Formats

  • ELF (Executable and Linkable Format) - Linux binaries
  • PE (Portable Executable) - Windows binaries
  • Mach-O (Mach Object) - macOS binaries
  • Raw - Generic binary blobs with best-effort analysis
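
As a rough illustration of how a tool might tell these formats apart, the sketch below inspects the leading magic bytes of a file. The function name `detectFormat` is hypothetical and this is not the project's parser-selection logic; it only uses the well-known magic numbers for each format and falls back to "Raw".

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// detectFormat guesses a container format from the leading magic bytes.
// Hypothetical helper for illustration; a real parser would validate far more.
func detectFormat(data []byte) string {
	// ELF files start with 0x7f 'E' 'L' 'F'.
	if len(data) >= 4 && bytes.Equal(data[:4], []byte{0x7f, 'E', 'L', 'F'}) {
		return "ELF"
	}
	// PE files start with a DOS header ("MZ"). A stricter check would also
	// follow the e_lfanew offset and look for the "PE\0\0" signature.
	if len(data) >= 2 && data[0] == 'M' && data[1] == 'Z' {
		return "PE"
	}
	if len(data) >= 4 {
		switch binary.BigEndian.Uint32(data[:4]) {
		case 0xfeedface, 0xfeedfacf, // 32-bit and 64-bit Mach-O, big-endian on disk
			0xcefaedfe, 0xcffaedfe, // the same magics, little-endian on disk
			0xcafebabe: // fat (universal) binary; Java class files share this magic
			return "Mach-O"
		}
	}
	// Anything else is treated as an unstructured blob.
	return "Raw"
}

func main() {
	samples := [][]byte{
		{0x7f, 'E', 'L', 'F', 2, 1, 1, 0},
		{'M', 'Z', 0x90, 0x00},
		{0xcf, 0xfa, 0xed, 0xfe},
		[]byte("plain data"),
	}
	for _, s := range samples {
		fmt.Printf("% x -> %s\n", s, detectFormat(s))
	}
}
```

Falling back to "Raw" rather than returning an error matches the best-effort behaviour described for generic blobs.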

Diff Types

Structural Diff

  • Section added/removed/resized
  • Segment permission changes
  • Layout reordering
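
To make the added/removed/resized categories concrete, here is a minimal sketch that compares two section lists by name. The `Section` and `SectionChange` types and the `diffSections` function are hypothetical and simplified; they are not the project's model or diff packages, and permission and ordering changes are left out.

```go
package main

import (
	"fmt"
	"sort"
)

// Section is a hypothetical, simplified section record.
type Section struct {
	Name string
	Size uint64
}

// SectionChange describes one structural difference between two binaries.
type SectionChange struct {
	Kind    string // "added", "removed", or "resized"
	Name    string
	OldSize uint64
	NewSize uint64
}

// diffSections matches sections by name and reports the differences,
// sorted by section name for stable output.
func diffSections(oldSecs, newSecs []Section) []SectionChange {
	oldByName := make(map[string]uint64, len(oldSecs))
	for _, s := range oldSecs {
		oldByName[s.Name] = s.Size
	}
	newByName := make(map[string]uint64, len(newSecs))
	for _, s := range newSecs {
		newByName[s.Name] = s.Size
	}

	var changes []SectionChange
	for _, s := range newSecs {
		oldSize, ok := oldByName[s.Name]
		switch {
		case !ok:
			changes = append(changes, SectionChange{Kind: "added", Name: s.Name, NewSize: s.Size})
		case oldSize != s.Size:
			changes = append(changes, SectionChange{Kind: "resized", Name: s.Name, OldSize: oldSize, NewSize: s.Size})
		}
	}
	for _, s := range oldSecs {
		if _, ok := newByName[s.Name]; !ok {
			changes = append(changes, SectionChange{Kind: "removed", Name: s.Name, OldSize: s.Size})
		}
	}
	sort.Slice(changes, func(i, j int) bool { return changes[i].Name < changes[j].Name })
	return changes
}

func main() {
	oldSecs := []Section{{".text", 100}, {".data", 50}, {".bss", 10}}
	newSecs := []Section{{".text", 120}, {".data", 50}, {".rodata", 30}}
	for _, c := range diffSections(oldSecs, newSecs) {
		fmt.Printf("%-8s %-8s %d -> %d\n", c.Kind, c.Name, c.OldSize, c.NewSize)
	}
}
```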

Symbolic Diff

  • Function added/removed
  • Function signature changes (inferred)
  • Symbol rename detection
  • Address relocation detection
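
One simple way to think about rename detection is to pair a symbol that disappeared with a new symbol whose body is byte-for-byte identical. The sketch below does exactly that with a content hash. The `Symbol` type and `detectRenames` function are hypothetical, and exact matching is an assumption made for brevity; the project describes similarity scoring, which would also catch near-matches.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Symbol is a hypothetical record holding a symbol name and its raw bytes.
type Symbol struct {
	Name string
	Body []byte
}

// detectRenames returns old name -> new name for symbols whose name changed
// while their contents stayed identical.
func detectRenames(oldSyms, newSyms []Symbol) map[string]string {
	oldNames := make(map[string]bool, len(oldSyms))
	for _, s := range oldSyms {
		oldNames[s.Name] = true
	}
	newNames := make(map[string]bool, len(newSyms))
	for _, s := range newSyms {
		newNames[s.Name] = true
	}

	// Index symbols that exist only in the new binary by content hash.
	newOnly := make(map[[sha256.Size]byte]string)
	for _, s := range newSyms {
		if !oldNames[s.Name] {
			newOnly[sha256.Sum256(s.Body)] = s.Name
		}
	}

	renames := make(map[string]string)
	for _, s := range oldSyms {
		if newNames[s.Name] {
			continue // name still present, so not a rename candidate
		}
		if name, ok := newOnly[sha256.Sum256(s.Body)]; ok {
			renames[s.Name] = name
		}
	}
	return renames
}

func main() {
	oldSyms := []Symbol{
		{"parse_header", []byte{0x55, 0x48, 0x89, 0xe5}},
		{"helper", []byte{0x31, 0xc0, 0xc3}},
	}
	newSyms := []Symbol{
		{"parse_header", []byte{0x55, 0x48, 0x89, 0xe5}},
		{"util_helper", []byte{0x31, 0xc0, 0xc3}},
	}
	for from, to := range detectRenames(oldSyms, newSyms) {
		fmt.Printf("renamed: %s -> %s\n", from, to)
	}
}
```

If several new symbols share the same bytes, this sketch keeps only the last one seen; a fuller implementation would need a tie-breaking rule.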

Data Diff

  • Constants changed
  • String changes
  • Embedded table differences
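
String changes can be approximated by extracting printable ASCII runs from each binary and comparing the two sets, much like the Unix `strings` utility followed by a set difference. The helpers `extractStrings` and `diffStrings` below are hypothetical and ASCII-only; how the project actually locates strings is not documented here.

```go
package main

import (
	"fmt"
	"sort"
)

// extractStrings returns runs of printable ASCII that are at least minLen bytes long.
func extractStrings(data []byte, minLen int) []string {
	var out []string
	start := -1 // index where the current printable run began, or -1
	for i, b := range data {
		printable := b >= 0x20 && b <= 0x7e
		if printable && start < 0 {
			start = i
		}
		if !printable && start >= 0 {
			if i-start >= minLen {
				out = append(out, string(data[start:i]))
			}
			start = -1
		}
	}
	// Flush a run that reaches the end of the data.
	if start >= 0 && len(data)-start >= minLen {
		out = append(out, string(data[start:]))
	}
	return out
}

// diffStrings reports strings present only in newStrs (added) and only in
// oldStrs (removed), each sorted alphabetically.
func diffStrings(oldStrs, newStrs []string) (added, removed []string) {
	oldSet := make(map[string]bool, len(oldStrs))
	for _, s := range oldStrs {
		oldSet[s] = true
	}
	newSet := make(map[string]bool, len(newStrs))
	for _, s := range newStrs {
		newSet[s] = true
	}
	for s := range newSet {
		if !oldSet[s] {
			added = append(added, s)
		}
	}
	for s := range oldSet {
		if !newSet[s] {
			removed = append(removed, s)
		}
	}
	sort.Strings(added)
	sort.Strings(removed)
	return added, removed
}

func main() {
	oldBin := append([]byte{0x00}, []byte("version 1.0")...)
	newBin := append([]byte{0x00}, []byte("version 1.1")...)
	added, removed := diffStrings(extractStrings(oldBin, 4), extractStrings(newBin, 4))
	fmt.Println("added:", added)
	fmt.Println("removed:", removed)
}
```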

Semantic Similarity

  • Function similarity scoring
  • Heuristic rename detection
  • Recompiled-but-equivalent code detection
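
A similarity score between two functions can be sketched as the Jaccard index over their byte n-grams: the number of shared n-grams divided by the number of distinct n-grams overall. This is a stand-in chosen for illustration, not a description of the project's scoring; the function names are hypothetical. The resulting value lies in [0, 1] and could be compared against a threshold such as the 0.7 used in the example configuration.

```go
package main

import "fmt"

// ngrams returns the set of distinct byte n-grams in data.
func ngrams(data []byte, n int) map[string]struct{} {
	set := make(map[string]struct{})
	for i := 0; i+n <= len(data); i++ {
		set[string(data[i:i+n])] = struct{}{}
	}
	return set
}

// similarity returns the Jaccard index of the n-gram sets of a and b.
// Two inputs that both yield no n-grams are treated as identical.
func similarity(a, b []byte, n int) float64 {
	sa, sb := ngrams(a, n), ngrams(b, n)
	if len(sa) == 0 && len(sb) == 0 {
		return 1
	}
	inter := 0
	for g := range sa {
		if _, ok := sb[g]; ok {
			inter++
		}
	}
	union := len(sa) + len(sb) - inter
	return float64(inter) / float64(union)
}

func main() {
	oldFn := []byte{0x55, 0x48, 0x89, 0xe5, 0x31, 0xc0, 0x5d, 0xc3}
	newFn := []byte{0x55, 0x48, 0x89, 0xe5, 0xb8, 0x01, 0x5d, 0xc3}
	score := similarity(oldFn, newFn, 2)
	const threshold = 0.7 // assumed cut-off, mirroring the example config
	fmt.Printf("score=%.2f match=%v\n", score, score >= threshold)
}
```

Raw byte n-grams are sensitive to address and register changes, so a practical scorer would likely normalize instructions first.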

Configuration

Configuration can be provided via:

  • CLI flags
  • YAML/JSON config files
  • Programmatic API

Example config.yaml:

diff:
  ignore_padding: true
  ignore_addresses: false
  ignore_timestamps: true
  verbose: false

output:
  format: text
  color: true
  verbose: false

similarity:
  threshold: 0.7
  detect_renames: true
  fuzzy_matching: true

parser:
  enabled_formats:
    - ELF
    - PE
    - Mach-O
    - Raw
  max_file_size: 1073741824
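
The layering of defaults and file settings can be sketched as follows: start from default values, then let the file override only the keys it mentions. The example uses JSON with the standard library and assumes the JSON form uses the same key names as the YAML above. The `Config` struct, `defaults`, and `load` are hypothetical and cover only some of the keys; they are not the project's config package, and the [0, 1] range check on the threshold is also an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Config is a hypothetical mirror of part of the example configuration.
type Config struct {
	Diff struct {
		IgnorePadding    bool `json:"ignore_padding"`
		IgnoreAddresses  bool `json:"ignore_addresses"`
		IgnoreTimestamps bool `json:"ignore_timestamps"`
	} `json:"diff"`
	Output struct {
		Format string `json:"format"`
		Color  bool   `json:"color"`
	} `json:"output"`
	Similarity struct {
		Threshold     float64 `json:"threshold"`
		DetectRenames bool    `json:"detect_renames"`
	} `json:"similarity"`
}

// defaults returns the values shown in the example configuration.
func defaults() Config {
	var c Config
	c.Diff.IgnorePadding = true
	c.Diff.IgnoreTimestamps = true
	c.Output.Format = "text"
	c.Output.Color = true
	c.Similarity.Threshold = 0.7
	c.Similarity.DetectRenames = true
	return c
}

// load overlays the JSON document on top of the defaults. Keys missing from
// the document keep their default values.
func load(data []byte) (Config, error) {
	c := defaults()
	if err := json.Unmarshal(data, &c); err != nil {
		return Config{}, err
	}
	// Assumed validation: treat the threshold as a fraction between 0 and 1.
	if c.Similarity.Threshold < 0 || c.Similarity.Threshold > 1 {
		return Config{}, fmt.Errorf("similarity.threshold %v out of range [0, 1]", c.Similarity.Threshold)
	}
	return c, nil
}

func main() {
	cfg, err := load([]byte(`{"similarity": {"threshold": 0.8}, "output": {"format": "json"}}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("threshold=%.1f format=%s ignore_padding=%v\n",
		cfg.Similarity.Threshold, cfg.Output.Format, cfg.Diff.IgnorePadding)
}
```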

Use Cases

  • Security Research: Compare malware variants
  • Reverse Engineering: Track binary evolution
  • Firmware Analysis: Detect firmware modifications
  • Build Reproducibility: Verify build consistency
  • Software Updates: Understand what changed between versions
  • Compiler Analysis: Study compiler optimizations

Requirements

  • Go 1.21 or later
  • No external dependencies for core functionality

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for details.

License

MIT License - see LICENSE for details.

Author

Max Base - GitHub

Acknowledgments

This project is intended as a binary diffing and reverse-engineering aid for security research, malware analysis, firmware comparison, and build reproducibility analysis.