Automated architecture documentation generator from source code. Generates diagrams (Mermaid, PlantUML, D2), ER diagrams, API documentation, and dependency graphs.

emilholmegaard/doc-architect

DocArchitect

Automated Architecture Documentation Generator from Source Code

DocArchitect scans your codebase and automatically generates architecture documentation including dependency graphs, API documentation, ER diagrams, message flow diagrams, and C4 models.

Features

  • 🔍 Multi-language support: Java, Kotlin, Python, C#/.NET, Node.js, Go
  • 📊 Multiple diagram formats: Mermaid, PlantUML, D2, Structurizr DSL
  • 🗄️ Database support: PostgreSQL, MSSQL, MongoDB
  • 📡 API detection: REST, GraphQL, gRPC, Avro schemas
  • 📬 Messaging support: Kafka, RabbitMQ, Azure Service Bus
  • 🐳 Docker packaged: Run anywhere without dependencies
  • 🔌 Plugin architecture: Easy to extend with custom scanners

Quick Start

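# Pull the Docker image
docker pull ghcr.io/emilholmegaard/doc-architect:latest

# Tag it locally so the short image name used in the commands below resolves
docker tag ghcr.io/emilholmegaard/doc-architect:latest doc-architect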

# Initialize configuration in your project
docker run -v $(pwd):/workspace doc-architect init

# Generate documentation
docker run -v $(pwd):/workspace -v $(pwd)/docs:/output doc-architect scan

Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                      DocArchitect CLI (Picocli)                     │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  ┌──────────────┐    ┌───────────────────┐    ┌──────────────────┐  │
│  │   Scanners   │───▶│ ArchitectureModel │───▶│    Generators    │  │
│  │  (Scanner)   │    │  (Intermediate)   │    │(DiagramGenerator)│  │
│  └──────────────┘    └───────────────────┘    └──────────────────┘  │
│         │                                              │            │
│         ▼                                              ▼            │
│  ┌──────────────┐                            ┌──────────────────┐   │
│  │ ServiceLoader│                            │  OutputRenderer  │   │
│  │   (SPI)      │                            │                  │   │
│  └──────────────┘                            └──────────────────┘   │
│                                                                     │
├─────────────────────────────────────────────────────────────────────┤
│                          Scanner Categories                         │
├─────────────────────────────────────────────────────────────────────┤
│  Dependencies:        │  APIs:              │  Messaging:           │
│  • Maven (pom.xml)    │  • Spring MVC       │  • Kafka              │
│  • Gradle             │  • JAX-RS           │  • RabbitMQ           │
│  • npm/yarn           │  • FastAPI          │  • Azure Service Bus  │
│  • pip/poetry         │  • Flask            │  • Avro Schemas       │
│  • NuGet (.csproj)    │  • ASP.NET Core     │  • AsyncAPI specs     │
│  • Go modules         │  • Express.js       │                       │
│                       │  • GraphQL          │                       │
│                       │  • gRPC/Protobuf    │                       │
├───────────────────────┼─────────────────────┼───────────────────────┤
│  Database:            │  Structure:         │  Integration:         │
│  • JPA/Hibernate      │  • Module detection │  • Sokrates scope     │
│  • SQLAlchemy         │  • Service bounds   │    file generation    │
│  • Django ORM         │  • Layer analysis   │  • OpenAPI export     │
│  • Entity Framework   │                     │  • AsyncAPI export    │
│  • Mongoose           │                     │                       │
│  • SQL migrations     │                     │                       │
└───────────────────────┴─────────────────────┴───────────────────────┘
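
The flow in the diagram (scanners feed an intermediate model, generators render it) can be sketched as follows. This is only an illustration under stated assumptions: the `Scanner` and `DiagramGenerator` names come from the diagram, but their method signatures, the `Dependency` class, and the `run` helper are hypothetical and not taken from DocArchitect's source.

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the scan -> model -> generate flow; not DocArchitect's real API. */
public class PipelineSketch {

    /** One "from depends on to" edge; a simplified stand-in for the ArchitectureModel. */
    static final class Dependency {
        final String from;
        final String to;

        Dependency(String from, String to) {
            this.from = from;
            this.to = to;
        }
    }

    /** A scanner inspects a project directory and contributes findings (assumed signature). */
    interface Scanner {
        List<Dependency> scan(Path projectRoot);
    }

    /** A generator turns the collected model into diagram source text (assumed signature). */
    interface DiagramGenerator {
        String generate(List<Dependency> model);
    }

    /** Renders the dependencies as a Mermaid flowchart, one edge per line. */
    static final class MermaidGenerator implements DiagramGenerator {
        @Override
        public String generate(List<Dependency> model) {
            StringBuilder out = new StringBuilder("graph LR\n");
            for (Dependency d : model) {
                out.append("    ").append(d.from).append(" --> ").append(d.to).append('\n');
            }
            return out.toString();
        }
    }

    /** Runs every scanner, merges the results, and hands them to the generator. */
    static String run(List<Scanner> scanners, DiagramGenerator generator, Path root) {
        List<Dependency> model = new ArrayList<>();
        for (Scanner scanner : scanners) {
            model.addAll(scanner.scan(root));
        }
        return generator.generate(model);
    }

    public static void main(String[] args) {
        // A fake scanner with fixed output stands in for a real one such as a Maven scanner.
        Scanner fake = root -> List.of(new Dependency("orders", "users"));
        System.out.print(run(List.of(fake), new MermaidGenerator(), Path.of(".")));
    }
}
```

Keeping the model between the two stages means a new output format only needs a new generator, and a new technology only needs a new scanner.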

Scanner Configuration Modes

DocArchitect offers three flexible ways to configure which scanners analyze your codebase:

1. AUTO Mode (Recommended for New Projects)

DocArchitect detects and enables every applicable scanner based on your project's structure and technologies. Each scanner checks for the files it understands and skips itself when none are present.

scanners:
  mode: auto

When to use: Initial documentation generation, exploring a new codebase, or when you want comprehensive coverage without manual configuration.

2. Groups Mode

Enable scanners by technology group for targeted scanning of specific language ecosystems.

scanners:
  groups:
    - java      # Maven, Gradle, Spring, JPA, Kafka Streams
    - python    # pip/poetry, FastAPI, Flask, SQLAlchemy, Django
    - dotnet    # NuGet, ASP.NET Core, Entity Framework
    - javascript # npm, Express.js
    - go        # go.mod dependencies

When to use: Multi-language monorepos where you want to focus on specific technology stacks.

3. Explicit Mode

Specify exact scanner IDs for fine-grained control.

scanners:
  enabled:
    - maven-dependencies
    - spring-rest-api
    - jpa-entities
    - kafka-messaging

When to use: Production CI/CD pipelines, when you need precise control over scanning behavior, or to optimize scan performance.

Available Scanner IDs: See the Scanner Reference for the complete list.
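
To make the three modes concrete, here is a minimal sketch of how a set of scanners could be selected from the configuration. It is an assumption-laden illustration, not DocArchitect's implementation: the `ScannerInfo` class, the `select` method, and the precedence (explicit IDs first, then groups, then auto) are all hypothetical, and one scanner ID is made up for the example.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical sketch of scanner selection; the precedence between modes is an assumption. */
public class ScannerSelectionSketch {

    /** Minimal descriptor: scanner ID, technology group, and whether its files were found. */
    static final class ScannerInfo {
        final String id;
        final String group;
        final boolean applicable;

        ScannerInfo(String id, String group, boolean applicable) {
            this.id = id;
            this.group = group;
            this.applicable = applicable;
        }
    }

    /** Returns the IDs of the scanners that would run for the given configuration. */
    static List<String> select(List<ScannerInfo> all, Set<String> enabledIds, Set<String> groups) {
        return all.stream()
                .filter(s -> {
                    if (!enabledIds.isEmpty()) {
                        return enabledIds.contains(s.id);   // explicit mode
                    }
                    if (!groups.isEmpty()) {
                        return groups.contains(s.group);    // groups mode
                    }
                    return s.applicable;                    // auto mode: scanners self-filter
                })
                .map(s -> s.id)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<ScannerInfo> all = List.of(
                new ScannerInfo("maven-dependencies", "java", true),
                new ScannerInfo("jpa-entities", "java", false),
                new ScannerInfo("example-python-scanner", "python", true)); // made-up ID

        System.out.println("auto:     " + select(all, Set.of(), Set.of()));
        System.out.println("groups:   " + select(all, Set.of(), Set.of("java")));
        System.out.println("explicit: " + select(all, Set.of("jpa-entities"), Set.of()));
    }
}
```

In this sketch only auto mode consults applicability; whether groups mode also self-filters in the real tool is not stated here.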

Configuration

Create a docarchitect.yaml in your project root:

project:
  name: "My Microservices"
  version: "1.0.0"

repositories:
  # Single repo (mono-repo mode)
  - name: "monorepo"
    path: "."
    
  # Or multiple repos
  # - name: "user-service"
  #   path: "./services/user-service"
  # - name: "order-service"
  #   url: "https://github.com/org/order-service"
  #   branch: "main"

scanners:
  # AUTO mode: Let DocArchitect automatically detect and enable scanners
  # based on your project's technologies
  mode: auto

  # OR use explicit mode: Specify which scanners to enable by scanner ID
  # enabled:
  #   - maven-dependencies
  #   - spring-rest-api
  #   - jpa-entities
  #   - kafka-messaging

  # OR use groups mode: Enable scanners by technology group
  # groups:
  #   - java
  #   - python
  #   - dotnet

generators:
  default: mermaid
  enabled:
    - mermaid
    - markdown

output:
  directory: "./docs/architecture"
  generateIndex: true

Output

DocArchitect generates a complete documentation site:

docs/architecture/
├── index.md                    # Main entry point
├── overview/
│   ├── system-context.md       # C4 Level 1
│   └── container-diagram.md    # C4 Level 2
├── components/
│   └── [service-name].md       # Per-service documentation
├── dependencies/
│   ├── dependency-graph.md     # Visual dependency graph
│   └── dependency-matrix.md    # Tabular view
├── api/
│   ├── rest-endpoints.md       # REST API catalog
│   ├── graphql-schema.md       # GraphQL types and queries
│   └── grpc-services.md        # gRPC service definitions
├── data/
│   ├── er-diagram.md           # Entity relationship diagram
│   └── entity-catalog.md       # Entity documentation
├── messaging/
│   ├── kafka-topics.md         # Topic catalog
│   └── event-flows.md          # Message flow diagrams
└── integration/
    └── sokrates-scope.json     # Generated Sokrates config

CI/CD Integration

DocArchitect supports a lightweight CI/CD mode that compares the current scan against a stored baseline and can fail the build on breaking changes:

# GitHub Actions example
- name: Check Architecture Changes
  run: |
    docker run -v "$(pwd)":/workspace ghcr.io/emilholmegaard/doc-architect:latest diff \
      --baseline docs/architecture/.baseline.json \
      --output docs/architecture \
      --fail-on-breaking-changes

For full CI/CD setup with security scanning, see docs/ci-cd-setup.md.
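
The idea behind a baseline diff can be sketched as a set comparison. The sketch below assumes, purely for illustration, that a baseline is a set of REST endpoint signatures and that a removed endpoint counts as breaking while an added one does not. The actual baseline format and the rules behind `--fail-on-breaking-changes` are not described here, so every name in the code is hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of a baseline comparison; not DocArchitect's actual diff logic. */
public class BaselineDiffSketch {

    /** Returns the entries of {@code expected} that do not appear in {@code actual}, sorted. */
    static Set<String> missingFrom(Set<String> expected, Set<String> actual) {
        Set<String> missing = new TreeSet<>(expected);
        missing.removeAll(actual);
        return missing;
    }

    public static void main(String[] args) {
        Set<String> baseline = Set.of("GET /users", "GET /orders");
        Set<String> current = Set.of("GET /users", "POST /orders");

        Set<String> removed = missingFrom(baseline, current); // in the baseline, gone now
        Set<String> added = missingFrom(current, baseline);   // new since the baseline

        System.out.println("added:    " + added);
        System.out.println("removed:  " + removed);
        // Assumption: only removals are breaking. A CI wrapper would exit non-zero here.
        System.out.println("breaking: " + !removed.isEmpty());
    }
}
```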

Code Quality Reports

Sokrates Analysis

Weekly automated code analysis is performed using Sokrates, a polyglot source code examination tool.

Reports include metrics on:

  • Code volume and language breakdown
  • Duplication analysis
  • File/unit size distributions and conditional complexity
  • Component decomposition and dependencies
  • File age, change frequency, and contributor statistics
  • Temporal trends and patterns

The analysis runs automatically every Monday at 2 AM UTC via GitHub Actions and publishes results to GitHub Pages.

Extending DocArchitect

Adding a Custom Scanner

  1. Implement the Scanner interface
  2. Register the implementation in META-INF/services/com.docarchitect.core.scanner.Scanner
  3. Package it as a JAR and mount the JAR into the Docker container

See docs/extending.md for details.
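
The registration step relies on Java's standard `java.util.ServiceLoader` mechanism. The sketch below shows that mechanism in isolation. The nested `Scanner` interface is only a stand-in for `com.docarchitect.core.scanner.Scanner`, whose real methods are not shown in this README, and `ExampleScanner` with its ID is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

/** Hypothetical sketch of SPI-based discovery; the real Scanner interface may differ. */
public class SpiSketch {

    /** Stand-in for com.docarchitect.core.scanner.Scanner (assumed shape). */
    public interface Scanner {
        String id();
    }

    /** A custom scanner; ServiceLoader providers need a public no-argument constructor. */
    public static final class ExampleScanner implements Scanner {
        @Override
        public String id() {
            return "example-scanner"; // made-up ID
        }
    }

    /** Collects the IDs of every provider that ServiceLoader finds on the classpath. */
    static List<String> discoverIds() {
        List<String> ids = new ArrayList<>();
        for (Scanner scanner : ServiceLoader.load(Scanner.class)) {
            ids.add(scanner.id());
        }
        return ids;
    }

    public static void main(String[] args) {
        // Prints an empty list unless a provider-configuration file is on the classpath.
        System.out.println(discoverIds());
    }
}
```

For this sketch to discover `ExampleScanner`, the classpath would need a file named `META-INF/services/SpiSketch$Scanner` containing the line `SpiSketch$ExampleScanner`. With DocArchitect, the file name is the one given in step 2 and its content is the fully qualified name of your implementation class.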

Development

# Build
./mvnw clean package

# Run tests
./mvnw test

# Build Docker image
docker build -t doc-architect .

See docs/testing.md for a comprehensive testing guide.

Logging Configuration

DocArchitect uses Logback for logging with the following defaults:

  • Log Level: INFO (configurable via the LOGBACK_LEVEL environment variable)
  • Output: Console only
  • Package-specific levels:
    • Scanners: SCANNER_LOG_LEVEL (default: INFO)
    • Generators: GENERATOR_LOG_LEVEL (default: INFO)
    • Renderers: RENDERER_LOG_LEVEL (default: INFO)
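
The "environment variable with an INFO default" behaviour listed above amounts to a lookup with a fallback. The sketch below mimics that lookup in plain Java. It is not how Logback itself is wired up, and the `levelFor` helper is hypothetical.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of resolving a log level from the environment with an INFO default. */
public class LogLevelSketch {

    /** Returns the upper-cased value of {@code variable}, or INFO when it is unset or blank. */
    static String levelFor(Map<String, String> env, String variable) {
        String value = env.get(variable);
        if (value == null || value.isBlank()) {
            return "INFO";
        }
        return value.trim().toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        System.out.println("global:     " + levelFor(env, "LOGBACK_LEVEL"));
        System.out.println("scanners:   " + levelFor(env, "SCANNER_LOG_LEVEL"));
        System.out.println("generators: " + levelFor(env, "GENERATOR_LOG_LEVEL"));
        System.out.println("renderers:  " + levelFor(env, "RENDERER_LOG_LEVEL"));
    }
}
```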

Adjusting Log Levels

# Set global log level to DEBUG
docker run -e LOGBACK_LEVEL=DEBUG -v $(pwd):/workspace doc-architect scan

# Enable DEBUG logging for scanners only
docker run -e SCANNER_LOG_LEVEL=DEBUG -v $(pwd):/workspace doc-architect scan

# Maven: Set log level for tests
./mvnw test -Dlogback.level=DEBUG

Custom Logback Configuration

For advanced logging needs, mount a custom logback.xml:

docker run -v $(pwd)/logback.xml:/app/logback.xml \
  -v $(pwd):/workspace \
  doc-architect scan

License

MIT License - see LICENSE
