Releases: StevenBtw/deriva
Release 0.6.9
v0.6.9 - Benchmarks, Relationships & Analysis (March 1, 2026)
I have been running a lot of benchmarks on flask_invoice_generator, full-stack-fastapi-template and taiga-back/taiga-front. Besides a lot of new config versions, I also added a few improvements to further reduce tokens per run, plus some things to make my life easier while benchmarking and trying to get the % up without any overly canonical or repository-specific prompts. Relationships (the hardest category) are currently still not breaking the 60% barrier.
CLI
- Single Step Execution: New --only-step option for run command to run a single extraction/derivation step (disables all others)
- Benchmark Step Isolation: New --only-extraction-step and --only-derivation-step options for benchmark runs
- Enrichment Cache Control: New --nocache-enrichment-configs option for selective cache bypass during benchmarks
- Sequence Reordering: New config sequence command to reorder derivation step execution (e.g., bottom-up: Technology → Application → Business)
- Read-Only Config Access: New config query command for safe config access during benchmark runs (non-blocking)
- Batch Size in CLI: Added --batch-size option to config update for extraction batching
Extraction
- Edge Extraction Module: New edges.py for Tree-sitter based relationship extraction. Extracts IMPORTS, USES, CALLS, DECORATED_BY, and REFERENCES edges in a single efficient parse per file. Language-specific filter constants for Python, JavaScript, Java, and C#. Fixed type node ID format mismatch that caused REFERENCES edge creation failures
- Directory Classification Step: New extraction step after directories to create technology and business concept nodes (with batched LLM calls), guiding subsequent LLM extraction
- Structural Technology Extraction: New extraction method for Technology nodes from infrastructure files (docker-compose.yml, Dockerfile, .env) without LLM
- Token Efficiency: Compact JSON serialization (~15% savings), system/user prompt separation, and multi-file batching (--batch-size N). Estimated 40-60% total reduction
- Error Context: Error messages now include step context (e.g., [Extraction - TypeDefinition] error...) for easier debugging
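The compact JSON serialization mentioned under Token Efficiency can be illustrated with the standard library alone. This is a minimal sketch, not Deriva's actual code: the payload and the helper name to_compact_json are made up for the example, and the real savings depend on the data being serialized.

```python
import json

# Hypothetical extraction payload; the field names are illustrative only.
payload = {
    "files": [
        {"path": "app/models.py", "type": "source", "subtype": "python"},
        {"path": "Dockerfile", "type": "config", "subtype": "docker"},
    ]
}


def to_compact_json(data):
    """Serialize without the whitespace json.dumps adds after ',' and ':'."""
    return json.dumps(data, separators=(",", ":"), ensure_ascii=False)


default_text = json.dumps(payload)
compact_text = to_compact_json(payload)

# Both strings decode to the same object; only the whitespace differs.
assert json.loads(compact_text) == payload
print(len(default_text), len(compact_text))
```

Every separator in the default output carries one extra space, so the character count (and with it, roughly, the token count) drops most for payloads with many short keys and values.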
Derivation
- HybridDerivation Base: All 13 modules use hybrid filtering combining pattern-based AND graph-based candidate selection
- Edge-Aware Relationships: New Tier 1.5 derivation using CALLS, IMPORTS, USES edges with high confidence (0.90-0.95)
- Pre-Generation Dedup: Fuzzy matching against existing elements before LLM calls
- Business Layer: BusinessProcess detects orchestrator methods (3+ CALLS), BusinessEvent detects webhooks/signals, BusinessActor detects auth decorators
- Relationship Consolidation: Refine step boosts confidence on multi-signal agreement, prunes low-confidence without corroboration
- Self-Loop Prevention: Fixed self-referential relationships in graph_relationships with query filters and cleanup
- Enrichment Cache: Aligned with LLM cache patterns, CLI control via --no-enrichment-cache
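As an illustration of the Pre-Generation Dedup item above, here is a minimal sketch of fuzzy name matching before an LLM call. It assumes name similarity via the standard library's difflib; the function names, the normalization and the 0.85 threshold are hypothetical and may differ from what Deriva actually does.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase and drop non-alphanumerics so 'Invoice_Service' equals 'invoice service'."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def find_existing_match(candidate, existing_names, threshold=0.85):
    """Return the most similar existing name at or above threshold, else None."""
    best_name, best_score = None, 0.0
    for name in existing_names:
        score = SequenceMatcher(None, normalize(candidate), normalize(name)).ratio()
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


existing = ["InvoiceService", "PaymentGateway"]
candidates = ["Invoice Service", "ReportGenerator"]

# Only candidates without a close match would be sent on to the LLM.
to_generate = [c for c in candidates if find_existing_match(c, existing) is None]
print(to_generate)
```

Filtering before generation saves the tokens that would otherwise be spent producing an element that a later refine step would merge away again.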
Adapters
- Pydantic Structured Output: New schemas.py with Pydantic models for all extraction types. LLM manager auto-resolves JSON schemas to models, enforcing structure via PydanticAI
- Rate Limiting: Adaptive throttling (auto-reduces RPM on 429s), circuit breaker pattern, Retry-After header respect, error classification, and model-specific rate limits via env vars
- Graph Metadata: Element properties now include all graph metrics (kcore, articulation points, degree); propagated to relationships
- Graph Labels: Split Neo4j labels into namespace (Graph/Model) and node type, enabling cleaner queries and consistent naming across extraction and derivation
- Database Locking: Non-blocking during benchmarks/pipeline runs, versions used for isolation
Full Changelog: v0.6.8...v0.6.9
Release 0.6.8
v0.6.8 - Library Migrations & Overall Cleanup
Big migration replacing 6 custom implementations with off-the-shelf libraries, reducing the amount of code and improving maintainability.
LLM Adapter Rewrite
- PydanticAI Integration: Replaced custom REST provider implementations with pydantic-ai library
- Model Registry: New model_registry.py maps Deriva config to PydanticAI model identifiers with URL normalization for Azure/LM Studio
- Code Reduction: Same (or better) LLM adapter with way less code, deleted entire providers.py
- Native Structured Output: PydanticAI handles validation and retry automatically
Configuration
- Pydantic Settings: New config_models.py with type-safe environment validation using pydantic-settings
- Standard API Keys: Added PydanticAI standard env vars (OPENAI_API_KEY, ANTHROPIC_API_KEY, MISTRAL_API_KEY, AZURE_OPENAI_*)
Caching
- diskcache Integration: Replaced custom SQLite-based caching with diskcache library
- Simplified Cache Utils: Rewrote cache_utils.py to wrap diskcache with BaseDiskCache class
- Preserved Features: Kept hash_inputs(), bench_hash isolation, and export_to_json() functionality
- LLM & Graph Caches: Updated both adapters to use new base cache class
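To make the idea behind hash_inputs() and bench_hash isolation concrete, here is a minimal sketch of a deterministic cache key. Only the names hash_inputs and bench_hash come from the notes above; the signature, the SHA-256 choice and the JSON canonicalization are assumptions for illustration, not the actual implementation in cache_utils.py.

```python
import hashlib
import json


def hash_inputs(*parts, bench_hash=None):
    """Build a deterministic cache key from JSON-serializable inputs.

    Hypothetical signature: bench_hash, when given, is mixed into the key so
    that each benchmark run reads and writes its own cache entries.
    """
    canonical = json.dumps(
        {"parts": parts, "bench_hash": bench_hash},
        sort_keys=True,         # dict ordering must not change the key
        separators=(",", ":"),
        default=str,            # fall back to str() for non-JSON types
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


key_a = hash_inputs("prompt text", {"model": "m1", "temperature": 0})
key_b = hash_inputs("prompt text", {"temperature": 0, "model": "m1"})
key_c = hash_inputs("prompt text", {"model": "m1", "temperature": 0}, bench_hash="run-1")

print(key_a == key_b, key_a == key_c)
```

The same inputs map to the same key regardless of dict ordering, while a different bench_hash yields a different key, so a benchmark run never hits entries cached by another run.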
Retry Logic
- backoff Library: Replaced custom retry implementation with backoff library
- New retry.py: Centralized retry decorator with exponential backoff and jitter
- Simplified Rate Limiter: Token bucket rate limiting now separate from retry logic
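For readers unfamiliar with the technique, exponential backoff with jitter can be sketched with the standard library alone. This is not the backoff library's API and not the contents of retry.py; the function name, parameters and defaults below are assumptions chosen for the example.

```python
import random
import time


def retry_with_backoff(func, max_tries=5, base=0.5, cap=30.0,
                       sleep=time.sleep, rng=random.random):
    """Call func until it succeeds or max_tries attempts have failed.

    Waits rng() * min(cap, base * 2**attempt) seconds between attempts
    ("full jitter"), so concurrent clients do not retry in lockstep.
    """
    for attempt in range(max_tries):
        try:
            return func()
        except Exception:
            if attempt == max_tries - 1:
                raise  # out of attempts: surface the last error
            sleep(rng() * min(cap, base * 2 ** attempt))


calls = {"n": 0}


def flaky():
    """Fail twice, then succeed (stands in for a rate-limited LLM call)."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


# Inject a fake sleep and a fixed rng so the example runs instantly
# and the wait schedule is visible.
waits = []
result = retry_with_backoff(flaky, sleep=waits.append, rng=lambda: 1.0)
print(result, waits)
```

Keeping the retry loop free of any rate-limit bookkeeping mirrors the separation described above: the token bucket decides when a call may start, and the retry logic only decides how long to wait after a failure.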
Small CLI refactor
- Typer Framework: Replaced argparse-based CLI with typer
- Command Modules: Split CLI into deriva/cli/commands/ with separate files for benchmark.py, config.py, repo.py, run.py
- Modern CLI Features: Auto-completion, better help generation, type hints via Annotated
- Subcommand Groups: config, repo, benchmark as typer subapps
Logging
- structlog Integration: Rewrote logging.py with structlog for structured logging
- Preserved API: Same RunLogger, StepContext, and JSONL output format
- OCEL Unchanged: OCEL module kept intact for benchmark process mining
Tests & Quality
- CLI Tests Rewritten: Updated all 51 CLI tests to use typer's CliRunner
- Tree-sitter Test Consolidation: Merged per-language test files into single test_languages.py
- Coverage Threshold: Updated CI coverage threshold to 80%
Full Changelog: v0.6.7...v0.6.8
Release 0.6.7
v0.6.7 - (January 15 2026)
Caching & Performance
- Graph Cache: New cache.py in graph adapter with hash-based cache for expensive graph queries
- Common Cache Utils: Shared cache_utils.py module unifying cache patterns across graph and LLM adapters
Pipeline Phases
- Derivation Prep Phase: Renamed enrich phase to prep throughout codebase (modules, services, configs, CLI, tests)
- Extraction Phases: Added --phase classify and --phase parse options to extraction CLI for granular control
Configuration Rationalization
- Settings Principle: New "Who Changes It" architecture - .env for ops/deployment (secrets, connections, provider settings), database for user tuning (algorithms, thresholds)
- Algorithm Settings in DB: PageRank damping/iterations/tolerance, Louvain resolution, confidence thresholds, batch sizes now in system_settings table
- LLM Settings in .env: Rate limits, timeouts, backoff config remain in environment (provider-specific operational settings)
Benchmarking
- Rich Progress Bars: Fixed phase tracking in CLI benchmark runs with proper Rich progress display
- Per-Repo Flag: New --per-repo flag for running multiple repositories without combining results
- XML Export: Changed default export format from .archimate to .xml for broader compatibility
Documentation
- MD Files Review: Comprehensive pass on all markdown files for accuracy and consistent style
- Config Pattern Docs: Updated CONTRIBUTING.md with configuration ownership table and rationale
Fixed
- Graph bugs: Fixed Neo4j relationship syntax in structural_consistency.py and fixed bug in duplicate_elements.py
- bench-hash Cache Fix: Fixed cache hit detection in manager.py
Updated
- Smarter retries: Added retry-after header parsing to rate_limiter.py and updated providers.py to pass headers to rate limiter
- Muted Neo4j: Suppressed Neo4j notifications during benchmark runs, with toggle in .env
Full Changelog: v0.6.6...v0.6.7
Release 0.6.6
v0.6.6 - ElementDerivationBase & Document Parsing (January 13 2026)
Derivation Module Refactoring (MAJOR)
Major code reduction through new inheritance hierarchy eliminating ~80% duplication across 13 element modules:
- ElementDerivationBase: New abstract base class providing common generate() flow shared by all element types
- PatternBasedDerivation: Mixin for modules using include/exclude pattern filtering (10 of 13 modules)
- Unified Batch Processing: _process_batch() and _derive_relationships() methods handle common operations
- Singleton Pattern: Each module exports ELEMENT_TYPE, OUTBOUND_RULES, INBOUND_RULES for backward compatibility
All 13 derivation modules (ApplicationComponent, ApplicationInterface, ApplicationService, BusinessActor, BusinessEvent, BusinessFunction, BusinessObject, BusinessProcess, DataObject, Device, Node, SystemSoftware, TechnologyService) now inherit from the base classes.
Document Parsing Support (NEW)
Added core support for extracting text from office documents to limit token usage:
- PDF Parsing: pypdf>=6.6.0 as core dependency
- DOCX Parsing: python-docx>=1.2.0 as core dependency
- Moved from optional [documents] extra to required dependencies
Structured Error Handling
New structured error context system in common/types.py:
- ErrorContext: Dataclass with repo_name, step_name, phase_name, file_path, batch_number, exception_type
- create_error(): Convenience function for formatted error strings with context
- PipelineResult Extensions: Added error_details, partial_success, affected_items fields
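The error context items above can be sketched as follows. The field names come from the list; the types, defaults and the exact output format of create_error() are assumptions, modelled loosely on the [Extraction - TypeDefinition] example from the v0.6.9 notes rather than taken from common/types.py.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ErrorContext:
    """Fields taken from the release note; types and defaults are assumed."""
    repo_name: Optional[str] = None
    step_name: Optional[str] = None
    phase_name: Optional[str] = None
    file_path: Optional[str] = None
    batch_number: Optional[int] = None
    exception_type: Optional[str] = None


def create_error(message, context):
    """Prefix message with '[Phase - Step]' and append any remaining context."""
    prefix = " - ".join(p for p in (context.phase_name, context.step_name) if p)
    details = [
        f"{label}={value}"
        for label, value in (
            ("repo", context.repo_name),
            ("file", context.file_path),
            ("batch", context.batch_number),
            ("exception", context.exception_type),
        )
        if value is not None
    ]
    text = f"[{prefix}] {message}" if prefix else message
    return f"{text} ({', '.join(details)})" if details else text


ctx = ErrorContext(phase_name="Extraction", step_name="TypeDefinition",
                   file_path="app/models.py", batch_number=2)
error_text = create_error("parse failed", ctx)
print(error_text)
```

Carrying the context as a dataclass instead of a preformatted string keeps the individual fields available for structured results such as error_details.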
Config Service Enhancements
New threshold and limit helpers in services/config.py:
- Confidence Thresholds: get_confidence_threshold(), get_min_relationship_confidence(), get_community_rel_confidence(), etc.
- Derivation Limits: get_derivation_limit(), get_max_batch_size(), get_max_candidates(), etc.
- Centralized Defaults: _DEFAULT_THRESHOLDS and _DEFAULT_LIMITS dictionaries with configurable overrides via system_settings
Test Coverage Improvements
- Coverage Target: Increased minimum coverage from 75% to 80%
- New Test Modules: test_adapters/treesitter/, test_common/test_document_reader.py, test_common/test_types.py
- Coverage Exclusions: Added exclusions for graph_relationships.py and benchmarking.py (require infrastructure)
Documentation Updates
- New ARCHITECTURE.MD documenting system design
- New OPTIMIZATION.md with performance tuning guidance
- Updated adapter and CLI READMEs with usage examples
- Enhanced docstrings across LLM manager and config service
Minor Fixes
- Fixed type checker issue in get_model_token_limit() with explicit None check
v0.6.5 - Tree-sitter Multi-Language Support & Relationship Consistency (Unreleased)
Tree-sitter Adapter (NEW)
Complete replacement of Python's ast module with tree-sitter for multi-language code analysis:
- Multi-Language Support: Python, JavaScript/TypeScript, Java, and C# extraction via unified TreeSitterManager
- Language-Specific Extractors: Per-language modules in adapters/treesitter/languages/ with proper grammar loading
- Deterministic Extraction: extract_types(), extract_methods(), extract_imports() methods for precise structural analysis
- Backwards Compatibility: Maintained extract_types_from_source and extract_methods_from_source aliases in extraction module
The tree-sitter approach enables future expansion to additional languages (Go, Rust, Ruby) with minimal effort.
Graph-First Relationship Derivation
Major improvements to relationship consistency due to deterministic graph techniques.
- Community-Based Derivation: New derive_community_relationships() creates relationships between elements sharing the same Louvain community (0.95 confidence)
- Neighbor-Based Derivation: New derive_neighbor_relationships() creates relationships between elements with direct graph connections (0.90 confidence)
- Name/File Matching: Enhanced derive_deterministic_relationships() with semantic word overlap and file proximity matching
- Hybrid Approach: Run deterministic methods first, then LLM for all elements with deduplication against deterministic results
- Element Enrichment: All 13 derivation modules now store source_community and source_pagerank properties on elements
- Edge-to-Relationship Mapping: CONTAINS→Composition, IMPLEMENTS→Realization, USES→Serving, CALLS→Flow, IMPORTS→Serving, INHERITS→Realization
- ArchiMate Constraints: Validates source/target element type combinations per ArchiMate 3.2 metamodel
- Two Query Strategies: Primary query with source_identifier, fallback to properties_json search
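The Edge-to-Relationship Mapping above is essentially a lookup table. Here is a minimal sketch of applying it: the dictionary is copied from the list, while the function name, the tuple shapes, the node and element IDs, and the skipping of self-loops and duplicates are assumptions for the example.

```python
# Mapping copied from the release note above.
EDGE_TO_RELATIONSHIP = {
    "CONTAINS": "Composition",
    "IMPLEMENTS": "Realization",
    "USES": "Serving",
    "CALLS": "Flow",
    "IMPORTS": "Serving",
    "INHERITS": "Realization",
}


def map_edges(edges, element_for_node):
    """Translate graph edges into (source, relationship, target) triples.

    edges: iterable of (source_node, edge_type, target_node)
    element_for_node: dict from graph node ID to derived element ID
    An edge is skipped when its type is unmapped, when either endpoint has
    no derived element, or when both endpoints map to the same element.
    """
    seen, result = set(), []
    for src, edge_type, dst in edges:
        rel = EDGE_TO_RELATIONSHIP.get(edge_type)
        a, b = element_for_node.get(src), element_for_node.get(dst)
        if rel is None or a is None or b is None or a == b:
            continue
        triple = (a, rel, b)
        if triple not in seen:  # keep the first occurrence only
            seen.add(triple)
            result.append(triple)
    return result


edges = [
    ("file:a.py", "IMPORTS", "file:b.py"),
    ("file:a.py", "CALLS", "file:b.py"),
    ("file:a.py", "DECORATED_BY", "file:c.py"),  # no mapping in the table
    ("file:a.py", "USES", "file:a.py"),          # self-loop
]
elements = {"file:a.py": "app-comp-a", "file:b.py": "app-comp-b", "file:c.py": "app-comp-c"}
relationships = map_edges(edges, elements)
print(relationships)
```

Because the mapping is a pure lookup, the same graph always produces the same relationships, which is what makes this part of the derivation deterministic.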
Benchmark Improvements
- Phase Tracking: Added OCEL phase events for better pipeline observability
- Structured Outputs: LLM structured output tracking in benchmark events
- Token Optimization: Reduced context size through graph-aware filtering
Bug Fixes
- Fixed chunking logic in external dependency extractor
- Fixed TypeDefinitionNode constructor for placeholder nodes (correct parameter names)
- Various test fixes for API changes
Full Changelog: v0.6.4...v0.6.6
Release 0.6.4
v0.6.4 - Benchmark with Deriva (this repo) runs stable and successful! (January 10 2026)
Refine Module (NEW)
New post-derivation refinement phase with 5 quality assurance steps in modules/derivation/refine/:
- Duplicate Elements: Multi-tier detection (exact match → fuzzy match → LLM semantic check) with configurable auto-merge and survivor selection based on PageRank
- Duplicate Relationships: Exact duplicate removal and redundant relationship pair detection
- Orphan Elements: Identifies unconnected elements, proposes relationships from source graph patterns, optionally disables low-importance orphans
- Structural Consistency: Validates graph-to-model containment preservation (files in directories → components in systems)
- Cross-Layer Coherence: Checks ArchiMate layer connections (Business↔Application↔Technology) and flags floating elements
The refine phase runs after generation with config-driven step enablement. Each step returns detailed RefineResult with issues found/fixed counts.
Graph Enrichment Stability Improvements
Major improvements for consistent results across different graph sizes and multi-repo setups:
- Percentile Normalization: New normalize_to_percentiles() functions convert absolute metrics (PageRank, k-core, degree) to 0-100 percentile ranks. A node at 90th percentile means "more important than 90% of nodes" regardless of graph size (50 or 5000 nodes)
- Deterministic Louvain: Fixed non-deterministic community detection by sorting nodes before algorithm execution. Same graph now produces identical community assignments every run
- Graph Metadata: New GraphMetadata dataclass captures graph statistics (total_nodes, density, max_kcore, num_communities). Returned with EnrichmentResult and propagated to refine steps via params
- Per-Repository Isolation: Added repository_name property extraction from node IDs and repo-aware edge filtering in _get_graph_edges(). Enables isolated enrichment per repo in multi-repo setups
New enrichment properties per node: pagerank_percentile, kcore_percentile, in_degree_percentile, out_degree_percentile
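The percentile normalization described above can be sketched in a few lines. The name normalize_to_percentiles is taken from the notes; the dict-in, dict-out signature and the tie handling (share of nodes with a strictly lower value) are assumptions chosen to match the "more important than 90% of nodes" reading.

```python
from bisect import bisect_left


def normalize_to_percentiles(values):
    """Map {node_id: metric} to {node_id: percentile rank in 0-100}.

    A node's percentile is the share of nodes with a strictly lower value,
    so nodes with equal metrics share the same rank.
    """
    if not values:
        return {}
    ordered = sorted(values.values())
    total = len(ordered)
    return {
        node: 100.0 * bisect_left(ordered, value) / total
        for node, value in values.items()
    }


pagerank = {"a": 0.40, "b": 0.10, "c": 0.10, "d": 0.25}
percentiles = normalize_to_percentiles(pagerank)
print(percentiles)  # {'a': 75.0, 'b': 0.0, 'c': 0.0, 'd': 50.0}

# Rescaling every score leaves the ranks untouched, which is why thresholds
# on percentiles carry over between small and large graphs.
scaled = normalize_to_percentiles({k: v * 10 for k, v in pagerank.items()})
assert scaled == percentiles
```

With this definition the top node of an n-node graph lands at 100 * (n - 1) / n rather than exactly 100, so the implementation in enrich.py may well use a slightly different convention.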
General Improvements
Multiple minor improvements for different parts of the process:
- Extraction Method Property: Added extraction method (structural/ast/llm) property to the graph nodes
- LLM Rate Limiting: Extended the LLM manager (adapter) with rate limiting capabilities to gracefully deal with rate limits imposed by LLM providers
- Status/Progress Bars: Both the Marimo app and the CLI now have visual indicators of progress during pipeline runs and benchmark runs (CLI only)
- Benchmark Output: Added model output to the benchmark runs, with unique names ({repo}{model}{run#}.archimate)
Full Test Pass
Removed and added a lot of tests, now fully caught up with all the changes. New test classes:
- TestPercentileNormalization, TestGraphMetadata, TestPercentileEnrichments for enrich module
- Comprehensive refine step tests for all 5 steps
Test coverage didn't jump because I deleted a lot of weak tests. Marimo (app) tests are still excluded.
Graph Enrichment Module
New modules/derivation/enrich.py with graph algorithm pre-processing:
- PageRank: Node importance/centrality scoring
- Louvain: Community detection for natural component boundaries
- K-core: Core vs peripheral node classification
- Articulation points: Bridge node identification
- Degree centrality: In/out connectivity metrics
The enrichment runs before derivation, similar to how classification enriches files before extraction.
Unified Element + Relationship Derivation
Major refactoring to generate elements and relationships in a single step:
- New RelationshipRule dataclass for valid relationships per element type
- LLM-based relationship derivation with derive_batch_relationships()
- All 13 element modules updated with OUTBOUND_RULES and INBOUND_RULES
- Removed obsolete relationship config infrastructure (database table, config class, JSON file)
Benchmark results on flask_invoice_generator: 15 elements, 15 relationships (Access, Serving, Composition, Flow, Realization, Aggregation, Assignment)
Derivation Improvements
- Renamed DerivationResult to GenerationResult
- New Candidate dataclass with graph enrichment data
- Helper functions: query_candidates(), get_enrichments(), batch_candidates()
- Improved element building with build_element() and parse_derivation_response()
Extraction Base Consolidation
Refactor of modules/extraction/base.py:
- Merged input_sources.py functionality into base.py
- Name normalization functions for packages, concepts, and technologies
- Canonical package names dictionary for consistent naming
- Singularization helper with irregular plurals support
- Removed ast_extraction.py and input_sources.py
Full Changelog: v0.6.3...v0.6.4
Release 0.6.3
v0.6.3 - Database Adapter and Benchmark Improvements
Database Adapter Refactor
- Replaced SQL seed files with JSON data files for better portability
- New db_tool.py CLI for database export/import operations
- Added data/ folder with per-table JSON files
- New exports: export_database(), import_database() in package API
Benchmarking & Other Changes
- New benchmarks.md documentation
- Extended benchmarking service with additional metrics
- Graph manager: Added new query methods
- External dependency extractor: Major improvements
- Config service: New configuration functions
Full Changelog: v0.6.2...v0.6.3
Version 0.6.2
v0.6.2 - New Derivation Modules & LLM Provider Expansion (January 7, 2026)
New Derivation Modules
Major expansion of derivation capabilities with 6 new ArchiMate element modules:
- Added ApplicationInterface, BusinessEvent, BusinessFunction modules
- Added Device, Node, SystemSoftware technology layer modules
- Refactored existing derivation modules to new consistent style with improved prompts and schemas
- New database scripts: 8_derivation_config_extension.sql, 9_new_derivation_modules.sql
LLM Provider Expansion
- Added Mistral AI provider in adapters/llm/providers.py
- Added LM Studio provider for local LLM models like Nemotron A3B
- Fixed Claude response truncation bug in Anthropic provider
- Fixed LLM null/None response handling
- Fixed non-dict responses in external dependency extractor (strings, numbers)
Extraction Improvements
- Added type and subtype properties to File nodes for richer classification
- Improved file classification logic with better pattern matching
- Added repository sync method, removed redundant code
Relationship Derivation
- Significant improvements to relationship derivation logic
Benchmarking
- Updated BENCHMARKING.md documentation
- Improved benchmark workflow and usability
Bug Fixes
- Fixed failing tests, type errors, and linting issues
- Fixed derivation config issues
- Fixed extraction pipeline bugs
CI & Code Quality
- Aligned CI test coverage with pyproject.toml at 70%
- Ruff formatting and style fixes
- Improved README badges
- Updated CONTRIBUTING.md documentation
Full Changelog: v0.6.1...v0.6.2
v0.6.1
What's Changed
- Updated LICENSE by @StevenBtw in #1
- fix ci issues by @StevenBtw in #2
- Added config for codecov by @StevenBtw in #3
- Release 0.6.1 by @StevenBtw in #4
New Contributors
- @StevenBtw made their first contribution in #1
Full Changelog: https://github.com/StevenBtw/deriva/commits/v0.6.1