feat: Add multi-model support, ONNX backend, and performance optimizations #7
Open
uriafranko wants to merge 3 commits into
…tions

Major improvements to the Rust embedding engine:

## Model Support
- Add 6 embedding models: MiniLM-L6, MiniLM-L12, BGE-Small, BGE-Base, E5-Small, GTE-Small
- Default changed from MiniLM-L6 (MTEB 56.3) to BGE-Small (MTEB 62.2), a ~10% relative improvement in MTEB score
- BGE-Base option for maximum accuracy (MTEB 64.2, 768 dims)
- All models work offline with local files; no HuggingFace API key needed

## Backend Support
- Candle backend (default): Pure Rust, no external dependencies
- ONNX Runtime backend (optional): Faster CPU inference with `--features onnx-backend`
- Feature flags allow choosing backends at compile time

## Performance Optimizations
- CPU optimization flags via `.cargo/config.toml` (`target-cpu=native` for AVX/SIMD)
- Embedding cache (LRU, 1000 entries)
- Support for dynamic embedding dimensions (384 or 768)

## API Improvements
- `AgentEngine::new_fast()` - MiniLM-L6 for speed
- `AgentEngine::new_accurate()` - BGE-Base for accuracy
- `AgentEngine::with_config()` - Full customization
- Export `EmbeddingModel`, `Backend`, `BrainConfig` for configuration

## New Examples
- `benchmark.rs` - Compare models and backends

All tests pass (43 unit + integration tests).
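The embedding cache listed under Performance Optimizations can be pictured with a small LRU sketch. This is not the PR's implementation: the type `EmbeddingCache` and its methods are hypothetical names chosen for illustration, and the eviction scan is O(n) for brevity. A production cache would more likely use a dedicated LRU structure.

```rust
use std::collections::HashMap;

/// Hypothetical LRU cache for embeddings, keyed by the input text.
/// Each entry stores the tick of its last use alongside the vector.
pub struct EmbeddingCache {
    capacity: usize,
    tick: u64,
    entries: HashMap<String, (u64, Vec<f32>)>,
}

impl EmbeddingCache {
    /// Create a cache holding at most `capacity` embeddings.
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self {
            capacity,
            tick: 0,
            entries: HashMap::with_capacity(capacity),
        }
    }

    /// Look up an embedding and mark it as most recently used.
    pub fn get(&mut self, text: &str) -> Option<&[f32]> {
        self.tick += 1;
        let entry = self.entries.get_mut(text)?;
        entry.0 = self.tick;
        Some(entry.1.as_slice())
    }

    /// Insert an embedding, evicting the least recently used entry
    /// when the cache is full and the key is new.
    pub fn put(&mut self, text: &str, embedding: Vec<f32>) {
        self.tick += 1;
        if !self.entries.contains_key(text) && self.entries.len() >= self.capacity {
            // Linear scan for the smallest tick (the oldest use).
            let oldest = self
                .entries
                .iter()
                .min_by_key(|(_, entry)| entry.0)
                .map(|(key, _)| key.clone());
            if let Some(key) = oldest {
                self.entries.remove(&key);
            }
        }
        self.entries.insert(text.to_owned(), (self.tick, embedding));
    }

    /// Number of cached embeddings.
    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

With `EmbeddingCache::new(1000)`, repeated queries for the same text skip model inference, and the entry that has gone longest without a `get` or `put` is dropped first once the limit is reached.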
Replaces multiple constructor methods with a clean builder pattern:
## Before
```rust
AgentEngine::new_fast("db")
AgentEngine::new_accurate("db")
AgentEngine::with_config("db", config)
```
## After
```rust
AgentEngine::builder()
    .db_path("db")
    .model(EmbeddingModel::BgeBase)
    .backend(Backend::Candle)
    .build()?
```
## Changes
### Brain Module
- BrainConfig::builder() -> BrainConfigBuilder
- Fluent API: .model(), .backend(), .local_model_dir(), .mock()
### AgentEngine
- AgentEngine::builder() -> AgentEngineBuilder
- Fluent API: .db_path(), .in_memory(), .model(), .backend(), .mock(), .without_metrics()
- Convenience methods retained: new(), new_in_memory(), new_mock_in_memory(), new_mock()
### Benefits
- Single entry point for configuration
- Discoverable API via IDE autocomplete
- Follows Rust builder pattern best practices
- Easy to extend with new options
- Better maintainability
All tests pass (43 unit + integration).
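As a rough illustration of the builder described in this commit, here is a minimal, self-contained sketch. It is not the PR's code. The builder method names and the variants `EmbeddingModel::BgeBase`, `Backend::Candle`, `Backend::Onnx`, and `Backend::Mock` come from the description, while the remaining variant names, the struct fields, the `String` error type, and the validation rule in `build()` are assumptions.

```rust
/// Embedding models named in the PR; variant spellings other than
/// `BgeBase` are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EmbeddingModel { MiniLmL6, MiniLmL12, BgeSmall, BgeBase, E5Small, GteSmall }

impl EmbeddingModel {
    /// BGE-Base produces 768-dim vectors; the small models produce 384.
    pub fn embedding_dim(self) -> usize {
        match self {
            EmbeddingModel::BgeBase => 768,
            _ => 384,
        }
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend { Candle, Onnx, Mock }

/// Stand-in for the real engine: it only records its configuration.
#[derive(Debug)]
pub struct AgentEngine {
    pub db_path: Option<String>, // None means in-memory storage
    pub model: EmbeddingModel,
    pub backend: Backend,
}

#[derive(Debug, Default)]
pub struct AgentEngineBuilder {
    db_path: Option<String>,
    in_memory: bool,
    model: Option<EmbeddingModel>,
    backend: Option<Backend>,
}

impl AgentEngine {
    pub fn builder() -> AgentEngineBuilder {
        AgentEngineBuilder::default()
    }
}

impl AgentEngineBuilder {
    pub fn db_path(mut self, path: impl Into<String>) -> Self {
        self.db_path = Some(path.into());
        self.in_memory = false;
        self
    }

    pub fn in_memory(mut self) -> Self {
        self.db_path = None;
        self.in_memory = true;
        self
    }

    pub fn model(mut self, model: EmbeddingModel) -> Self {
        self.model = Some(model);
        self
    }

    pub fn backend(mut self, backend: Backend) -> Self {
        self.backend = Some(backend);
        self
    }

    pub fn mock(self) -> Self {
        self.backend(Backend::Mock)
    }

    /// Assumed rule: storage must be chosen explicitly. Defaults follow
    /// the PR description (BGE-Small model, Candle backend).
    pub fn build(self) -> Result<AgentEngine, String> {
        if self.db_path.is_none() && !self.in_memory {
            return Err("call db_path() or in_memory() before build()".to_string());
        }
        Ok(AgentEngine {
            db_path: self.db_path,
            model: self.model.unwrap_or(EmbeddingModel::BgeSmall),
            backend: self.backend.unwrap_or(Backend::Candle),
        })
    }
}
```

Each setter takes `self` by value and returns it, which is what allows the chained call shown under "After". Deferring validation to `build()` keeps the individual setters infallible.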
- Add EmbeddingModel and Backend enums to Python SDK
- Update AgentEngine constructor to accept model and backend parameters
- Add feature flags for candle-backend and onnx-backend
- Expose model(), backend(), and embedding_dim() methods
- Add comprehensive tests for new model configuration API

Python SDK now supports:
- Selecting embedding models (MiniLM-L6, MiniLM-L12, BGE-Small, BGE-Base, E5-Small, GTE-Small)
- Selecting inference backends (Candle, ONNX, Mock)
- Querying model properties (embedding_dim, mteb_score, hf_repo)

Example usage:

    from agent_state import AgentEngine, EmbeddingModel, Backend
    engine = AgentEngine(model=EmbeddingModel.BgeBase, backend=Backend.Onnx)