docs: FinanceBench eval results — 45% auto-score with Claude Opus 4.6

8f8b979
Open

upgrade: real LLM SDKs, multi-model tenants, RAG visualizer, 506-company FinanceBench demo, full production polish #34

There are no checks for this commit