20 changes: 20 additions & 0 deletions CITATION.cff
@@ -0,0 +1,20 @@
cff-version: 1.2.0
message: "If you use benchflow in your research, please cite it as below."
title: "BenchFlow: framework for RL environments for LLM agents"
abstract: "BenchFlow is a framework for building RL environments to evaluate and train LLM agents. Built on the Agent Client Protocol (ACP), it provides Scene-based multi-turn, multi-agent, and multi-model evaluation in shared sandboxes — without Docker Compose or sidecar containers. Supported use cases include interactive user simulation, code-review loops, bring-your-own-skill (BYOS) skill generation, multi-turn iterative refinement, cross-model review (cheap coder + strong reviewer), and stateful service tasks against live mock APIs (Gmail, Calendar, Docs, Drive, Slack). See docs/use-cases.md."
type: software
authors:
  - name: "BenchFlow team"
    website: "https://github.com/benchflow-ai/benchflow"
repository-code: "https://github.com/benchflow-ai/benchflow"
url: "https://github.com/benchflow-ai/benchflow"
license: Apache-2.0
version: 0.3.2
keywords:
  - benchmark
  - llm-agents
  - acp
  - agent-evaluation
  - multi-turn
  - terminal-bench
  - skillsbench
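
For reference, the metadata above maps fairly directly onto a BibTeX `@software` entry. The following is only a minimal sketch: `cff_to_bibtex` and the citation key `benchflow` are hypothetical names chosen for illustration, not part of BenchFlow or of any CFF tooling, and the field values are copied by hand from this diff rather than parsed from the file.

```python
# Hypothetical helper for illustration only; not part of BenchFlow or CFF tooling.
def cff_to_bibtex(meta: dict, key: str = "benchflow") -> str:
    """Render a few CITATION.cff fields as a BibTeX @software entry."""
    # Entity authors get an extra pair of braces so BibTeX keeps the name whole
    # instead of splitting it into first/last name parts.
    authors = " and ".join("{" + a["name"] + "}" for a in meta["authors"])
    fields = {
        "author": authors,
        "title": meta["title"],
        "version": meta["version"],
        "url": meta["url"],
        "license": meta["license"],
    }
    body = ",\n".join(f"  {k} = {{{v}}}" for k, v in fields.items())
    return f"@software{{{key},\n{body}\n}}"


# Values copied by hand from the CITATION.cff in this diff.
meta = {
    "title": "BenchFlow: framework for RL environments for LLM agents",
    "authors": [{"name": "BenchFlow team"}],
    "url": "https://github.com/benchflow-ai/benchflow",
    "license": "Apache-2.0",
    "version": "0.3.2",
}

print(cff_to_bibtex(meta))
```

Reading the fields from the file itself would need a YAML parser, which is left out here to keep the sketch dependency-free.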