Thanks for your interest in contributing! Here's how to get started.
git clone https://github.com/zenprocess/pawbench.git
cd pawbench
uv venv --python 3.12
source .venv/bin/activate
uv pip install -e ".[dev]"

Run the tests:

pytest tests/ -v

Scenarios are JSON files in `src/pawbench/scenarios/`. See the existing PawStyle scenarios for the format.

Requirements for a good scenario:
- Multi-turn (3+ turns per agent)
- Tool calls (write_file, read_file, run_command)
- Injected tool results between turns
- Quality expectations (`expect` block with measurable criteria)
- At least one variant should include a steering/nudge event
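
The checklist above can be sketched as a quick self-check in Python. The key names used here (`agents`, `turns`, `tool_calls`, `tool_result`, `expect`, `variants`, `events`) are assumptions made for illustration, not the actual PawBench schema; the existing files in src/pawbench/scenarios/ remain the reference for the real format.

```python
from typing import Any

# Tool names listed in the requirements above.
KNOWN_TOOLS = {"write_file", "read_file", "run_command"}


def check_scenario(scenario: dict[str, Any]) -> list[str]:
    """Return a list of checklist violations (empty means all checks pass).

    NOTE: every key name below is hypothetical, not the real PawBench schema.
    """
    problems: list[str] = []
    agents = scenario.get("agents", [])
    if not agents:
        problems.append("no agents defined")
    for agent in agents:
        name = agent.get("name", "<unnamed>")
        turns = agent.get("turns", [])
        # Multi-turn: at least 3 turns per agent.
        if len(turns) < 3:
            problems.append(f"{name}: fewer than 3 turns")
        # At least one call to a known tool somewhere in the conversation.
        tools = {call for turn in turns for call in turn.get("tool_calls", [])}
        if not tools & KNOWN_TOOLS:
            problems.append(f"{name}: no tool calls")
        # At least one turn carries an injected tool result.
        if not any("tool_result" in turn for turn in turns):
            problems.append(f"{name}: no injected tool results")
    if not scenario.get("expect"):
        problems.append("missing expect block")
    variants = scenario.get("variants", [])
    if not any(variant.get("events") for variant in variants):
        problems.append("no variant with a steering/nudge event")
    return problems
```

A helper like this is only a rough pre-review filter. It cannot judge whether the criteria in the `expect` block are actually measurable.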

For pull requests:

- Fork the repo and create a feature branch
- Add tests for new functionality
- Ensure `pytest tests/ -v` passes
- Submit a PR with a clear description

Code guidelines:

- Python 3.10+
- Type hints on all public functions
- No external dependencies beyond `aiohttp` and `requests`
- Keep scenarios self-contained (no external files or APIs needed)
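
As a rough illustration of the type-hint rule, the sketch below uses the standard `inspect` module to report which parameters of a function lack annotations. The helper `unannotated` and the two sample functions are hypothetical and are not part of PawBench.

```python
import inspect
from typing import Callable


def unannotated(func: Callable[..., object]) -> list[str]:
    """Return the names of parameters (plus 'return') that lack annotations."""
    sig = inspect.signature(func)
    missing = [
        name
        for name, param in sig.parameters.items()
        if param.annotation is inspect.Parameter.empty
    ]
    if sig.return_annotation is inspect.Signature.empty:
        missing.append("return")
    return missing


# Hypothetical example: fully hinted, as the guideline asks.
def example_typed(path: str, strict: bool = True) -> dict[str, object]:
    return {"path": path, "strict": strict}


# Hypothetical example: same behaviour, but no hints.
def example_untyped(path, strict=True):
    return {"path": path, "strict": strict}
```

Calling `unannotated(example_typed)` returns an empty list, while the untyped variant reports each parameter and the return value.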

When reporting an issue, please include:

- PawBench version (`pawbench --version` or `python -c "import pawbench; print(pawbench.__version__)"`)
- Endpoint type (vLLM, TGI, OpenAI, etc.)
- Model name
- Full error output
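
If neither version command is convenient, the version lines can also be gathered with a few lines of standard-library Python. This is a minimal sketch that assumes the installed distribution is named `pawbench`; the function name `report_header` is made up for this example.

```python
import sys
from importlib import metadata


def report_header(package: str = "pawbench") -> str:
    """Build the version lines for an issue report.

    Assumes the distribution name matches `package`; falls back to a
    placeholder when the package is not installed in this environment.
    """
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = "not installed"
    python_version = sys.version.split()[0]
    return f"{package} version: {version}\nPython: {python_version}"


if __name__ == "__main__":
    print(report_header())
```

The endpoint type, model name, and full error output still need to be added by hand.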