
Troubleshooting

Common issues and solutions for AgentEval.


Agent Import Errors

ValueError: agent_ref must use 'module:attr' format

The --agent flag expects module:function format:

# Wrong
agenteval run --suite suite.yaml --agent my_agent

# Correct
agenteval run --suite suite.yaml --agent my_module:run_agent
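The reference points at a plain Python function inside your module. A minimal sketch of what `my_module.py` might contain — the exact call convention is an assumption here (a function that takes the case input and returns the agent's output):

```python
# my_module.py -- a minimal agent entry point.
# Signature assumed: AgentEval passes the case input string and
# expects the agent's output string back.
def run_agent(prompt: str) -> str:
    """Echo agent used to illustrate the module:attr reference format."""
    return f"You said: {prompt}"
```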

ModuleNotFoundError: No module named 'my_module'

Ensure the module is importable from your current directory:

# Your agent file must be in the current directory or on PYTHONPATH
ls my_module.py  # Should exist

# Or install your package
pip install -e .

AttributeError: module 'my_module' has no attribute 'run_agent'

Check that the attribute name after the colon matches a function defined (or imported) in the module.
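You can reproduce the lookup by hand to see which side is wrong. This sketch resolves a module:attr string the way the error messages imply (it is an assumed reconstruction, not AgentEval's actual loader), using the standard-library `json:dumps` as a stand-in for your own module:

```python
import importlib

def resolve_agent_ref(agent_ref: str):
    """Split 'module:attr', import the module, and fetch the attribute.

    Raises ModuleNotFoundError if the module isn't importable and
    AttributeError if the attribute name is wrong -- the same two
    failures described above.
    """
    module_name, _, attr_name = agent_ref.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)

# Stand-in check against a module that certainly exists:
fn = resolve_agent_ref("json:dumps")
print(fn([1, 2]))  # -> [1, 2]
```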


Missing Dependencies

ImportError: Redis is required for distributed execution

Install the distributed extra:

pip install agentevalkit[distributed]

ImportError: scipy is required for statistical comparison

Install the stats extra for Welch's t-test:

pip install agentevalkit[stats]
# or: pip install scipy

AgentEval falls back to a pure-Python implementation if scipy is unavailable.
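The fallback need not match scipy exactly; the core statistic is simple enough to sketch in pure Python. This computes Welch's t statistic and the Welch–Satterthwaite degrees of freedom (the p-value step, which needs the t-distribution CDF, is where scipy earns its keep). This is an illustrative sketch, not AgentEval's actual fallback code:

```python
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and degrees of freedom for two samples.

    Uses sample variances (n-1 denominator) and does not assume the
    two groups share a variance, unlike Student's t-test.
    """
    m1, m2 = mean(a), mean(b)
    v1, v2 = variance(a), variance(b)
    n1, n2 = len(a), len(b)
    se2 = v1 / n1 + v2 / n2            # squared standard error of the difference
    t = (m1 - m2) / se2 ** 0.5
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df

t, df = welch_t([1, 2, 3], [2, 4, 6])
print(round(t, 3), round(df, 3))
```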

ImportError for adapter frameworks

Install the appropriate extra:

pip install agentevalkit[langchain]   # LangChain adapter
pip install agentevalkit[crewai]      # CrewAI adapter
pip install agentevalkit[autogen]     # AutoGen adapter

LLM Judge Grader

Error: OPENAI_API_KEY not set

The llm-judge grader requires an OpenAI API key (or a key for any OpenAI-compatible API):

export OPENAI_API_KEY=sk-...

You can also configure a custom API base in the grader config:

grader: llm-judge
grader_config:
  model: gpt-4o-mini
  api_base: https://your-api.com/v1

Compare Command

Error: Could not parse compare arguments

The compare command accepts two formats:

# Two single runs
agenteval compare RUN_ID_A RUN_ID_B

# Two groups (comma-separated, with 'vs')
agenteval compare RUN_A1,RUN_A2 vs RUN_B1,RUN_B2

Run IDs are the short hex IDs shown by agenteval list.
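The two accepted shapes are easy to mirror in a sketch. This is an assumed reconstruction of the argument handling, not AgentEval's actual parser, but it shows why anything other than these two shapes fails:

```python
def parse_compare_args(args):
    """Return (group_a, group_b) lists of run IDs from compare arguments.

    Accepts either two bare run IDs, or two comma-separated groups
    joined by the literal token 'vs'.
    """
    if "vs" in args:
        i = args.index("vs")
        group_a = [r for part in args[:i] for r in part.split(",")]
        group_b = [r for part in args[i + 1:] for r in part.split(",")]
    elif len(args) == 2:
        group_a, group_b = [args[0]], [args[1]]
    else:
        raise ValueError("Could not parse compare arguments")
    return group_a, group_b

print(parse_compare_args(["RUN_A1,RUN_A2", "vs", "RUN_B1,RUN_B2"]))
# -> (['RUN_A1', 'RUN_A2'], ['RUN_B1', 'RUN_B2'])
```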

Error: Run not found

Check available runs with:

agenteval list --limit 20

YAML Suite Errors

Error: Suite file not found

Ensure the path is correct:

agenteval run --suite ./suites/my_suite.yaml

Error: Invalid suite format

Check your YAML syntax. Common issues:

  • Missing name field
  • Missing cases list
  • Incorrect indentation
  • Using tabs instead of spaces

Minimal valid suite:

name: my-tests
agent: my_module:my_fn
cases:
  - name: test-1
    input: "Hello"
    expected:
      output_contains: ["hello"]
    grader: contains
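For reference, the contains grader's check amounts to substring matching against the agent output. The exact matching rules are an assumption in this sketch — in particular the case-insensitive fold, which is what would let an output like "Hello!" pass against expected ["hello"] in the suite above:

```python
def contains_grader(output: str, expected_substrings) -> bool:
    """Pass if every expected substring appears in the agent output.

    Case-insensitive here -- an assumption about the real grader,
    made so an output like "Hello!" matches expected ["hello"].
    """
    folded = output.lower()
    return all(s.lower() in folded for s in expected_substrings)

print(contains_grader("Hello!", ["hello"]))  # -> True
```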

Database Issues

sqlite3.OperationalError: unable to open database file

Check that the directory exists and is writable:

# Default location
ls -la agenteval.db

# Custom location
agenteval run --suite suite.yaml --db /path/to/results.db
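The usual cause is a missing parent directory: sqlite3 will create a missing database *file*, but raises OperationalError for a missing *directory*. A quick way to check and fix this from Python (the helper name is hypothetical; the demo uses a temporary directory so it runs anywhere):

```python
import os
import sqlite3
import tempfile

def open_results_db(path: str) -> sqlite3.Connection:
    """Open (or create) a results database, creating parent dirs first."""
    parent = os.path.dirname(os.path.abspath(path))
    os.makedirs(parent, exist_ok=True)  # the step sqlite3 won't do for you
    return sqlite3.connect(path)

# Demo against a nested path that doesn't exist yet:
demo_path = os.path.join(tempfile.mkdtemp(), "nested", "results.db")
conn = open_results_db(demo_path)
conn.close()
print(os.path.exists(demo_path))  # -> True
```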

Corrupted database

Delete the database file and re-run your evaluations (this discards all stored run history):

rm agenteval.db
agenteval run --suite suite.yaml

Redis / Distributed Execution

Workers not detected

Ensure workers are running and connected to the same Redis instance:

# Check Redis connectivity
redis-cli -u redis://localhost:6379 ping
# Should return: PONG

# Start a worker
agenteval worker --broker redis://localhost:6379 --agent my_module:my_fn

Redis authentication errors

Use an authenticated URL:

agenteval run --suite suite.yaml --workers redis://:password@host:6379

Security best practices

For production, use TLS-encrypted connections:

# Use rediss:// scheme for TLS
agenteval worker --broker rediss://:password@host:6380

# With custom CA certificate
export REDIS_CA_CERT=/path/to/ca.pem

CI Integration

Exit codes

  • 0 — All cases passed
  • 1 — One or more cases failed (or regressions detected with --fail-on-regression)
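Because failures surface through the exit code, most CI systems need no extra wiring: a failing suite fails the step. A sketch of a GitHub Actions step (workflow path, step name, and suite path are illustrative):

```yaml
# .github/workflows/eval.yml (fragment -- names and paths assumed)
- name: Run evaluation suite
  run: |
    pip install agentevalkit
    agenteval run --suite suites/ci_suite.yaml
  # A non-zero exit (any failed case) fails this step automatically.
```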

GitHub PR comments not posting

Check your token permissions:

export GITHUB_TOKEN=ghp_...  # Needs 'pull_requests: write' permission
agenteval github-comment --run-id RUN_ID --repo owner/repo --pr 123

See docs/github-actions.md for full CI setup.