Add async timeout and retry logic to agent.act() and agent.think() #1

@simonlpaige

Problem

If an LLM provider hangs, BaseAgent.act() and BaseAgent.think() block indefinitely, and a slow provider stalls them for as long as it takes to respond. There is no timeout, retry, or fallback mechanism.

Proposed Solution

  • Add a configurable timeout_seconds parameter to BaseAgent (default: 120s)
  • Wrap provider calls in asyncio.wait_for()
  • Retry with exponential backoff (max 3 attempts) on transient errors (rate limits, timeouts)
  • Emit a structured log entry on each retry and on final failure so the pipeline can report which stage failed and why
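
A rough sketch of how these four points could fit together, offered as a starting point rather than the final design. The helper name `call_with_retry`, the `TransientProviderError` class, the `stage` label, and the log field names are all hypothetical; only `timeout_seconds`, `asyncio.wait_for()`, and the 3-attempt limit come from this issue.

```python
import asyncio
import json
import logging
import random

logger = logging.getLogger("agent")


class TransientProviderError(Exception):
    """Hypothetical marker for retryable failures such as rate limits."""


async def call_with_retry(make_call, *, stage, timeout_seconds=120.0,
                          max_attempts=3, base_delay=1.0):
    """Await make_call() with a per-attempt timeout and exponential backoff.

    make_call is a zero-argument callable returning a fresh awaitable,
    so every attempt issues a new provider request.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return await asyncio.wait_for(make_call(), timeout=timeout_seconds)
        except (asyncio.TimeoutError, TransientProviderError) as exc:
            final = attempt == max_attempts
            # One JSON line per failure so the pipeline can parse stage and cause.
            logger.warning(json.dumps({
                "event": "provider_call_failed" if final else "provider_call_retry",
                "stage": stage,
                "attempt": attempt,
                "max_attempts": max_attempts,
                "error": type(exc).__name__,
            }))
            if final:
                raise
            # Backoff doubles each attempt (base_delay, 2x, 4x, ...) plus small jitter.
            delay = base_delay * 2 ** (attempt - 1)
            await asyncio.sleep(delay + random.uniform(0, delay * 0.1))
```

Inside BaseAgent.think(), the provider call might then become something like `await call_with_retry(lambda: self.provider.complete(prompt), stage="think", timeout_seconds=self.timeout_seconds)`, where `self.provider.complete` stands in for whatever the real provider interface is. Note that the timeout here applies per attempt, so the worst case is roughly 3 × timeout_seconds plus backoff delays; a single overall deadline would be a different design choice worth discussing.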

Bonus

Consider adding a fallback_provider option so agents can degrade gracefully (e.g., try Claude, fall back to local Ollama).
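
One possible shape for the fallback, as a sketch only. The function name `complete_with_fallback` and the `(name, coroutine function)` pair format are assumptions made for illustration; the real provider interface may look quite different.

```python
import asyncio


async def complete_with_fallback(prompt, providers, timeout_seconds=120.0):
    """Try each provider in order and return (name, response) from the first success.

    providers is a list of (name, async_callable) pairs; this pair format
    is hypothetical and not taken from the existing codebase.
    """
    errors = []
    for name, complete in providers:
        try:
            response = await asyncio.wait_for(complete(prompt), timeout=timeout_seconds)
            return name, response
        except Exception as exc:  # includes asyncio.TimeoutError
            errors.append(f"{name}: {type(exc).__name__}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Returning the provider name alongside the response lets the caller log which provider actually answered, which seems useful when an agent has silently degraded to the local model.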

Labels: enhancement
