Longer-TTL cache for sustained throughput #2

@Product-nomad

Description

v1 uses Anthropic's ephemeral cache (5-minute TTL). Rate-limit retries during a run can push the gap between calls past the TTL, so the cached prefix expires and the next call triggers a re-cache write. This was observed twice in the v4 run on a 71-obligation document. Moving to a longer-TTL cache tier (or restructuring the run so call cadence stays inside the TTL) should raise the hit rate on long runs from 97% to ~100%. Anthropic's docs note when extended-TTL caching is available; check current pricing before committing. Tracked for v2.
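
To gauge how often a run would fall outside the TTL, a rough sketch like the one below could be run over logged call timestamps. It is illustrative only: the helper name `count_ttl_misses` and the sample timestamps are hypothetical, and it assumes each call refreshes the cache entry, so only the gap to the previous call matters.

```python
from typing import Iterable

EPHEMERAL_TTL_SECONDS = 5 * 60  # 5-minute TTL, as stated in this issue


def count_ttl_misses(call_times: Iterable[float],
                     ttl: float = EPHEMERAL_TTL_SECONDS) -> int:
    """Count calls that arrive more than `ttl` seconds after the previous call.

    Assumption: every call refreshes the cached entry, so a call only
    misses when the gap since the immediately preceding call exceeds the TTL.
    """
    misses = 0
    previous = None
    for t in sorted(call_times):
        if previous is not None and t - previous > ttl:
            misses += 1
        previous = t
    return misses


# Hypothetical timestamps in seconds: a rate-limit retry stalls the run
# for 400 s between the third and fourth calls.
times = [0, 60, 120, 520, 580]
print(count_ttl_misses(times))            # one gap (120 -> 520) exceeds 300 s
print(count_ttl_misses(times, ttl=3600))  # no gap exceeds a one-hour TTL
```

Running the same timestamps against a longer `ttl` gives a quick estimate of how many re-cache writes a longer-TTL tier would avoid, which can then be weighed against current pricing.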

Metadata

    Labels

    enhancement (New feature or request), performance (cost / latency / throughput), v2 (enhancement scheduled for v2)
