Query.close() raises 'Attempted to exit cancel scope in a different task' during async-generator teardown #983

@matt-dresden-caylent

Summary

Query.close() (in claude_agent_sdk._internal.query) exits an anyio task group's cancel scope from a task other than the one that entered it. anyio enforces task-locality on cancel scopes, so during session teardown the mismatch raises RuntimeError: Attempted to exit cancel scope in a different task than it was entered in.

The teardown fires AFTER the SDK has yielded its final ResultMessage, so the orchestration work has already succeeded. The exception is raised in a background task (not propagated to consumer code), so it surfaces as [asyncio] ERROR Task exception was never retrieved -- a Python traceback emitted to stderr after the consumer's success message.

Reproducer environment:

  • claude-agent-sdk (Python) installed via uv into Python 3.14.
  • Consumer driver: caylent-solutions/devbench cmd_start at src/devbench/cli.py:6856 runs async for message in query(prompt=..., options=ClaudeAgentOptions(plugins=[{"type": "local", "path": ...}], permission_mode="bypassPermissions")).
  • The defect surfaces on every successful orchestrator session, deterministically.

Verbatim stack trace from a 2026-05-22 production run

2026-05-22T15:07:48Z [asyncio] ERROR Task exception was never retrieved
future: <Task finished name='Task-5' coro=<<async_generator_athrow without __name__>()>
exception=RuntimeError('Attempted to exit cancel scope in a different task than it was entered in')>
Traceback (most recent call last):
  File ".../claude_agent_sdk/_internal/client.py", line 142, in process_query
    yield message
GeneratorExit

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File ".../claude_agent_sdk/_internal/client.py", line 145, in process_query
    await query.close()
  File ".../claude_agent_sdk/_internal/query.py", line 666, in close
    await self._tg.__aexit__(None, None, None)
  File ".../anyio/_backends/_asyncio.py", line 794, in __aexit__
    return self.cancel_scope.__exit__(exc_type, exc_val, exc_tb)
  File ".../anyio/_backends/_asyncio.py", line 461, in __exit__
    raise RuntimeError(
RuntimeError: Attempted to exit cancel scope in a different task than it was entered in

Root cause analysis

Query uses anyio.create_task_group() in its async context manager. The task group's __aenter__ is called on Task A (the consumer's main task that opens the SDK). When the consumer stops iterating the query(...) async generator, process_query in client.py catches the resulting GeneratorExit (line 142) and then awaits query.close() (line 145). query.close() calls self._tg.__aexit__(None, None, None) at line 666.

By the time client.py:145 runs, process_query may be executing on a different task than the one that originally entered _tg.__aenter__ -- the async-generator close machinery in CPython runs the throw-into-generator on the task that triggered the close, not necessarily on the task that opened the generator.
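
The task switch described above can be observed with the standard library alone. The sketch below is not SDK code: numbers() is a hypothetical stand-in generator that records the name of the task that first drives it and the name of the task that runs its cleanup. It assumes CPython, where an abandoned async generator is finalized through the event loop's async-generator hook, which schedules aclose() on a new task.

```python
import asyncio
import gc


async def numbers(seen: dict):
    # Record which task first drives the generator.
    seen["entered_on"] = asyncio.current_task().get_name()
    try:
        yield 1
        yield 2
    finally:
        # Record which task ends up running the cleanup.
        seen["closed_on"] = asyncio.current_task().get_name()


async def main() -> dict:
    seen: dict = {}
    async for _ in numbers(seen):
        break  # abandon the generator without calling aclose()
    # Harmless on CPython; covers interpreters that do not free it immediately.
    gc.collect()
    # Give the event loop time to run the aclose() task it scheduled.
    await asyncio.sleep(0.05)
    return seen


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The two recorded task names differ, which is the same shape as the Task-5 / async_generator_athrow entry in the stack trace: cleanup code that exits a task-bound scope would run on the wrong task.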

The check in anyio (anyio/_backends/_asyncio.py:461) is a defensive guard, not the defect. The SDK side needs to do one of the following:

  1. Ensure __aenter__ and __aexit__ always run on the same task, for example by restructuring so that one dedicated task owns the task group for its whole lifetime. (anyio.from_thread / anyio.to_thread bridge threads rather than tasks, so they are unlikely to help here.)
  2. Use asyncio.shield(...) + explicit cancellation of child tasks instead of relying on anyio.create_task_group()'s cancel-scope semantics.
  3. Detect the cross-task close and skip __aexit__ when the entering task is gone (the cleanest fix; check asyncio.current_task() is self._entering_task before calling __aexit__).
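
As a rough illustration of the first option, the sketch below keeps enter and exit on one dedicated owner task and lets any other task request the close. It uses only asyncio. TaskBoundScope is a hypothetical stand-in that mimics the task-locality check, and OwnedScope is an illustrative name; neither is SDK or anyio code, and error handling during startup is omitted.

```python
import asyncio


class TaskBoundScope:
    """Hypothetical stand-in: must be exited by the task that entered it."""

    async def __aenter__(self):
        self._host = asyncio.current_task()
        return self

    async def __aexit__(self, *exc_info):
        if asyncio.current_task() is not self._host:
            raise RuntimeError("exited in a different task than it was entered in")
        return False


class OwnedScope:
    """Runs both enter and exit of the scope inside one owner task."""

    def __init__(self) -> None:
        self._ready = asyncio.Event()
        self._stop = asyncio.Event()
        self._owner = None

    async def _own(self) -> None:
        async with TaskBoundScope():
            self._ready.set()
            await self._stop.wait()  # park until some task requests close

    async def start(self) -> None:
        self._owner = asyncio.create_task(self._own())
        await self._ready.wait()

    async def close(self) -> None:
        # Safe from any task: only signals the owner and waits for it.
        self._stop.set()
        await self._owner


async def demo() -> tuple:
    # Naive approach: entered on this task, exited on another one.
    naive = TaskBoundScope()
    await naive.__aenter__()
    try:
        await asyncio.create_task(naive.__aexit__(None, None, None))
        naive_result = "no error"
    except RuntimeError as exc:
        naive_result = str(exc)

    # Owner-task approach: close is requested from a different task.
    owned = OwnedScope()
    await owned.start()
    await asyncio.create_task(owned.close())
    return naive_result, "closed cleanly"


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The naive path fails with the stand-in's RuntimeError while the owner-task path closes without error. Whether this shape fits Query's internals is for the SDK team to judge.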

I have not tested any of these fixes locally; the analysis is based on reading the stack trace and anyio's source. The SDK team is in a better position to choose the right approach.

Reproduction (minimal driver)

import asyncio
from claude_agent_sdk import ClaudeAgentOptions, query


async def main() -> None:
    async for message in query(
        prompt="say hello",
        options=ClaudeAgentOptions(permission_mode="bypassPermissions"),
    ):
        print(type(message).__name__, message)


asyncio.run(main())

After the final ResultMessage is yielded, the asyncio event loop emits the Task exception was never retrieved ERROR with the cancel-scope stack. (The orchestrator's [ORCHESTRATOR_TERMINAL_EXIT] line in our production case is just informational logging from the consumer; the stack appears even without it.)

Downstream impact

  • Successful orchestrator runs print a Python traceback under [asyncio] ERROR to stderr.
  • Remote execution environments (CI runners, sandboxed agent envs, log-aggregation pipelines) that classify any stderr ERROR line as a failure mis-classify the run.
  • Operators reading the tail of the orchestrator log see a stack trace AFTER the success message.

We are shipping a downstream workaround (a narrow asyncio-loop exception-handler filter that downgrades this exact signature to WARNING with a tracking-issue link) so our remote environments don't mis-classify, but the real fix is here. Once a release of claude-agent-sdk-python ships without this teardown error, we will pin to that version and remove the workaround.
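
For readers who need a similar stopgap, a filter of this kind can be sketched with loop.set_exception_handler from the standard library. This is an assumption about the general shape, not the devbench implementation: the logger name, function names, and message text are illustrative.

```python
import asyncio
import logging

# Substring of the exact error message quoted in the stack trace.
SIGNATURE = "Attempted to exit cancel scope in a different task"
TRACKING = "caylent-solutions/devbench#231"

# Hypothetical logger name, chosen for this sketch only.
log = logging.getLogger("devbench.teardown")


def is_known_teardown_error(context: dict) -> bool:
    """Return True only for the cancel-scope teardown RuntimeError."""
    exc = context.get("exception")
    return isinstance(exc, RuntimeError) and SIGNATURE in str(exc)


def install_teardown_filter(loop: asyncio.AbstractEventLoop) -> None:
    """Downgrade the known teardown error to WARNING; pass everything else on."""

    def handler(loop: asyncio.AbstractEventLoop, context: dict) -> None:
        if is_known_teardown_error(context):
            log.warning(
                "known SDK teardown error (%s): %s",
                TRACKING,
                context["exception"],
            )
            return
        # Anything else keeps asyncio's default reporting.
        loop.default_exception_handler(context)

    loop.set_exception_handler(handler)
```

A consumer would call install_teardown_filter(asyncio.get_running_loop()) at the start of its main coroutine. Matching on both the exception type and the message substring keeps the filter narrow, so unrelated RuntimeErrors are still reported at ERROR.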

Affected SDK code path

  • claude_agent_sdk/_internal/client.py:142-145 (process_query)
  • claude_agent_sdk/_internal/query.py:666 (Query.close)
  • Indirect: anyio/_backends/_asyncio.py:461 (CancelScope.__exit__ -- defensive guard, not the bug)

What I'd like

Any one of the following:

  1. A fix in Query.close() (or wherever the _tg is entered) so the close runs on the entering task.
  2. Confirmation that this is known + tracked + planned (with a link), so I can reference it from the downstream tracking issue.
  3. Guidance on whether the published SDK is intended to be safely used from an async for consumer loop in cmd_start-style code; if there's a recommended pattern that avoids this entirely, I will adopt it.

Environment

  • Python 3.14
  • claude-agent-sdk (please confirm the affected versions; reproduced against whatever the current uv sync pulled on 2026-05-22 -- the consumer pins claude-agent-sdk>=0.1.48).
  • anyio (current resolved version on 2026-05-22).

Downstream tracking

This bug is mirrored in two downstream issues in caylent-solutions/devbench:

  • caylent-solutions/devbench#231 -- tracking issue, stays open until this SDK bug is fixed and our pin advances.
  • caylent-solutions/devbench#232 -- workaround implementation in the consumer; closes once the warning-filter lands and #231 stays open as the upstream tracker.

Happy to test a candidate fix or PR against my reproducer if useful.

Labels: bug
