[follow-up #179] claude-agent-sdk runtime multimodal — wire images via AsyncIterable<SDKUserMessage>

## Context

In #179 M5b must-fix 2-C (commit `b875a16` on `feat/179-feishu-agent-sdk-channel`), the claude-agent-sdk runtime gained an `images?: string[]` parameter on `processWithClaude` for symmetry with the codex / grok branches. When `images` is non-empty, however, it downgrades to a text-only prompt and logs a warning, mirroring the Grok behavior already in the codebase.

The downgrade kept M5b's blast radius small: the proper multimodal fix changes the shape of the SDK call itself.

## What this issue tracks

Real multimodal wiring for claude-agent-sdk: send images as actual content blocks the LLM can see.

### Sketch

`claude-agent-sdk` `query({prompt})` accepts `prompt: string | AsyncIterable<SDKUserMessage>`. Each `SDKUserMessage` carries a `MessageParam` (Anthropic spec) whose `content` can be an array of blocks including:

```ts
{ type: "image", source: { type: "base64", media_type: "image/png", data: "<b64>" } }
{ type: "text",  text: "..." }
```

When `images` is non-empty, `processWithClaude` would switch from `query({prompt: task})` to `query({prompt: asyncIter([{type:"user", message:{role:"user", content:[textBlock, ...imageBlocks]}}])})`, where `asyncIter` stands for any helper that wraps an array in an `AsyncIterable`.
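A minimal sketch of that switch, under stated assumptions: the types below are simplified local stand-ins rather than imports from the SDK (the real `SDKUserMessage` may require more fields, so check the SDK's type definitions), and `EncodedImage`, `buildUserMessage`, and `buildPrompt` are hypothetical names. Images are assumed to be already read from disk and base64-encoded.

```typescript
// Simplified stand-ins for the SDK / Anthropic message types (assumption:
// the real SDKUserMessage may carry additional required fields).
type TextBlock = { type: "text"; text: string };
type ImageBlock = {
  type: "image";
  source: { type: "base64"; media_type: string; data: string };
};
type ContentBlock = TextBlock | ImageBlock;
type UserMessage = {
  type: "user";
  message: { role: "user"; content: ContentBlock[] };
};

// Hypothetical input shape: image bytes already base64-encoded by the caller.
type EncodedImage = { mediaType: string; base64: string };

// Builds one user message: the text block first, then one block per image.
function buildUserMessage(task: string, images: EncodedImage[]): UserMessage {
  const textBlock: TextBlock = { type: "text", text: task };
  const imageBlocks: ImageBlock[] = images.map((img) => ({
    type: "image",
    source: { type: "base64", media_type: img.mediaType, data: img.base64 },
  }));
  return {
    type: "user",
    message: { role: "user", content: [textBlock, ...imageBlocks] },
  };
}

// Wraps a fixed array in an AsyncIterable (the `asyncIter` of the sketch).
async function* asyncIter<T>(items: T[]): AsyncGenerator<T> {
  for (const item of items) {
    yield item;
  }
}

// Returns the plain string when there are no images, so the text-only
// call shape stays exactly `query({prompt: task})`.
function buildPrompt(
  task: string,
  images: EncodedImage[],
): string | AsyncIterable<UserMessage> {
  return images.length === 0
    ? task
    : asyncIter([buildUserMessage(task, images)]);
}
```

The result of `buildPrompt(task, images)` would then be passed as `prompt` to `query`. Branching on `images.length` inside one helper keeps the text-only path untouched, which is what the acceptance criteria ask for.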

### Blocking work to verify first

1. **Vendor multimodal support**: every Anthropic-compat endpoint the wizard currently lists (intern / MiniMax / mimo / deepseek / Anthropic native) needs a verification pass: does it accept image blocks? deepseek-v4-pro via `https://api.deepseek.com/anthropic` is the main user-facing concern; verify with a real curl request carrying a small base64 image before landing.
2. **Cross-runtime regression**: all current `processWithClaude` callers go through `think()`; flipping the prompt shape changes the SDK call signature for the *common* path, not just feishu. This needs a verification pass on commhub-inbox, /loop wakes, and the standalone agent-node smoke test.
3. **Memory / size**: base64-encoded image data lives in memory for the duration of the SDK call, at roughly 4/3 of the raw file size. Uploads are already capped at 12MB per file (#221), so a single image is bounded at about 16MB encoded; a multi-image message scales linearly with the image count. Confirm that no SDK-side size limit is hit.
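For the vendor check in item 1, the probe request could be assembled along these lines. This is a sketch, not a verified recipe: the `/v1/messages` path and the `x-api-key` / `anthropic-version` headers follow Anthropic's native Messages API, and whether each compat endpoint accepts the same path and auth header is itself an assumption to confirm. `buildImageProbe` and `ProbeRequest` are hypothetical names.

```typescript
// Plain description of an HTTP request, so it can be sent with fetch
// or translated into curl flags by hand.
type ProbeRequest = {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
};

// Builds a one-image "what's in this image?" probe.
// Assumption: the endpoint mirrors Anthropic's native Messages API layout.
function buildImageProbe(
  baseUrl: string,
  apiKey: string,
  model: string,
  pngBase64: string,
): ProbeRequest {
  return {
    // Strip trailing slashes so the joined URL has exactly one separator.
    url: `${baseUrl.replace(/\/+$/, "")}/v1/messages`,
    method: "POST",
    headers: {
      "content-type": "application/json",
      "x-api-key": apiKey,
      "anthropic-version": "2023-06-01",
    },
    body: JSON.stringify({
      model,
      max_tokens: 64,
      messages: [
        {
          role: "user",
          content: [
            {
              type: "image",
              source: { type: "base64", media_type: "image/png", data: pngBase64 },
            },
            { type: "text", text: "What's in this image?" },
          ],
        },
      ],
    }),
  };
}
```

A vendor that rejects the request, or that answers without referring to the image content, would be recorded as not supporting image blocks.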

### Acceptance criteria

- `processWithClaude(task, from, [imagePath])` with non-empty images → LLM sees the image and can describe / reason about it (verify with a "what's in this image?" probe).
- Existing text-only path byte-identical (`query({prompt: task})` shape preserved when `images` is empty).
- Per-vendor verification matrix recording which vendors accept image blocks, with warn-and-downgrade kept for vendors that don't.
- Refresh #179 quickstart doc — drop the "agent-sdk runtime currently sends text-only" caveat once landed.
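One way the verification matrix and the warn-and-downgrade rule could fit together is sketched below. The names (`ImageSupport`, `imageSupport`, `planPrompt`) and the vendor keys are hypothetical, and every entry is left as `"unverified"` because no vendor has been probed yet.

```typescript
type ImageSupport = "yes" | "no" | "unverified";

// Placeholder matrix: values are filled in only after a real probe per vendor.
const imageSupport: Record<string, ImageSupport> = {
  anthropic: "unverified",
  deepseek: "unverified",
  intern: "unverified",
  minimax: "unverified",
  mimo: "unverified",
};

type PromptPlan =
  | { mode: "text-only" }
  | { mode: "multimodal" }
  | { mode: "downgraded"; warning: string };

// Decides the prompt shape for one call. Only a vendor marked "yes" gets
// image blocks; "no", "unverified", and unknown vendors are downgraded.
function planPrompt(
  vendor: string,
  imageCount: number,
  matrix: Record<string, ImageSupport> = imageSupport,
): PromptPlan {
  if (imageCount === 0) {
    return { mode: "text-only" };
  }
  if (matrix[vendor] === "yes") {
    return { mode: "multimodal" };
  }
  return {
    mode: "downgraded",
    warning: `vendor "${vendor}" is not verified for image blocks; dropping ${imageCount} image(s)`,
  };
}
```

Treating `"unverified"` the same as `"no"` means a newly added vendor defaults to the current text-only behavior until someone runs the probe against it.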

## Refs

- #179 (Feishu channel umbrella)
- M5b commit `b875a16` (current warn-only impl)
- Related: Grok ACP `promptCapabilities.image=false` warn at `agent-node/src/cli.ts:1855` (same downgrade pattern, also a candidate for upgrade once Grok backend supports images)
