feat(kb-open): Deep Research Open API (start/SSE/status/cancel) #446

Open
ncw1992120 wants to merge 2 commits into mateaix:dev from ncw1992120:feat/kb-open-research

@ncw1992120

Contributor

Closes #443 · Part of #440 · Builds on #441 (P0-A)

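Changes

Async Deep Research Open API. Research is a multi-step LLM pipeline (plan → retrieve+draft → compose) that runs asynchronously and pushes progress over SSE.

4 endpoints

| Method | Path | Description |
|---|---|---|
| POST | /{kbId}/research | Start (returns sessionId + streamUrl) |
| GET | /{kbId}/research/{id}/stream | SSE progress stream (?token= for EventSource) |
| GET | /{kbId}/research/{id}/status | Query status / final report |
| POST | /{kbId}/research/{id}/cancel | Cancel a running session |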

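Components

  • KbOpenResearchController: 4 endpoints, @RequireKbScope("kb:search")
  • KbResearchSessionRegistry: in-memory session tracking that records keyId ownership (a caller can only query/cancel their own sessions)

Security

  • R7 (SSE auth): ?token= query param (KbOpenApiAuthFilter already supports this fallback, since EventSource cannot set the Authorization header)
  • Session ownership: status/cancel/stream all verify that the keyId matches
  • Cancel check: the session must be RUNNING (409 otherwise)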
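
The ownership and cancel rules listed above could look roughly like the sketch below. This is an illustration under assumptions, not the PR's code: `SessionGuard`, `Session`, and `Status` are hypothetical names, and returning 404 for an unknown or foreign session is an assumed choice (the description only specifies the 409 case).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of the ownership and cancel pre-checks. */
final class SessionGuard {
    enum Status { RUNNING, COMPLETED, FAILED, CANCELLED }

    record Session(String id, String keyId, Status status) {}

    private final Map<String, Session> sessions = new ConcurrentHashMap<>();

    void register(String id, String keyId) {
        sessions.put(id, new Session(id, keyId, Status.RUNNING));
    }

    /** Returns the HTTP status the cancel endpoint would send. */
    int cancel(String id, String callerKeyId) {
        int[] http = {404}; // assumption: unknown and foreign sessions look the same
        sessions.computeIfPresent(id, (k, s) -> {
            if (!s.keyId().equals(callerKeyId)) {
                return s; // not the owner: leave the session untouched
            }
            if (s.status() != Status.RUNNING) {
                http[0] = 409; // only RUNNING sessions can be cancelled
                return s;
            }
            http[0] = 200;
            return new Session(s.id(), s.keyId(), Status.CANCELLED);
        });
        return http[0];
    }
}
```

Doing the check and the state change inside one `computeIfPresent` call keeps the "is it RUNNING?" test and the transition atomic per session.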

Reuse

The underlying layer fully reuses the existing WikiResearchService.research() + ChatStreamTracker (SSE broadcasting); the research pipeline itself is unchanged.

Tests

Tests run: 23, Failures: 0, Errors: 0 (P0-A 17 + research 6)
BUILD SUCCESS
  • KbResearchSessionRegistryTest (6): register/complete/fail/cancel lifecycle, cancel-on-completed no-op, unknown session returns empty

Dependencies

This PR includes a cherry-pick of P0-A. If #444 (P0-A) merges first, only the 3 research files remain after a rebase.

…authz

Implements the authentication backbone for the KB Open API (mateaix#441):
API key lifecycle, a permitAll-path filter that rejects (never
pass-through), per-key sliding-window rate limiting, and a centralized
@RequireKbScope interceptor for scope + KB-ownership checks.

Components:
- TokenHashUtil: shared SHA-256 hash kernel (A4), reusable by PAT later
- KbApiKeyService: mint/authenticate/revoke/update + multi-KB binding
  (R3: empty binding = zero access, not "all KBs")
- KbOpenApiAuthFilter: sole gatekeeper for /api/v1/open/kb/** (R1: must
  return 401, no pass-through); R2: per-key rate limit (429)
- KbApiKeyRateLimiter: sliding-window limiter (TriggerRateLimiter pattern)
- @RequireKbScope + KbScopeInterceptor: centralized authorization (A1),
  scope check + kbId ownership from path variable
- KbApiKeyAdminController: JWT-authenticated CRUD (list/create/detail/
  update/revoke), workspace-scoped
- V162 migration (h2/mysql/kingbase): mate_kb_api_key + _binding tables

Security:
- mck_ prefix (distinct from PAT mc_ and JWT eyJ)
- SHA-256 hash storage, plaintext shown once at creation
- prefix column (4 chars) for UI display only

Tests (17 new, all green):
- KbApiKeyServiceTest: R3 empty-binding rejection, auth round-trip,
  expired/disabled/wrong-prefix rejection, kb:* wildcard, revoke
- KbApiKeyRateLimiterTest: sliding window, per-key isolation, recovery

Closes mateaix#441
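
For readers unfamiliar with the per-key sliding-window limiting mentioned in this commit, a minimal sketch follows. It is not the code of KbApiKeyRateLimiter, which is not shown here; `SlidingWindowLimiter` and its parameters are hypothetical, and the caller passes the clock value in so the behavior is easy to reason about.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sliding-window limiter: at most `limit` hits per key per window. */
final class SlidingWindowLimiter {
    private final int limit;
    private final long windowMillis;
    private final Map<String, Deque<Long>> hits = new ConcurrentHashMap<>();

    SlidingWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    /** Returns true if the request is allowed; false would map to HTTP 429. */
    boolean allow(String keyId, long nowMillis) {
        Deque<Long> timestamps = hits.computeIfAbsent(keyId, k -> new ArrayDeque<>());
        synchronized (timestamps) {
            // Drop hits that have slid out of the window.
            while (!timestamps.isEmpty() && nowMillis - timestamps.peekFirst() >= windowMillis) {
                timestamps.pollFirst();
            }
            if (timestamps.size() >= limit) {
                return false;
            }
            timestamps.addLast(nowMillis);
            return true;
        }
    }
}
```

Each key has its own queue of timestamps, so one key exhausting its budget does not affect another, and capacity recovers as old hits age out.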
Implements the async Deep Research endpoint for the KB Open API (mateaix#443).
Research is a multi-step LLM pipeline (plan → retrieve+draft → compose)
that runs asynchronously and broadcasts progress via SSE.

Endpoints:
- POST /{kbId}/research                      start (returns sessionId + streamUrl)
- GET  /{kbId}/research/{id}/stream          SSE progress (?token= for EventSource)
- GET  /{kbId}/research/{id}/status          query status / final report
- POST /{kbId}/research/{id}/cancel          cancel running session

Components:
- KbOpenResearchController: 4 endpoints, @RequireKbScope("kb:search")
- KbResearchSessionRegistry: in-memory session tracking with keyId
  ownership (a caller can only query/cancel their own sessions)

Security:
- R7: SSE uses ?token= query param (KbOpenApiAuthFilter already supports
  this fallback for EventSource which can't set Authorization headers)
- Session ownership: status/cancel/stream all verify keyId match
- Cancel checks session is RUNNING (409 otherwise)

Reuses existing WikiResearchService.research() + ChatStreamTracker for
the actual research pipeline and SSE broadcasting.

Tests (6 new, all green):
- KbResearchSessionRegistryTest: register/complete/fail/cancel lifecycle,
  cancel-on-completed no-op, unknown session returns empty

Closes mateaix#443
@mateaix

mateaix commented Jun 28, 2026

Owner

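Thanks for the Deep Research Open API 🙏 The auth work is well done: the path goes through permitAll + KbOpenApiAuthFilter as a single fail-closed gate, all four endpoints carry @RequireKbScope("kb:search"), and the jobId IDOR is already closed: requireSessionOwnership checks session.keyId().equals(ctx.keyId()), so A can neither see nor cancel B's session. The SSE pipeline (Utf8SseEmitter + 10min timeout + detach on onCompletion/onTimeout/onError) is also correct.

However, the async job layer has several blockers, and they matter a lot for a public endpoint that incurs real cost: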

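1. Cancel does not actually stop the job.
WikiResearchService.research() is a straight-through synchronous pipeline (plan → parallel draft → compose) that never checks streamTracker.isStopRequested(...) and has no interrupt flag. The cancel endpoint only flips the registry status to CANCELLED and broadcasts the SSE close; the background virtual thread still runs the LLM/web calls to the end, so "cancel" saves neither cost nor compute. This needs a cooperative cancellation signal (check a stop flag between research stages, or hold the Future and interrupt it).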
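
A minimal sketch of the stop-flag variant of cooperative cancellation, under assumptions: `CancellablePipeline` is a hypothetical stand-in for the research pipeline, the stage names are plain strings, and the `afterEachStage` hook exists only to make the example easy to exercise. It does not reflect how WikiResearchService is structured internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CancellationException;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical pipeline that checks a stop flag before every stage. */
final class CancellablePipeline {
    private final AtomicBoolean stopRequested = new AtomicBoolean(false);
    final List<String> executed = new ArrayList<>();

    /** Called by the cancel endpoint (possibly from another thread). */
    void requestStop() {
        stopRequested.set(true);
    }

    private void checkpoint(String nextStage) {
        if (stopRequested.get() || Thread.currentThread().isInterrupted()) {
            throw new CancellationException("stopped before " + nextStage);
        }
    }

    /** Runs the stages in order, aborting before the next one once a stop is requested. */
    String run(List<String> stages, Runnable afterEachStage) {
        for (String stage : stages) {
            checkpoint(stage);
            executed.add(stage); // stand-in for the real LLM/web call
            afterEachStage.run();
        }
        return String.join(" -> ", executed);
    }
}
```

With this shape a cancel takes effect at the next stage boundary, so the cost saved depends on how fine-grained the checkpoints are; interrupting a held Future would additionally cut short a call that is already blocked.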

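2. CANCELLED gets overwritten by COMPLETED/FAILED.
When the job finishes, sessionRegistry.complete(...) (computeIfPresent) rewrites the status unconditionally, so a user who cancels and then queries /status sees completed plus the full report. Please make CANCELLED a "sticky" terminal state, with complete/fail becoming a no-op once the session is CANCELLED.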
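
One way to get a sticky terminal state is to allow transitions only out of RUNNING, as in the sketch below. `StickyRegistry` and its `Session` record are hypothetical and much simpler than KbResearchSessionRegistry; the point is only the guard inside `computeIfPresent`.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry where every terminal state, including CANCELLED, is sticky. */
final class StickyRegistry {
    enum Status { RUNNING, COMPLETED, FAILED, CANCELLED }

    record Session(Status status, String report) {}

    private final Map<String, Session> sessions = new ConcurrentHashMap<>();

    void register(String id) {
        sessions.put(id, new Session(Status.RUNNING, null));
    }

    void cancel(String id) {
        transition(id, Status.CANCELLED, null);
    }

    void complete(String id, String report) {
        transition(id, Status.COMPLETED, report);
    }

    void fail(String id) {
        transition(id, Status.FAILED, null);
    }

    /** Only RUNNING sessions may change state; anything else is a no-op. */
    private void transition(String id, Status next, String report) {
        sessions.computeIfPresent(id,
                (k, s) -> s.status() == Status.RUNNING ? new Session(next, report) : s);
    }

    Optional<Session> find(String id) {
        return Optional.ofNullable(sessions.get(id));
    }
}
```

The same guard also covers the existing cancel-on-completed no-op case, since COMPLETED is no longer RUNNING.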

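3. The session registry grows without bound (memory leak).
KbResearchSessionRegistry.sessions is never cleaned up: there is no TTL, no scheduled cleanup, and no removal on completion, so every session lives until the JVM exits. Please add a TTL/scheduled cleanup or a capacity cap.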
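
A TTL sweep could be sketched as follows, assuming finished sessions record when they ended. `ExpiringSessions` and `Entry` are hypothetical names, and the 30-minute TTL in the usage note is an arbitrary example value, not something the design specifies.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry that evicts finished sessions once they are older than a TTL. */
final class ExpiringSessions {
    record Entry(boolean terminal, Instant finishedAt) {}

    private final Map<String, Entry> sessions = new ConcurrentHashMap<>();
    private final Duration ttl;

    ExpiringSessions(Duration ttl) {
        this.ttl = ttl;
    }

    void start(String id) {
        sessions.put(id, new Entry(false, null));
    }

    void finish(String id, Instant now) {
        sessions.computeIfPresent(id, (k, e) -> new Entry(true, now));
    }

    /** Removes finished sessions whose TTL has elapsed; running sessions are kept. */
    int evictExpired(Instant now) {
        int removed = 0;
        Iterator<Entry> it = sessions.values().iterator();
        while (it.hasNext()) {
            Entry e = it.next();
            if (e.terminal() && e.finishedAt().plus(ttl).isBefore(now)) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    int size() {
        return sessions.size();
    }
}
```

A periodic task (for example a scheduled executor) could call `evictExpired(Instant.now())` with a TTL such as `Duration.ofMinutes(30)`; a capacity cap would be a separate, complementary safeguard.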

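4. No per-key cap on concurrently running jobs (cost/DoS).
The filter only limits requests per minute (default 60/min, and start counts toward it), but not the number of research jobs running at the same time. A single key can launch about 60 multi-step LLM pipelines per minute, each fanning out sub-questions, all on an unbounded newVirtualThreadPerTaskExecutor. This is the main cost-explosion/DoS path for a public endpoint. Suggest adding a per-key cap on running jobs and returning 429 when it is exceeded. The design doc §9/§10 itself says research should come "with rate limiting + billing", and token billing (TokenUsageService) is also still missing here.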
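
A per-key cap on running jobs can be kept separate from the request-rate limiter. The sketch below is one possible shape, with hypothetical names (`PerKeyConcurrencyLimiter`, `tryAcquire`, `release`); the cap value is up to the operator.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical limiter on the number of jobs a single API key may have running. */
final class PerKeyConcurrencyLimiter {
    private final int maxRunning;
    private final Map<String, Integer> running = new ConcurrentHashMap<>();

    PerKeyConcurrencyLimiter(int maxRunning) {
        this.maxRunning = maxRunning;
    }

    /** Returns true and reserves a slot, or false if the key is at its cap (respond 429). */
    boolean tryAcquire(String keyId) {
        boolean[] acquired = {false};
        running.compute(keyId, (k, n) -> {
            int current = (n == null) ? 0 : n;
            if (current >= maxRunning) {
                return n; // at the cap: leave the count unchanged
            }
            acquired[0] = true;
            return current + 1;
        });
        return acquired[0];
    }

    /** Must be called when a job ends, whether it completed, failed, or was cancelled. */
    void release(String keyId) {
        running.computeIfPresent(keyId, (k, n) -> n <= 1 ? null : n - 1);
    }
}
```

The start endpoint would call `tryAcquire` before submitting the job and the job would call `release` in a `finally` block, so a crashed pipeline cannot leak a slot.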

5. Inline fully qualified names: SecurityConfig.java, WebMvcConfig.java, KbOpenResearchController.java (new java.util.LinkedHashMap<>()), and java.util.List.of(...) in the tests. Per the conventions, change these to an import + simple name.

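Non-blocking: the V162 migration header comment says V161 (and the number collides with #437; renumber it in step with P0-A); research reuses the kb:search scope but the design doc does not assign it a scope, so please add a line explaining this; kb-open-api-design.md should likewise be moved out of the repository root.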

Once P0-A at the base of the stack is fixed, please rebase this PR and add the job lifecycle/cost controls from items 1–4 above, and then we can merge 🙏

Development

Successfully merging this pull request may close these issues.

feat(kb-open): P1.5 Open up Deep Research (standalone design)