feat: run_recognition_direct & get_default_recognition_param #1069

MistEO · 2026-01-16T08:17:03Z

Summary by Sourcery

Add direct execution APIs for recognition and actions and expose default parameter retrieval for recognition and action types across core, agent, Python, and NodeJS bindings, with corresponding documentation and tests.

New Features:

Introduce MaaContextRunRecognitionDirect and MaaContextRunActionDirect to run recognition and actions directly by type and parameters without defining pipeline entries.
Expose MaaResourceGetDefaultRecognitionParam and MaaResourceGetDefaultActionParam to fetch default parameters for built-in recognition and action types from the default pipeline manager.
Add Python and NodeJS binding methods to call the new direct context execution APIs and to retrieve default recognition and action parameters from resources.

Enhancements:

Extend RemoteContext and RemoteResource to support the new direct execution and default-parameter retrieval capabilities.
Update agent messaging and client handlers to transport direct execution and default-parameter requests between client and server.
Document the new C APIs in the integrated interface overview and update contributor guidelines for adding new recognition/action algorithm types.

Tests:

Add Python binding and agent child tests covering direct recognition/action execution and default parameter retrieval from resources.

sourcery-ai

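Hey - I've found 5 issues, and left some high-level feedback:

In the NodeJS binding (ContextImpl::run_recognition_direct / run_action_direct), the maajs::EnvType parameter is unnamed in the definition but env is used in the body (e.g. in maajs::JsonStringify(env, ...)), which will not compile; give the parameter a name in the signature or stop using env in the body.
Context::run_recognition_direct / run_action_direct use std::format, but Context.cpp does not include <format> explicitly; consider adding the header to avoid relying on transitive includes and the build issues they can cause.

Prompt for AI Agents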

Please address the comments from this code review:

## Overall Comments
- In the NodeJS binding (`ContextImpl::run_recognition_direct` / `run_action_direct`), the `maajs::EnvType` parameter is unnamed in the definition but `env` is used in the body (e.g. in `maajs::JsonStringify(env, ...)`), which will not compile; give the parameter a name in the signature or stop using `env` inside.
- In `Context::run_recognition_direct` / `run_action_direct` you use `std::format` but `Context.cpp` does not include `<format>` explicitly; consider adding the header here to avoid relying on transitive includes and potential build issues.

## Individual Comments

### Comment 1
<location> `source/binding/NodeJS/src/apis/context.cpp:76-85` </location>
<code_context>
     });
 }

+maajs::PromiseType ContextImpl::run_recognition_direct(
+    maajs::ValueType self,
+    maajs::EnvType,
+    std::string reco_type,
+    maajs::ValueType reco_param,
+    maajs::ArrayBufferType image)
+{
+    auto buf = std::make_shared<ImageBuffer>();
+    buf->set(image);
+    auto param_str = maajs::JsonStringify(env, reco_param);
+    auto worker = new maajs::AsyncWork<MaaRecoId>(env, [context = context, reco_type, param_str, buf]() {
+        return MaaContextRunRecognitionDirect(context, reco_type.c_str(), param_str.c_str(), *buf);
+    });
</code_context>

<issue_to_address>
**issue (bug_risk):** The `env` parameter is unnamed but still used, causing a compilation error.

In both `run_recognition_direct` and `run_action_direct`, the parameter list declares `maajs::EnvType` without a name, but `env` is used inside (e.g., in `maajs::JsonStringify(env, ...)` and `maajs::AsyncWork`). This will not compile. Please give the parameter a name (e.g., `maajs::EnvType env`) or obtain the environment consistently with the other methods in this file.
</issue_to_address>

### Comment 2
<location> `source/binding/NodeJS/src/apis/context.cpp:97-106` </location>
<code_context>
+    });
+}
+
+maajs::PromiseType ContextImpl::run_action_direct(
+    maajs::ValueType self,
+    maajs::EnvType,
+    std::string action_type,
+    maajs::ValueType action_param,
+    MaaRect box,
+    std::string reco_detail)
+{
+    auto param_str = maajs::JsonStringify(env, action_param);
+    auto worker = new maajs::AsyncWork<MaaActId>(env, [context = context, action_type, param_str, box, reco_detail]() {
+        return MaaContextRunActionDirect(context, action_type.c_str(), param_str.c_str(), &box, reco_detail.c_str());
+    });
</code_context>

<issue_to_address>
**issue (bug_risk):** The same `env` naming issue appears in `run_action_direct` and needs to be fixed there as well.

Here `maajs::EnvType` is declared without a name, but `env` is used in the body. Add the parameter name (e.g., `maajs::EnvType env`) and ensure usage matches, otherwise this won’t compile.
</issue_to_address>

### Comment 3
<location> `test/python/binding_test.py:118-135` </location>
<code_context>
         reco_detail = context.run_recognition(entry, argv.image, ppover)
         print(f"  reco_detail: {reco_detail}")

+        # Test run_recognition_direct
+        reco_direct_detail = context.run_recognition_direct(
+            JRecognitionType.OCR, JOCR(), argv.image
+        )
+        print(f"  reco_direct_detail: {reco_direct_detail}")
+
+        # Test run_action_direct
+        action_direct_detail = context.run_action_direct(
+            JActionType.Click, JClick(), (100, 100, 50, 50), ""
+        )
+        print(f"  action_direct_detail: {action_direct_detail}")
+
         # Test clone and override
</code_context>

<issue_to_address>
**suggestion (testing):** Add assertions for `run_recognition_direct` / `run_action_direct` instead of only printing results

Right now these calls only print their results, so the tests don't actually validate behavior. Please add assertions that:
- The returned detail objects are non-`None` for valid inputs.
- `reco_direct_detail.hit` / `action_direct_detail.success` match the expected outcome for the test data.
Also add at least one negative-case assertion (e.g., invalid type or empty image) to confirm these methods return `None` when the underlying C API fails to start the flow.

```suggestion
        reco_detail = context.run_recognition(entry, argv.image, ppover)
        print(f"  reco_detail: {reco_detail}")

        # Test run_recognition_direct
        reco_direct_detail = context.run_recognition_direct(
            JRecognitionType.OCR, JOCR(), argv.image
        )
        print(f"  reco_direct_detail: {reco_direct_detail}")
        # Assert that run_recognition_direct returns non-None and agrees with run_recognition
        assert reco_direct_detail is not None, "run_recognition_direct should return detail for valid input"
        # Compare the direct and pipeline recognition results instead of hard-coding expectations
        assert getattr(reco_direct_detail, "hit", None) == getattr(
            reco_detail, "hit", None
        ), "run_recognition_direct.hit should match run_recognition.hit for the same input"

        # Test run_action_direct
        action_direct_detail = context.run_action_direct(
            JActionType.Click, JClick(), (100, 100, 50, 50), ""
        )
        print(f"  action_direct_detail: {action_direct_detail}")
        # Assert that run_action_direct returns non-None and that the action succeeded
        assert action_direct_detail is not None, "run_action_direct should return detail for valid input"
        assert getattr(
            action_direct_detail, "success", False
        ), "run_action_direct.success should be True for a valid click action"

        # Negative cases: confirm that None is returned when the underlying C API fails to start
        bad_reco_detail = context.run_recognition_direct(
            JRecognitionType.OCR, JOCR(), ""  # empty path / invalid image
        )
        assert (
            bad_reco_detail is None
        ), "run_recognition_direct should return None when given an invalid/empty image path"

        bad_action_detail = context.run_action_direct(
            JActionType.Click, JClick(), (0, 0, 0, 0), ""  # invalid rectangle
        )
        assert (
            bad_action_detail is None
        ), "run_action_direct should return None when the action cannot be started due to invalid parameters"

        # Test clone and override
        new_ctx = context.clone()
        new_ctx.override_pipeline({"TaskA": {}, "TaskB": {}})
```
</issue_to_address>

### Comment 4
<location> `test/agent/agent_child_test.py:107-116` </location>
<code_context>
         reco_detail = context.run_recognition(entry, argv.image, ppover)
         print(f"  reco_detail: {reco_detail}")

+        # Test run_recognition_direct
+        reco_direct_detail = context.run_recognition_direct(
+            JRecognitionType.OCR, JOCR(), argv.image
+        )
+        print(f"  reco_direct_detail: {reco_direct_detail}")
+
+        # Test run_action_direct
+        action_direct_detail = context.run_action_direct(
+            JActionType.Click, JClick(), (100, 100, 50, 50), ""
+        )
+        print(f"  action_direct_detail: {action_direct_detail}")
+
         # Test clone and override
</code_context>

<issue_to_address>
**suggestion (testing):** Assert agent-side behavior for direct run APIs, not just that calls succeed

Right now this test only prints the results of `run_recognition_direct` and `run_action_direct`, so it won’t fail if the agent behavior regresses. Please add assertions that:
- `reco_direct_detail` / `action_direct_detail` are non-`None` for valid inputs.
- `reco_direct_detail.hit` and `action_direct_detail.success` match the expected outcome.
Also consider a negative-path case (e.g., invalid type) to verify the agent returns `None` when the server fails to start the recognition/action.

Suggested implementation:

```python
        # Test run_recognition_direct
        reco_direct_detail = context.run_recognition_direct(
            JRecognitionType.OCR, JOCR(), argv.image
        )
        print(f"  reco_direct_detail: {reco_direct_detail}")
        # Assert the return value and behavior of direct recognition
        assert reco_direct_detail is not None, "run_recognition_direct should return a detail object for valid input"
        assert hasattr(
            reco_direct_detail, "hit"
        ), "reco_direct_detail should have a 'hit' attribute"
        # Must match the regular pipeline recognition result, guarding against regressions in the agent implementation
        assert (
            reco_direct_detail.hit == reco_detail.hit
        ), "Direct recognition hit flag should match pipeline recognition"

        # Test run_action_direct
        action_direct_detail = context.run_action_direct(
            JActionType.Click, JClick(), (100, 100, 50, 50), ""
        )
        print(f"  action_direct_detail: {action_direct_detail}")
        # Assert the return value and behavior of the direct action
        assert action_direct_detail is not None, "run_action_direct should return a detail object for valid input"
        assert hasattr(
            action_direct_detail, "success"
        ), "action_direct_detail should have a 'success' attribute"
        assert (
            action_direct_detail.success is True
        ), "Direct click action should succeed for a valid region"

        # Negative path: use clearly invalid input to make sure the direct APIs return None when the agent cannot start the recognition/action
        invalid_reco_direct_detail = context.run_recognition_direct(
            JRecognitionType.OCR, JOCR(), "this_image_does_not_exist.png"
        )
        assert (
            invalid_reco_direct_detail is None
        ), "run_recognition_direct should return None when recognition cannot be started (invalid image/input)"

        invalid_action_direct_detail = context.run_action_direct(
            JActionType.Click, JClick(), (0, 0, 0, 0), ""
        )
        # An empty region is treated as invalid; the agent should be unable to run the action
        assert (
            invalid_action_direct_detail is None
        ), "run_action_direct should return None when action cannot be started (invalid region/input)"

```

Depending on the actual semantics of `run_recognition_direct` / `run_action_direct`, you may need to tweak the negative-path expectations:

1. If the implementation raises exceptions instead of returning `None` for invalid inputs, wrap the negative calls in `with pytest.raises(...)` instead of asserting `None`.
2. If a different kind of invalid input is required to trigger the "server fails to start" path (e.g., invalid `JRecognitionType` / `JActionType` or an invalid `context` state), adjust the two negative tests to use that condition while keeping the core assertion that the agent returns `None` (or raises a specific exception) in that case.
3. If `action_direct_detail` uses a different field name than `success` (e.g., `ok`, `status`), update the attribute access accordingly.
</issue_to_address>
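
The caveats above mostly come down to one open question: whether a failed start surfaces as `None` or as an exception. Until that is settled, a negative-path check could accept either. The following is a minimal sketch using only the standard library; `failed_to_start`, `returns_none`, `raises_error`, and `succeeds` are hypothetical names, and the actual bindings may behave differently.

```python
from typing import Any, Callable, Tuple, Type


def failed_to_start(
    call: Callable[[], Any],
    expected_errors: Tuple[Type[BaseException], ...] = (ValueError, RuntimeError),
) -> bool:
    """Return True if `call` signals failure by returning None or by raising an expected error."""
    try:
        result = call()
    except expected_errors:
        return True
    return result is None


# Hypothetical stand-ins for the two possible failure conventions and a success.
def returns_none() -> None:
    return None


def raises_error() -> None:
    raise ValueError("invalid recognition type")


def succeeds() -> dict:
    return {"hit": True}


assert failed_to_start(returns_none)
assert failed_to_start(raises_error)
assert not failed_to_start(succeeds)
```

In a test, the negative call would be wrapped in a lambda and passed to the helper. Once the real semantics are confirmed, it is probably better to tighten the check to a single convention so that an unexpected change in behavior still fails the test.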

### Comment 5
<location> `docs/en_us/2.2-IntegratedInterfaceOverview.md:727` </location>
<code_context>

 > Will not execute subsequent next steps.

+### MaaContextRunRecognitionDirect
</code_context>

<issue_to_address>
**suggestion (typo):** Avoid redundant phrasing in "subsequent next steps."

Use either “subsequent steps” or “next steps” to avoid the redundant phrasing.

```suggestion
> Will not execute subsequent steps.
```
</issue_to_address>

sourcery-ai · 2026-01-20T14:44:14Z

source/binding/NodeJS/src/apis/context.cpp

+maajs::PromiseType ContextImpl::run_recognition_direct(
+    maajs::ValueType self,
+    maajs::EnvType,
+    std::string reco_type,
+    maajs::ValueType reco_param,
+    maajs::ArrayBufferType image)
+{
+    auto buf = std::make_shared<ImageBuffer>();
+    buf->set(image);
+    auto param_str = maajs::JsonStringify(env, reco_param);


issue (bug_risk): The env parameter is unnamed but still used, causing a compilation error.

In both run_recognition_direct and run_action_direct, the parameter list declares maajs::EnvType without a name, but env is used in the body (e.g., in maajs::JsonStringify(env, ...) and maajs::AsyncWork). This will not compile. Please give the parameter a name (e.g., maajs::EnvType env) or obtain the environment the same way the other methods in this file do.

sourcery-ai · 2026-01-20T14:44:14Z

source/binding/NodeJS/src/apis/context.cpp

+maajs::PromiseType ContextImpl::run_action_direct(
+    maajs::ValueType self,
+    maajs::EnvType,
+    std::string action_type,
+    maajs::ValueType action_param,
+    MaaRect box,
+    std::string reco_detail)
+{
+    auto param_str = maajs::JsonStringify(env, action_param);
+    auto worker = new maajs::AsyncWork<MaaActId>(env, [context = context, action_type, param_str, box, reco_detail]() {


issue (bug_risk): The same env naming issue appears in run_action_direct and needs to be fixed there as well.

Here maajs::EnvType is declared without a name, but env is used in the body. Add the parameter name (e.g., maajs::EnvType env) and make sure the usage matches; otherwise this won't compile.

sourcery-ai · 2026-01-20T14:44:14Z

test/agent/agent_child_test.py

+        # 测试 run_recognition_direct
+        reco_direct_detail = context.run_recognition_direct(
+            JRecognitionType.OCR, JOCR(), argv.image
+        )
+        print(f"  reco_direct_detail: {reco_direct_detail}")
+
+        # 测试 run_action_direct
+        action_direct_detail = context.run_action_direct(
+            JActionType.Click, JClick(), (100, 100, 50, 50), ""
+        )


suggestion (testing): 针对直连运行的 API 断言代理侧行为，而不仅仅是调用成功

suggestion (testing): Assert agent-side behavior for direct run APIs, not just that calls succeed

Right now this test only prints the results of run_recognition_direct and run_action_direct, so it won’t fail if the agent behavior regresses. Please add assertions that:

reco_direct_detail / action_direct_detail are non-None for valid inputs.

reco_direct_detail.hit and action_direct_detail.success match the expected outcome.
Also consider a negative-path case (e.g., invalid type) to verify the agent returns None when the server fails to start the recognition/action.

Suggested implementation:

    # Test run_recognition_direct
    reco_direct_detail = context.run_recognition_direct(
        JRecognitionType.OCR, JOCR(), argv.image
    )
    print(f" reco_direct_detail: {reco_direct_detail}")

    # Assert the return value and behavior of direct recognition
    assert reco_direct_detail is not None, "run_recognition_direct should return a detail object for valid input"
    assert hasattr(
        reco_direct_detail, "hit"
    ), "reco_direct_detail should have a 'hit' attribute"
    # Stay consistent with the normal pipeline recognition result to catch regressions in the agent implementation
    assert (
        reco_direct_detail.hit == reco_detail.hit
    ), "Direct recognition hit flag should match pipeline recognition"

    # Test run_action_direct
    action_direct_detail = context.run_action_direct(
        JActionType.Click, JClick(), (100, 100, 50, 50), ""
    )
    print(f" action_direct_detail: {action_direct_detail}")

    # Assert the return value and behavior of the direct action
    assert action_direct_detail is not None, "run_action_direct should return a detail object for valid input"
    assert hasattr(
        action_direct_detail, "success"
    ), "action_direct_detail should have a 'success' attribute"
    assert (
        action_direct_detail.success is True
    ), "Direct click action should succeed for a valid region"

    # Negative path: use clearly invalid input to make sure the direct APIs
    # return None when the agent cannot start the recognition/action
    invalid_reco_direct_detail = context.run_recognition_direct(
        JRecognitionType.OCR, JOCR(), "this_image_does_not_exist.png"
    )
    assert (
        invalid_reco_direct_detail is None
    ), "run_recognition_direct should return None when recognition cannot be started (invalid image/input)"

    invalid_action_direct_detail = context.run_action_direct(
        JActionType.Click, JClick(), (0, 0, 0, 0), ""
    )
    # An empty region is treated as invalid; the agent should be unable to run this action
    assert (
        invalid_action_direct_detail is None
    ), "run_action_direct should return None when action cannot be started (invalid region/input)"

Depending on the actual semantics of run_recognition_direct / run_action_direct, you may need to tweak the negative-path expectations:

If the implementation raises exceptions instead of returning None for invalid inputs, wrap the negative calls in with pytest.raises(...) instead of asserting None.

If a different kind of invalid input is required to trigger the "server fails to start" path (e.g., invalid JRecognitionType / JActionType or an invalid context state), adjust the two negative tests to use that condition while keeping the core assertion that the agent returns None (or raises a specific exception) in that case.

If action_direct_detail uses a different field name than success (e.g., ok, status), update the attribute access accordingly.
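
The two negative-path conventions mentioned above (returning None versus raising) can be covered by one small helper. The sketch below is only an illustration under stated assumptions: `assert_rejected`, `fake_direct_returns_none`, and `fake_direct_raises` are hypothetical names standing in for the real `context.run_recognition_direct` / `context.run_action_direct` calls, whose actual failure semantics are not confirmed here.

```python
from typing import Any, Callable, Tuple, Type


def assert_rejected(
    call: Callable[[], Any],
    allowed_exceptions: Tuple[Type[BaseException], ...] = (),
) -> str:
    """Assert that `call` signals failure, either by returning None or by raising.

    Returns "none" or "raised" so a test can record which convention applied.
    """
    try:
        result = call()
    except allowed_exceptions:
        # Only the exception types the test explicitly allows count as rejection.
        return "raised"
    assert result is None, f"expected None for invalid input, got {result!r}"
    return "none"


# Hypothetical stand-ins for the direct-run methods (not the real binding).
def fake_direct_returns_none() -> None:
    return None


def fake_direct_raises() -> None:
    raise ValueError("invalid recognition type")
```

In a real test the lambda passed to the helper would wrap the actual binding call, and `allowed_exceptions` would stay empty if the binding is expected to return None rather than raise.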

sourcery-ai · 2026-01-20T14:44:14Z

docs/en_us/2.2-IntegratedInterfaceOverview.md


 Synchronously execute action logic for `entry`, returns action id. Returns `MaaInvalidId` on failure, or action id on success. You can get action details via `MaaTaskerGetActionDetail`.

 > Will not execute subsequent next steps.


suggestion (typo): Avoid redundant phrasing in "subsequent next steps."

Use either “subsequent steps” or “next steps” to avoid the redundant phrasing.

Suggested change

> Will not execute subsequent next steps.

> Will not execute subsequent steps.

Copilot

Pull request overview

This PR introduces direct recognition and action execution APIs that bypass pipeline entries, along with methods to retrieve default parameters for recognition and action types from the DefaultPipelineMgr.

Changes:

Added MaaContextRunRecognitionDirect and MaaContextRunActionDirect C APIs with corresponding implementations across Python, NodeJS, and agent protocol bindings
Added MaaResourceGetDefaultRecognitionParam and MaaResourceGetDefaultActionParam C APIs with full binding support
Extended agent protocol with new message types to support cross-process communication for the new APIs

Reviewed changes

Copilot reviewed 28 out of 30 changed files in this pull request and generated 1 comment.

File	Description
include/MaaFramework/Instance/MaaContext.h	Added C API declarations for direct recognition/action execution
include/MaaFramework/Instance/MaaResource.h	Added C API declarations for default parameter retrieval
source/Common/MaaContext.cpp	Implemented C API functions with proper validation and JSON parsing
source/Common/MaaResource.cpp	Implemented C API functions for default parameter retrieval
source/MaaFramework/Task/Context.cpp	Core implementation creating pipeline overrides with UUIDs for direct execution
source/MaaFramework/Resource/ResourceMgr.cpp	Comprehensive switch statements covering all recognition and action types for default parameter retrieval
source/include/Common/MaaTypes.h	Added pure virtual methods to MaaResource and MaaContext interfaces
source/include/MaaAgent/Message.hpp	Defined new message structures for agent protocol communication
source/MaaAgentServer/RemoteInstance/RemoteContext.cpp	Implemented remote context methods forwarding requests to agent client
source/MaaAgentServer/RemoteInstance/RemoteResource.cpp	Implemented remote resource methods for parameter retrieval
source/MaaAgentClient/Client/AgentClient.cpp	Added request handlers for new context and resource operations
source/binding/Python/maa/context.py	Exposed direct execution methods with proper parameter serialization using dataclasses
source/binding/Python/maa/resource.py	Added methods to retrieve and parse default parameters
source/binding/NodeJS/src/apis/context.cpp	Implemented async wrappers for direct execution methods
source/binding/NodeJS/src/apis/resource.cpp	Added synchronous parameter retrieval methods
source/binding/NodeJS/src/apis/context.d.ts	TypeScript type definitions for new context methods
source/binding/NodeJS/src/apis/resource.d.ts	TypeScript type definitions for new resource methods
source/modules/MaaFramework.cppm	Exported new C API functions in C++20 module
test/python/binding_test.py	Added tests for direct execution and default parameter retrieval
test/agent/agent_child_test.py	Verified agent protocol support for new APIs
docs/en_us/2.2-IntegratedInterfaceOverview.md	Documented new APIs with parameter descriptions and usage notes
AGENTS.md	Updated development guide to mention ResourceMgr.cpp updates for new types
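
The table describes the core implementation as creating pipeline overrides with UUIDs for direct execution. A minimal Python sketch of that idea follows, purely as an illustration: the function name `build_direct_override`, the `direct_` name prefix, and the override layout (a `recognition` object with `type` and `param` keys) are assumptions, not the actual C++ code in Context.cpp.

```python
import json
import uuid
from typing import Any, Dict, Tuple


def build_direct_override(reco_type: str, param: Dict[str, Any]) -> Tuple[str, str]:
    """Build a one-off pipeline override for a temporary, uniquely named node."""
    # A random UUID keeps the temporary node from colliding with real entries.
    node_name = f"direct_{uuid.uuid4().hex}"
    # Assumed layout: one node keyed by the generated name.
    override = {node_name: {"recognition": {"type": reco_type, "param": param}}}
    return node_name, json.dumps(override)


name, override_json = build_direct_override("OCR", {"expected": ["Start"]})
```

The point of the generated name is that the caller never has to define a pipeline entry up front; the temporary node exists only for the duration of the call.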

Copilot · 2026-01-20T14:44:20Z

source/binding/Python/maa/context.py

@@ -1,4 +1,5 @@
 import ctypes
+import dataclasses


Module 'dataclasses' is imported with both 'import' and 'import from'.

Suggested change

import dataclasses
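
The warning above concerns mixing `import dataclasses` with `from dataclasses import ...` in one module. As a sketch of keeping a single style, the snippet below uses only the module-qualified form; the `ClickParam` dataclass and its fields are hypothetical and do not reflect the real parameter types in maa/context.py.

```python
import dataclasses
import json
from typing import Tuple


@dataclasses.dataclass
class ClickParam:
    # Hypothetical fields, for illustration only.
    target: Tuple[int, int, int, int] = (0, 0, 0, 0)
    contact: int = 0


def to_json(param: ClickParam) -> str:
    """Serialize a dataclass instance via the module-qualified asdict."""
    return json.dumps(dataclasses.asdict(param))
```

Either import style works on its own; the review comment is only about not using both in the same file.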

MistEO added 3 commits January 16, 2026 16:16

feat: run_recognition_direct & get_default_recognition_param

1fdb1f4

fix: build error

b2c18b3

fix(resource): fix missing namespace and variant-call compile errors in ResourceMgr.cpp.

70767f8

MistEO changed the base branch from feat/55x to main January 20, 2026 14:39

MistEO marked this pull request as ready for review January 20, 2026 14:40

Copilot AI review requested due to automatic review settings January 20, 2026 14:40

Copilot started reviewing on behalf of MistEO January 20, 2026 14:40

sourcery-ai bot reviewed Jan 20, 2026

Copilot AI reviewed Jan 20, 2026

MistEO merged commit ca636c1 into main Jan 20, 2026
26 of 40 checks passed

MistEO deleted the feat/reco_direct branch January 20, 2026 14:52
