feat: run_recognition_direct & get_default_recognition_param #1069
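Hey - I've found 5 issues, and left some high-level feedback:
- In the NodeJS binding (`ContextImpl::run_recognition_direct` / `run_action_direct`), the `maajs::EnvType` parameter is unnamed in the definition but `env` is used in the body (e.g. in `maajs::JsonStringify(env, ...)`), which will not compile; give the parameter a name in the signature or stop using `env` inside.
- In `Context::run_recognition_direct` / `run_action_direct` you use `std::format` but `Context.cpp` does not include `<format>` explicitly; consider adding the header here to avoid relying on transitive includes and potential build issues.
Prompt for AI Agents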
Please address the comments from this code review:
## Overall Comments
- In the NodeJS binding (`ContextImpl::run_recognition_direct` / `run_action_direct`), the `maajs::EnvType` parameter is unnamed in the definition but `env` is used in the body (e.g. in `maajs::JsonStringify(env, ...)`), which will not compile; give the parameter a name in the signature or stop using `env` inside.
- In `Context::run_recognition_direct` / `run_action_direct` you use `std::format` but `Context.cpp` does not include `<format>` explicitly; consider adding the header here to avoid relying on transitive includes and potential build issues.
## Individual Comments
### Comment 1
<location> `source/binding/NodeJS/src/apis/context.cpp:76-85` </location>
<code_context>
});
}
+maajs::PromiseType ContextImpl::run_recognition_direct(
+ maajs::ValueType self,
+ maajs::EnvType,
+ std::string reco_type,
+ maajs::ValueType reco_param,
+ maajs::ArrayBufferType image)
+{
+ auto buf = std::make_shared<ImageBuffer>();
+ buf->set(image);
+ auto param_str = maajs::JsonStringify(env, reco_param);
+ auto worker = new maajs::AsyncWork<MaaRecoId>(env, [context = context, reco_type, param_str, buf]() {
+ return MaaContextRunRecognitionDirect(context, reco_type.c_str(), param_str.c_str(), *buf);
+ });
</code_context>
<issue_to_address>
**issue (bug_risk):** The `env` parameter is unnamed but still used, causing a compilation error.
In both `run_recognition_direct` and `run_action_direct`, the parameter list declares `maajs::EnvType` without a name, but `env` is used inside (e.g., in `maajs::JsonStringify(env, ...)` and `maajs::AsyncWork`). This will not compile. Please give the parameter a name (e.g., `maajs::EnvType env`) or obtain the environment consistently with the other methods in this file.
</issue_to_address>
### Comment 2
<location> `source/binding/NodeJS/src/apis/context.cpp:97-106` </location>
<code_context>
+ });
+}
+
+maajs::PromiseType ContextImpl::run_action_direct(
+ maajs::ValueType self,
+ maajs::EnvType,
+ std::string action_type,
+ maajs::ValueType action_param,
+ MaaRect box,
+ std::string reco_detail)
+{
+ auto param_str = maajs::JsonStringify(env, action_param);
+ auto worker = new maajs::AsyncWork<MaaActId>(env, [context = context, action_type, param_str, box, reco_detail]() {
+ return MaaContextRunActionDirect(context, action_type.c_str(), param_str.c_str(), &box, reco_detail.c_str());
+ });
</code_context>
<issue_to_address>
**issue (bug_risk):** The same `env` naming issue appears in `run_action_direct` and needs to be fixed there as well.
Here `maajs::EnvType` is declared without a name, but `env` is used in the body. Add the parameter name (e.g., `maajs::EnvType env`) and ensure usage matches, otherwise this won’t compile.
</issue_to_address>
### Comment 3
<location> `test/python/binding_test.py:118-135` </location>
<code_context>
reco_detail = context.run_recognition(entry, argv.image, ppover)
print(f" reco_detail: {reco_detail}")
+ # Test run_recognition_direct
+ reco_direct_detail = context.run_recognition_direct(
+     JRecognitionType.OCR, JOCR(), argv.image
+ )
+ print(f" reco_direct_detail: {reco_direct_detail}")
+
+ # Test run_action_direct
+ action_direct_detail = context.run_action_direct(
+     JActionType.Click, JClick(), (100, 100, 50, 50), ""
+ )
+ print(f" action_direct_detail: {action_direct_detail}")
+
# Test clone and override
</code_context>
<issue_to_address>
**suggestion (testing):** Add assertions for `run_recognition_direct` / `run_action_direct` instead of only printing results
Right now these calls only print their results, so the tests don't actually validate behavior. Please add assertions that:
- The returned detail objects are non-`None` for valid inputs.
- `reco_direct_detail.hit` / `action_direct_detail.success` match the expected outcome for the test data.
Also add at least one negative-case assertion (e.g., invalid type or empty image) to confirm these methods return `None` when the underlying C API fails to start the flow.
```suggestion
reco_detail = context.run_recognition(entry, argv.image, ppover)
print(f" reco_detail: {reco_detail}")
# Test run_recognition_direct
reco_direct_detail = context.run_recognition_direct(
    JRecognitionType.OCR, JOCR(), argv.image
)
print(f" reco_direct_detail: {reco_direct_detail}")
# Assert run_recognition_direct returns non-None and agrees with run_recognition
assert reco_direct_detail is not None, "run_recognition_direct should return detail for valid input"
# Compare direct and pipeline recognition results instead of hard-coding the expectation
assert getattr(reco_direct_detail, "hit", None) == getattr(
    reco_detail, "hit", None
), "run_recognition_direct.hit should match run_recognition.hit for the same input"
# Test run_action_direct
action_direct_detail = context.run_action_direct(
    JActionType.Click, JClick(), (100, 100, 50, 50), ""
)
print(f" action_direct_detail: {action_direct_detail}")
# Assert run_action_direct returns non-None and the action succeeded
assert action_direct_detail is not None, "run_action_direct should return detail for valid input"
assert getattr(
    action_direct_detail, "success", False
), "run_action_direct.success should be True for a valid click action"
# Negative cases: confirm None is returned when the underlying C API fails to start
bad_reco_detail = context.run_recognition_direct(
    JRecognitionType.OCR, JOCR(), ""  # empty path / invalid image
)
assert (
    bad_reco_detail is None
), "run_recognition_direct should return None when given an invalid/empty image path"
bad_action_detail = context.run_action_direct(
    JActionType.Click, JClick(), (0, 0, 0, 0), ""  # invalid rectangle
)
assert (
    bad_action_detail is None
), "run_action_direct should return None when the action cannot be started due to invalid parameters"
# Test clone and override
new_ctx = context.clone()
new_ctx.override_pipeline({"TaskA": {}, "TaskB": {}})
```
</issue_to_address>
### Comment 4
<location> `test/agent/agent_child_test.py:107-116` </location>
<code_context>
reco_detail = context.run_recognition(entry, argv.image, ppover)
print(f" reco_detail: {reco_detail}")
+ # Test run_recognition_direct
+ reco_direct_detail = context.run_recognition_direct(
+     JRecognitionType.OCR, JOCR(), argv.image
+ )
+ print(f" reco_direct_detail: {reco_direct_detail}")
+
+ # Test run_action_direct
+ action_direct_detail = context.run_action_direct(
+     JActionType.Click, JClick(), (100, 100, 50, 50), ""
+ )
+ print(f" action_direct_detail: {action_direct_detail}")
+
# Test clone and override
</code_context>
<issue_to_address>
**suggestion (testing):** Assert agent-side behavior for direct run APIs, not just that calls succeed
Right now this test only prints the results of `run_recognition_direct` and `run_action_direct`, so it won’t fail if the agent behavior regresses. Please add assertions that:
- `reco_direct_detail` / `action_direct_detail` are non-`None` for valid inputs.
- `reco_direct_detail.hit` and `action_direct_detail.success` match the expected outcome.
Also consider a negative-path case (e.g., invalid type) to verify the agent returns `None` when the server fails to start the recognition/action.
Suggested implementation:
```python
# Test run_recognition_direct
reco_direct_detail = context.run_recognition_direct(
    JRecognitionType.OCR, JOCR(), argv.image
)
print(f" reco_direct_detail: {reco_direct_detail}")
# Assert the return value and behavior of direct recognition
assert reco_direct_detail is not None, "run_recognition_direct should return a detail object for valid input"
assert hasattr(
    reco_direct_detail, "hit"
), "reco_direct_detail should have a 'hit' attribute"
# Must agree with the normal pipeline recognition result, guarding against regressions in the agent implementation
assert (
    reco_direct_detail.hit == reco_detail.hit
), "Direct recognition hit flag should match pipeline recognition"
# Test run_action_direct
action_direct_detail = context.run_action_direct(
    JActionType.Click, JClick(), (100, 100, 50, 50), ""
)
print(f" action_direct_detail: {action_direct_detail}")
# Assert the return value and behavior of the direct action
assert action_direct_detail is not None, "run_action_direct should return a detail object for valid input"
assert hasattr(
    action_direct_detail, "success"
), "action_direct_detail should have a 'success' attribute"
assert (
    action_direct_detail.success is True
), "Direct click action should succeed for a valid region"
# Negative path: use clearly invalid input to confirm the direct APIs return None when the agent cannot start the recognition/action
invalid_reco_direct_detail = context.run_recognition_direct(
    JRecognitionType.OCR, JOCR(), "this_image_does_not_exist.png"
)
assert (
    invalid_reco_direct_detail is None
), "run_recognition_direct should return None when recognition cannot be started (invalid image/input)"
invalid_action_direct_detail = context.run_action_direct(
    JActionType.Click, JClick(), (0, 0, 0, 0), ""
)
# An empty region is treated as invalid; the agent should be unable to run the action
assert (
    invalid_action_direct_detail is None
), "run_action_direct should return None when action cannot be started (invalid region/input)"
```
Depending on the actual semantics of `run_recognition_direct` / `run_action_direct`, you may need to tweak the negative-path expectations:
1. If the implementation raises exceptions instead of returning `None` for invalid inputs, wrap the negative calls in `with pytest.raises(...)` instead of asserting `None`.
2. If a different kind of invalid input is required to trigger the "server fails to start" path (e.g., invalid `JRecognitionType` / `JActionType` or an invalid `context` state), adjust the two negative tests to use that condition while keeping the core assertion that the agent returns `None` (or raises a specific exception) in that case.
3. If `action_direct_detail` uses a different field name than `success` (e.g., `ok`, `status`), update the attribute access accordingly.
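The first two points in the list above (returning `None` versus raising on failure) can be covered by a single helper while the real semantics are still being confirmed. The following is only a minimal sketch under stated assumptions: `FakeContext`, `FakeDetail`, and `start_failed` are hypothetical names invented for illustration, `RuntimeError` is an assumed exception type, and none of this reflects the actual behavior of the maa Python binding.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FakeDetail:
    """Hypothetical stand-in for the detail object a binding might return."""
    hit: bool


class FakeContext:
    """Hypothetical stub; the real Context comes from the Python binding."""

    def __init__(self, raise_on_failure: bool) -> None:
        self.raise_on_failure = raise_on_failure

    def run_recognition_direct(
        self, reco_type: str, param: dict, image: Optional[bytes]
    ) -> Optional[FakeDetail]:
        if image is None:  # simulate "the flow could not be started"
            if self.raise_on_failure:
                raise RuntimeError("recognition could not be started")
            return None
        return FakeDetail(hit=True)


def start_failed(call: Callable[[], object]) -> bool:
    """Return True if `call` signals failure by returning None or by raising."""
    try:
        return call() is None
    except RuntimeError:  # assumed exception type
        return True


# The same assertions hold for both failure conventions.
for ctx in (FakeContext(raise_on_failure=False), FakeContext(raise_on_failure=True)):
    ok = ctx.run_recognition_direct("OCR", {}, b"pixels")
    assert ok is not None and ok.hit
    assert start_failed(lambda: ctx.run_recognition_direct("OCR", {}, None))
```

Once the actual failure convention is known, the helper can be replaced by a plain `is None` assertion or by `pytest.raises(...)`, whichever matches.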
</issue_to_address>
### Comment 5
<location> `docs/en_us/2.2-IntegratedInterfaceOverview.md:727` </location>
<code_context>
> Will not execute subsequent next steps.
+### MaaContextRunRecognitionDirect
</code_context>
<issue_to_address>
**suggestion (typo):** Avoid redundant phrasing in "subsequent next steps."
Use either “subsequent steps” or “next steps” to avoid the redundant phrasing.
```suggestion
> Will not execute subsequent steps.
```
</issue_to_address>
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| maajs::PromiseType ContextImpl::run_recognition_direct( | ||
| maajs::ValueType self, | ||
| maajs::EnvType, | ||
| std::string reco_type, | ||
| maajs::ValueType reco_param, | ||
| maajs::ArrayBufferType image) | ||
| { | ||
| auto buf = std::make_shared<ImageBuffer>(); | ||
| buf->set(image); | ||
| auto param_str = maajs::JsonStringify(env, reco_param); |
issue (bug_risk): The `env` parameter is unnamed but still used, causing a compilation error.
In both `run_recognition_direct` and `run_action_direct`, the parameter list declares `maajs::EnvType` without a name, but `env` is used inside (e.g., in `maajs::JsonStringify(env, ...)` and `maajs::AsyncWork`). This will not compile. Please give the parameter a name (e.g., `maajs::EnvType env`) or obtain the environment consistently with the other methods in this file.
| maajs::PromiseType ContextImpl::run_action_direct( | ||
| maajs::ValueType self, | ||
| maajs::EnvType, | ||
| std::string action_type, | ||
| maajs::ValueType action_param, | ||
| MaaRect box, | ||
| std::string reco_detail) | ||
| { | ||
| auto param_str = maajs::JsonStringify(env, action_param); | ||
| auto worker = new maajs::AsyncWork<MaaActId>(env, [context = context, action_type, param_str, box, reco_detail]() { |
issue (bug_risk): The same `env` naming issue appears in `run_action_direct` and needs to be fixed there as well.
Here `maajs::EnvType` is declared without a name, but `env` is used in the body. Add the parameter name (e.g., `maajs::EnvType env`) and ensure usage matches, otherwise this won’t compile.
| # Test run_recognition_direct | ||
| reco_direct_detail = context.run_recognition_direct( | ||
| JRecognitionType.OCR, JOCR(), argv.image | ||
| ) | ||
| print(f" reco_direct_detail: {reco_direct_detail}") | ||
| | ||
| # Test run_action_direct | ||
| action_direct_detail = context.run_action_direct( | ||
| JActionType.Click, JClick(), (100, 100, 50, 50), "" | ||
| ) |
suggestion (testing): Assert agent-side behavior for direct run APIs, not just that calls succeed
Right now this test only prints the results of `run_recognition_direct` and `run_action_direct`, so it won’t fail if the agent behavior regresses. Please add assertions that:
- `reco_direct_detail` / `action_direct_detail` are non-`None` for valid inputs.
- `reco_direct_detail.hit` and `action_direct_detail.success` match the expected outcome.
Also consider a negative-path case (e.g., invalid type) to verify the agent returns `None` when the server fails to start the recognition/action.
Suggested implementation:
# Test run_recognition_direct
reco_direct_detail = context.run_recognition_direct(
    JRecognitionType.OCR, JOCR(), argv.image
)
print(f" reco_direct_detail: {reco_direct_detail}")
# Assert the return value and behavior of direct recognition
assert reco_direct_detail is not None, "run_recognition_direct should return a detail object for valid input"
assert hasattr(
    reco_direct_detail, "hit"
), "reco_direct_detail should have a 'hit' attribute"
# Must agree with the normal pipeline recognition result, guarding against regressions in the agent implementation
assert (
    reco_direct_detail.hit == reco_detail.hit
), "Direct recognition hit flag should match pipeline recognition"
# Test run_action_direct
action_direct_detail = context.run_action_direct(
    JActionType.Click, JClick(), (100, 100, 50, 50), ""
)
print(f" action_direct_detail: {action_direct_detail}")
# Assert the return value and behavior of the direct action
assert action_direct_detail is not None, "run_action_direct should return a detail object for valid input"
assert hasattr(
    action_direct_detail, "success"
), "action_direct_detail should have a 'success' attribute"
assert (
    action_direct_detail.success is True
), "Direct click action should succeed for a valid region"
# Negative path: use clearly invalid input to confirm the direct APIs return None when the agent cannot start the recognition/action
invalid_reco_direct_detail = context.run_recognition_direct(
    JRecognitionType.OCR, JOCR(), "this_image_does_not_exist.png"
)
assert (
    invalid_reco_direct_detail is None
), "run_recognition_direct should return None when recognition cannot be started (invalid image/input)"
invalid_action_direct_detail = context.run_action_direct(
    JActionType.Click, JClick(), (0, 0, 0, 0), ""
)
# An empty region is treated as invalid; the agent should be unable to run the action
assert (
    invalid_action_direct_detail is None
), "run_action_direct should return None when action cannot be started (invalid region/input)"
Depending on the actual semantics of `run_recognition_direct` / `run_action_direct`, you may need to tweak the negative-path expectations:
1. If the implementation raises exceptions instead of returning `None` for invalid inputs, wrap the negative calls in `with pytest.raises(...)` instead of asserting `None`.
2. If a different kind of invalid input is required to trigger the "server fails to start" path (e.g., invalid `JRecognitionType` / `JActionType` or an invalid `context` state), adjust the two negative tests to use that condition while keeping the core assertion that the agent returns `None` (or raises a specific exception) in that case.
3. If `action_direct_detail` uses a different field name than `success` (e.g., `ok`, `status`), update the attribute access accordingly.
| Synchronously execute action logic for `entry`, returns action id. Returns `MaaInvalidId` on failure, or action id on success. You can get action details via `MaaTaskerGetActionDetail`. | ||
| | ||
| > Will not execute subsequent next steps. |
suggestion (typo): Avoid redundant phrasing in "subsequent next steps."
Use either “subsequent steps” or “next steps” to avoid the redundant phrasing.
| > Will not execute subsequent next steps. | |
| > Will not execute subsequent steps. |
Pull request overview
This PR introduces direct recognition and action execution APIs that bypass pipeline entries, along with methods to retrieve default parameters for recognition and action types from the DefaultPipelineMgr.
Changes:
- Added `MaaContextRunRecognitionDirect` and `MaaContextRunActionDirect` C APIs with corresponding implementations across Python, NodeJS, and agent protocol bindings
- Added `MaaResourceGetDefaultRecognitionParam` and `MaaResourceGetDefaultActionParam` C APIs with full binding support
- Extended agent protocol with new message types to support cross-process communication for the new APIs
Reviewed changes
Copilot reviewed 28 out of 30 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| include/MaaFramework/Instance/MaaContext.h | Added C API declarations for direct recognition/action execution |
| include/MaaFramework/Instance/MaaResource.h | Added C API declarations for default parameter retrieval |
| source/Common/MaaContext.cpp | Implemented C API functions with proper validation and JSON parsing |
| source/Common/MaaResource.cpp | Implemented C API functions for default parameter retrieval |
| source/MaaFramework/Task/Context.cpp | Core implementation creating pipeline overrides with UUIDs for direct execution |
| source/MaaFramework/Resource/ResourceMgr.cpp | Comprehensive switch statements covering all recognition and action types for default parameter retrieval |
| source/include/Common/MaaTypes.h | Added pure virtual methods to MaaResource and MaaContext interfaces |
| source/include/MaaAgent/Message.hpp | Defined new message structures for agent protocol communication |
| source/MaaAgentServer/RemoteInstance/RemoteContext.cpp | Implemented remote context methods forwarding requests to agent client |
| source/MaaAgentServer/RemoteInstance/RemoteResource.cpp | Implemented remote resource methods for parameter retrieval |
| source/MaaAgentClient/Client/AgentClient.cpp | Added request handlers for new context and resource operations |
| source/binding/Python/maa/context.py | Exposed direct execution methods with proper parameter serialization using dataclasses |
| source/binding/Python/maa/resource.py | Added methods to retrieve and parse default parameters |
| source/binding/NodeJS/src/apis/context.cpp | Implemented async wrappers for direct execution methods |
| source/binding/NodeJS/src/apis/resource.cpp | Added synchronous parameter retrieval methods |
| source/binding/NodeJS/src/apis/context.d.ts | TypeScript type definitions for new context methods |
| source/binding/NodeJS/src/apis/resource.d.ts | TypeScript type definitions for new resource methods |
| source/modules/MaaFramework.cppm | Exported new C API functions in C++20 module |
| test/python/binding_test.py | Added tests for direct execution and default parameter retrieval |
| test/agent/agent_child_test.py | Verified agent protocol support for new APIs |
| docs/en_us/2.2-IntegratedInterfaceOverview.md | Documented new APIs with parameter descriptions and usage notes |
| AGENTS.md | Updated development guide to mention ResourceMgr.cpp updates for new types |
| @@ -1,4 +1,5 @@ | |||
| import ctypes | |||
| import dataclasses | |||
Copilot
AI
Jan 20, 2026
Module 'dataclasses' is imported with both 'import' and 'import from'.
| import dataclasses |
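One way to resolve this kind of warning is to keep a single import style for the module. The sketch below assumes the plain `import dataclasses` line is the one being dropped and that the file already uses the `from dataclasses import ...` form; `ExampleClickParam` and `to_param_json` are hypothetical names for illustration and are not the binding's real `JClick` or its serialization code.

```python
import json
from dataclasses import asdict, dataclass  # single import style for the module


@dataclass
class ExampleClickParam:
    """Hypothetical parameter object, not the binding's real JClick."""
    target: bool = True
    target_offset: tuple = (0, 0, 0, 0)


def to_param_json(param: ExampleClickParam) -> str:
    # asdict is used directly, so no separate `import dataclasses` is needed.
    return json.dumps(asdict(param))


print(to_param_json(ExampleClickParam()))
```

Either style works on its own; the point is only that mixing both for the same module is redundant.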
Summary by Sourcery
Add direct execution APIs for recognition and actions and expose default parameter retrieval for recognition and action types across core, agent, Python, and NodeJS bindings, with corresponding documentation and tests.
New Features:
- `MaaContextRunRecognitionDirect` and `MaaContextRunActionDirect`, which run recognition and actions directly from a type and parameters, without defining a pipeline entry.
- `MaaResourceGetDefaultRecognitionParam` and `MaaResourceGetDefaultActionParam`, which fetch the default parameters of built-in recognition and action types from the default pipeline manager.
Enhancements:
- `RemoteContext` and `RemoteResource`, to support the new direct execution and default parameter retrieval capabilities.
Tests: