-
Notifications
You must be signed in to change notification settings - Fork 0
Check Catalog
Agents Shipgate ships 80+ built-in checks across ~19 categories. Every check is deterministic, static, and exits cleanly on missing data. Default severities can be overridden per repo via checks.severity_overrides.
This page curates the foundational categories. For the complete, always-current list, browse it from the CLI (below) or read docs/checks.md.
You can also browse the live catalog from the CLI:
agents-shipgate list-checks
agents-shipgate list-checks --json
agents-shipgate explain SHIP-POLICY-APPROVAL-MISSING| Category | Checks | Covers |
|---|---|---|
inventory |
4 | Enumerability, wildcard exposure, surface size, low-confidence production surfaces |
schema |
3 | Broad free-text params, missing numeric bounds, free-form output |
auth |
4 | Missing/broad scopes, scope-coverage gaps |
scope |
2 | Tools outside declared purpose, prohibited tools present |
policy |
2 | Missing approval / confirmation policies |
side_effects |
1 | Missing idempotency on risky writes |
evidence |
4 | HITL evidence: approval traces, override reasons, high-risk exclusions, promotion criteria |
security |
3 | Injection / secret / sensitive-data exposure in the surface |
manifest |
5 | Stale suppressions/policies/overrides, missing owners, unused scopes |
baseline |
3 | Baseline drift and integrity |
documentation |
1 | Missing / too-short descriptions |
action_surface |
11 | Base→head action-surface diff: new/removed actions, scope & effect escalations, removed controls |
verify |
6 | Verifier-cycle trust-root checks (SHIP-VERIFY-*) — policy/baseline/CI/instruction weakening routed to human review |
api |
11 | OpenAI API artifacts: schema strictness, structured output, prompt/tool scope, operational readiness |
adk |
6 | Google ADK extraction (dynamic toolsets, callbacks, eval coverage) |
langchain |
2 | LangChain / LangGraph dynamic tool surfaces |
crewai |
2 | CrewAI dynamic tool surfaces |
codex_plugin |
6 | Codex plugin packages & marketplaces |
n8n |
5 | n8n workflow tool surfaces and credential stubs |
Counts move as checks are added; agents-shipgate list-checks --json is authoritative.
-
critical— strict CI exits20unless the finding is explicitly suppressed with a reason. -
high— requires human review; does not fail strict CI by default. Configure viaci.fail_on: [critical, high]. -
medium— review during release hardening. -
low/info— informational.
Suppressed findings (checks.ignore matched) keep their severity in the JSON report but are excluded from active counts and do not trigger CI failure.
| ID | Default | Fires when |
|---|---|---|
SHIP-INVENTORY-NOT-ENUMERABLE |
high | No tools were loaded from any source. The release gate fails closed. |
SHIP-INVENTORY-WILDCARD-TOOLS |
high | A source declares wildcard / all-tools exposure (wildcard: true in MCP). |
SHIP-INVENTORY-TOOL-SURFACE-TOO-LARGE |
medium | Tool count exceeds 50. |
SHIP-INVENTORY-LOW-CONFIDENCE-PRODUCTION-SURFACE |
high |
environment.target is production and at least one tool came from a low/medium-confidence extraction (typically the SDK static AST). |
| ID | Default | Fires when |
|---|---|---|
SHIP-DOC-MISSING-DESCRIPTION |
medium | Tool description is missing or shorter than 20 chars. |
SHIP-DOC-INJECTION-RISK |
medium · high if multi-match on write | Description contains instruction-override-like phrases (ignore previous instructions, you are now the system, etc). |
SHIP-DOC-SECRET-IN-DESCRIPTION |
medium · high if multi-match on write | Description matches secret patterns: sk-…, ghp_…, AKIA…, or `password |
| ID | Default | Fires when |
|---|---|---|
SHIP-SCHEMA-BROAD-FREE-TEXT |
high | A write/action-like tool has a free-form parameter named action, body, command, content, instructions, message, prompt, update(s). |
SHIP-SCHEMA-MISSING-BOUNDS |
high | A risky numeric parameter (amount, count, limit, quantity, total, refund_amount, max(imum)) on a write tool lacks a maximum. |
SHIP-SCHEMA-FREEFORM-OUTPUT |
medium | Tool returns string (or SDK -> str) — output may flow back into model context. |
| ID | Default | Fires when |
|---|---|---|
SHIP-AUTH-MISSING-SCOPE |
high | Write or sensitive-data tool has no declared auth scopes. |
SHIP-AUTH-MANIFEST-BROAD-SCOPE |
high |
permissions.scopes contains *, admin, :*, or write-all. |
SHIP-AUTH-TOOL-BROAD-SCOPE |
high | A tool's own scope list contains a broad value. |
SHIP-AUTH-SCOPE-COVERAGE-MISSING |
high | Tool requires scopes not covered by permissions.scopes (after wildcard expansion). |
| ID | Default | Fires when |
|---|---|---|
SHIP-SCOPE-TOOL-OUTSIDE-PURPOSE |
high |
agent.declared_purpose reads as read-only (tokenized) and a write-capable tool is attached. |
SHIP-SCOPE-PROHIBITED-TOOL-PRESENT |
high | Tokens of a prohibited_actions entry overlap with a tool's name/description AND no mitigating policy is declared. |
| ID | Default | Fires when |
|---|---|---|
SHIP-POLICY-APPROVAL-MISSING |
critical | Tool has a high-risk tag (destructive, infrastructure_change, financial_action, code_execution) at ≥ medium confidence and is not in require_approval_for_tools. |
SHIP-POLICY-CONFIRMATION-MISSING |
high | Tool has destructive, external_write, or customer_communication at ≥ medium and is not in require_confirmation_for_tools. |
| ID | Default | Fires when |
|---|---|---|
SHIP-SIDEFX-IDEMPOTENCY-MISSING |
high · critical when retry policy is known | Tool is a write with financial_action / destructive / external_write, and lacks idempotency evidence — no idempotency_key parameter, no idempotentHint: true annotation, and not in require_idempotency_for_tools. |
| ID | Default | Fires when |
|---|---|---|
SHIP-API-FUNCTION-SCHEMA-STRICTNESS |
high (medium when low-risk) | OpenAI API function schema lacks strict: true, additionalProperties: false, complete required, or has unbounded risky fields. |
SHIP-API-STRUCTURED-OUTPUT-READINESS |
high (medium for under-spec) | No response format declared, or response schema is too broad / missing decision/status enums / missing refusal/needs_review/error modeling. |
SHIP-API-PROMPT-TOOL-SCOPE-MISMATCH |
high (medium for missing approval language) | Prompt says "advise only" / "read-only" while write tools are enabled, OR high-risk tools lack approval/confirmation language in the prompt. |
SHIP-API-OPERATIONAL-READINESS |
medium (high for retry+non-idempotent) | Missing retry policy, timeouts, simple test cases, or per-tool success/failure output schemas; or a trace sample shows a required-approval tool called without approved: true. |
The
apifamily has grown well beyond these four foundational checks — see Categories at a glance above, and runagents-shipgate list-checksoragents-shipgate explain <ID>for the full set.
| ID | Default | Fires when |
|---|---|---|
SHIP-MANIFEST-STALE-SUPPRESSION |
medium |
checks.ignore references an unknown check ID or a tool not loaded. |
SHIP-MANIFEST-STALE-POLICY |
medium | A policies.require_* entry names a tool not loaded. |
SHIP-MANIFEST-STALE-RISK-OVERRIDE |
medium |
risk_overrides.tools references a tool not loaded. |
SHIP-MANIFEST-HIGH-RISK-OWNER-MISSING |
high |
environment.target is production_like or production and a high-risk tool has no owner declared in risk_overrides.tools.{tool}.owner. |
SHIP-MANIFEST-UNUSED-SCOPE |
medium · high if broad |
permissions.scopes contains a scope not required by any loaded tool (and not covered by a wildcard). |
Checks consume risk hints with confidence thresholds. As of v0.2 the keyword classifier is tokenized, so "deploy" matches the standalone token but not the substring inside deployments. Plurals are explicit in the keyword sets where production scopes commonly use them (refund and refunds, cluster and clusters, etc.). See core/risk_hints.py for the source of truth.
| Tag | Triggered by |
|---|---|
read_only |
HTTP GET (high), MCP readOnlyHint: true (high), *_preview SDK functions (high), name/description tokens get/list/lookup/search/status/preview/view (medium) |
write |
HTTP POST/PUT/PATCH/DELETE (high), MCP destructiveHint: true (high), name tokens create/update/write/send/refund/cancel/delete/remove/charge/issue (medium) |
destructive |
HTTP DELETE (high), MCP destructiveHint: true (high), tokens cancel/delete/destroy/remove (medium) |
financial_action |
tokens refund(s)/payment(s)/charge(s)/invoice(s)/billing in name, description, or scopes |
customer_communication |
tokens email(s)/message(s)/sms
|
external_write |
customer-comms tag + tokens send/external/customer and not effectively read-only |
sensitive_data_access |
tokens ssn/pii/personal/secret(s)/credential(s)
|
code_execution |
tokens bash/command/execute/python/shell
|
infrastructure_change |
tokens aws/azure/cluster(s)/deploy/droplet(s)/gcp/kubernetes/terraform
|
To override these heuristics, use risk_overrides.tools — it always wins.
External packages can register checks via the agents_shipgate.checks Python entry point. Plugins are disabled by default and must be opted in:
AGENTS_SHIPGATE_ENABLE_PLUGINS=1 agents-shipgate scan
agents-shipgate scan --no-plugins # force off even if env is setThe report's loaded_plugins array enumerates every third-party check that ran, with distribution name and version. See Plugin Authoring for the contract.
Agents Shipgate · Apache-2.0 · maintained by Three Moons Lab · Report a false positive
Getting started
Reference
Workflows
Extending
Project