A universal engineering job contract system for Copilot-first and manual workflows with strict validation, bounded scope, and human review.
- Universal job contract: Executor-neutral job specifications for bounded engineering work
- Copilot-first workflow: Works with GitHub Copilot, manual workflows, or mock testing
- Package mode: Create work packages for Copilot/manual execution without tool-launched agents
- Strict validation: Fail-closed schema validation with clear error messages
- Artifact capture: Deterministic provenance tracking and human review artifacts
- Human review required: All workflows require human review before commit
- Not a finished multi-agent control plane
- Not a production platform, queue, or dashboard
- Not a control plane or provider fanout layer
- Not an autonomous coding system
- Not an auto-commit or auto-push tool
- Not a Copilot automation bypass
| Mode | Target/Executor | Tool Launches Agent? | Use Case |
|---|---|---|---|
| render | copilot | No | Generate Copilot-ready prompt |
| render | manual | No | Generate human work order |
| package | copilot | No | Create Copilot work package |
| package | manual | No | Create manual work package |
| run | mock | Yes | Test without external dependencies |
| report | any | No | Review run or package artifacts |
# Bootstrap a local agent-job command
./install_agent_job.sh
# Then verify
agent-job --help
One-command remote install:
curl -fsSL https://raw.githubusercontent.com/ngallodev-software/agent-job/main/install_agent_job_remote.sh | bash
Copilot-native skill install with GitHub CLI:
gh skill preview ngallodev-software/agent-job agent-job --allow-hidden-dirs
gh skill install ngallodev-software/agent-job agent-job --allow-hidden-dirs
That installs only the repo skill payload into Copilot's skill location. From inside that installed skill, Copilot can use the bundled installer script to bootstrap the agent-job CLI when needed.
Requirements: Python 3, PyYAML
Direct invocation still works if you do not want to install a command:
python3 ./agent-job/scripts/agent-job --help
For the Copilot model registry pipeline:
agent-job sync-models
This generates the current user's Copilot-specific registry at:
agent-job/references/copilot/available_models.copilot.jsonl
Customize preferences in:
agent-job/references/copilot/available-models.md
Then rerun:
agent-job sync-models
If you used the remote installer, the staged payload lives under ~/.local/share/agent-job by default:
agent-job sync-models
Advanced/manual fallback:
cd ~/.local/share/agent-job
npm install
npm run copilot:models:sync
Create a work package for GitHub Copilot:
# 1. Create a schema v2 job file (see examples/v2/)
# 2. Package for Copilot
agent-job package examples/v2/copilot-docs.job.yaml --target copilot
# 3. Open the generated prompt
cat runs/JOB-COPILOT-DOCS-001/*/prompt.copilot.md
# 4. Copy prompt to GitHub Copilot Chat or Copilot Workspace
# 5. Execute work in Copilot environment
# 6. Fill out report-template.md with results
# 7. Review diff and decide whether to commit
Output: runs/<job-id>/<timestamp>-copilot-package/
- prompt.copilot.md - Paste into Copilot
- checklist.md - Human review checklist
- report-template.md - Completion template
- job.input.yaml - Original job spec
- meta.json - Package metadata
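Before pasting the prompt into Copilot, it can help to confirm that a package directory contains every file listed above. The following is a minimal sketch, not part of agent-job: the function name `missing_package_files` is hypothetical, and the expected file names are taken from the output listing above.

```python
from pathlib import Path

# File names taken from the package output listing in this README.
EXPECTED_FILES = (
    "prompt.copilot.md",
    "checklist.md",
    "report-template.md",
    "job.input.yaml",
    "meta.json",
)


def missing_package_files(package_dir: Path) -> list[str]:
    """Return the expected package files that are absent from package_dir.

    Hypothetical helper for illustration; agent-job does not ship it.
    """
    return [name for name in EXPECTED_FILES if not (package_dir / name).is_file()]
```

An empty result means the package is complete; any names returned point to files to regenerate by rerunning the package command.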
The eval suite runs from a local checkout of this repository.
Install for Copilot:
gh skill preview ngallodev-software/agent-job agent-job --allow-hidden-dirs
gh skill install ngallodev-software/agent-job agent-job --allow-hidden-dirs
curl -fsSL https://raw.githubusercontent.com/ngallodev-software/agent-job/main/install_agent_job_remote.sh | bash
Run one eval task from the repo checkout:
python3 evals/copilot-run/self_check.py
TASK=01-docs-consistency
./agent-job/scripts/agent-job validate evals/copilot-run/tasks/$TASK/agent-job.job.yaml
./agent-job/scripts/agent-job package evals/copilot-run/tasks/$TASK/agent-job.job.yaml --target copilot
cat runs/*/*-copilot-package/prompt.copilot.md
Then use:
- evals/copilot-run/tasks/$TASK/baseline-prompt.md
- the generated runs/.../prompt.copilot.md
- evals/copilot-run/tasks/$TASK/evaluator-prompt.md
to run the baseline pass, the agent-job pass, and the comparison.
Create a human-readable work order:
agent-job package examples/v2/manual-refactor.job.yaml --target manual
Use the generated work order with any approved tool or agent.
Test the workflow without external dependencies:
agent-job run examples/v2/mock-test.job.yaml --executor mock
agent-job report runs/JOB-MOCK-TEST-001/*/
For live local Codex execution, keep using the legacy codex-job runtime:
codex-job validate examples/bugfix.job.yaml
codex-job run examples/bugfix.job.yaml
# Validate a job file
agent-job validate examples/v2/copilot-docs.job.yaml
# Render for specific target
agent-job render examples/v2/copilot-docs.job.yaml --target copilot
agent-job render examples/v2/manual-refactor.job.yaml --target manual
Schema v2 is executor-neutral with organized sections:
schema_version: 2
id: JOB-2026-05-03-001
title: Job title
repo_path: /absolute/path
branch: null
task:
  type: implementation | bugfix | refactor | test | docs | analysis
  objective: What to accomplish
  context: Background information
  constraints: [rules to follow]
  acceptance_criteria: [success conditions]
scope:
  allowed_paths: [writable paths]
  forbidden_paths: [forbidden paths]
execution:
  mode: agent | human | ci
  preferred_executor: copilot | human | codex | mock
  model: optional-model-id
  model_tier: very-low | low | medium | high
  allowed_executors: [list]
  disallowed_executors: [list]
  commands_allowed: [list]
  commands_forbidden: [list]
  test_commands: [list]
output_contract:
  require_summary: true
  require_changed_files: true
  require_tests_run: true
  require_risks: true
  human_review_required: true
provenance:
  distinguish_agent_claims: true
  require_changed_file_snapshot: true
  require_test_evidence: true
created_by: human
created_at: 2026-05-03T00:00:00Z
See examples/v2/ for complete examples and agent-job/README.md for detailed documentation.
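To illustrate what fail-closed validation of such a job could look like, here is a minimal sketch over an already-parsed job dictionary. It is not the validator in agent-job: the function name `validate_job`, the choice of which fields are required, and the rejection of unknown fields are all assumptions made for illustration. The sketch also ignores schema v1 auto-migration. Only the absolute `repo_path` rule and the field names come from this README.

```python
import posixpath

# Assumption: which top-level fields are mandatory is not stated in this README.
REQUIRED_FIELDS = {"schema_version", "id", "title", "repo_path", "task", "scope", "execution"}
# Every top-level field that appears in the schema v2 outline above.
KNOWN_FIELDS = REQUIRED_FIELDS | {
    "branch", "output_contract", "provenance", "created_by", "created_at",
}


def validate_job(job: dict) -> list[str]:
    """Return a list of error messages; an empty list means the job passed.

    Fail-closed here means anything missing or unrecognized is an error
    rather than being silently ignored.
    """
    errors = []
    if job.get("schema_version") != 2:
        errors.append("schema_version must be 2")
    for key in sorted(REQUIRED_FIELDS - job.keys()):
        errors.append(f"missing required field: {key}")
    for key in sorted(job.keys() - KNOWN_FIELDS):
        errors.append(f"unknown field: {key}")
    repo_path = job.get("repo_path")
    if "repo_path" in job and not (
        isinstance(repo_path, str) and posixpath.isabs(repo_path)
    ):
        errors.append("repo_path must be an absolute path")
    return errors
```

Returning every error at once, instead of stopping at the first, is one way to keep the messages clear for whoever has to fix the job file.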
Model selection rule:
- if execution.model is set, use it
- otherwise use execution.model_tier if provided
- otherwise choose the registry-backed default for Copilot packaging, preferring medium then low
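The precedence above can be sketched as a small function. This is an illustration under stated assumptions, not the tool's own code: the name `select_model` and the registry shape (a mapping from tier name to model id) are hypothetical, and returning `None` when a requested tier has no registry entry is a choice made for this sketch.

```python
from typing import Optional


def select_model(execution: dict, registry: dict[str, str]) -> Optional[str]:
    """Pick a model id using the precedence described in the rule above.

    `registry` is a hypothetical mapping of tier name -> model id.
    """
    # 1. An explicit model always wins.
    if execution.get("model"):
        return execution["model"]
    # 2. Otherwise honor the requested tier, if one was provided.
    tier = execution.get("model_tier")
    if tier:
        return registry.get(tier)  # assumption: None if the tier is unknown
    # 3. Otherwise fall back to the registry default: medium, then low.
    for fallback in ("medium", "low"):
        if fallback in registry:
            return registry[fallback]
    return None
```

For example, with a registry holding only a low-tier entry and an empty execution section, the sketch returns that low-tier model.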
Validate a job file:
agent-job validate <job.job.yaml>
Render job to target-specific prompt:
agent-job render <job.job.yaml> --target <copilot|manual>
Create work package without execution:
agent-job package <job.job.yaml> --target <copilot|manual>
Execute job via specified executor:
agent-job run <job.job.yaml> --executor <mock> [--dry-run]
Note: live codex execution is not yet implemented in agent-job. Use codex-job for Codex execution.
Print report for a run or package:
agent-job report <run-dir>
GitHub Copilot Chat and Copilot Workspace are the approved execution environments in many organizations. The agent-job tool:
- ✅ Creates Copilot-ready prompts with clear scope and constraints
- ✅ Provides completion templates for consistent reporting
- ✅ Enables provenance tracking even for external execution
- ❌ Does NOT automate or bypass Copilot (no fake execution)
- ❌ Does NOT require Codex installation or auth
Do not assume every user sees the same Copilot models.
The repo includes a user-specific model sync pipeline:
- fetch raw SDK models for the current user
- save the raw JSON
- apply local preference overrides
- emit available_models.copilot.jsonl for selection use
Canonical files:
- agent-job/references/copilot/available_models.copilot.jsonl
- agent-job/references/copilot/available-models.md
- agent-job/references/copilot/README.md
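The registry file uses the .jsonl extension, which conventionally means one JSON object per line. As a rough sketch of how such a file could be read, assuming that convention holds here: the function name `load_registry` is hypothetical, and this README does not specify the fields inside each record, so the sketch does not interpret them.

```python
import json


def load_registry(jsonl_text: str) -> list[dict]:
    """Parse JSON Lines text into a list of records, skipping blank lines.

    Hypothetical reader for illustration; the record fields are not
    documented in this README, so none are assumed.
    """
    records = []
    for line_number, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue
        record = json.loads(line)
        if not isinstance(record, dict):
            raise ValueError(f"line {line_number}: expected a JSON object")
        records.append(record)
    return records
```

A caller would pass the text of available_models.copilot.jsonl, for example via `Path(...).read_text()`, and inspect the returned records.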
- Create job file with preferred_executor: copilot
- Package: agent-job package job.yaml --target copilot
- Review prompt: Check runs/.../prompt.copilot.md for clarity
- Copy to Copilot: Paste into Copilot Chat or Workspace
- Execute: Let Copilot perform the work in an approved environment
- Document: Fill out report-template.md with Copilot's output
- Review diff: Verify changes meet acceptance criteria
- Commit decision: Human decides whether to commit
Package mode metadata (meta.json):
{
  "mode": "package",
  "target": "copilot",
  "launched_by_tool": false,
  "process_success": null,
  "exit_code": null
}
The tool is honest about what it did and didn't do.
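A reviewer could check that package metadata stays consistent with this claim. The sketch below is an illustration only: `check_package_meta` is a hypothetical helper, and the rule it encodes (package mode must record no launch, no process result, and no exit code) is inferred from the example metadata above rather than taken from the tool's source.

```python
import json


def check_package_meta(meta: dict) -> list[str]:
    """Return problems found in package-mode metadata (empty list if none)."""
    problems = []
    if meta.get("mode") == "package":
        # Package mode never launches an agent, so nothing can have run.
        if meta.get("launched_by_tool") is not False:
            problems.append("package mode must record launched_by_tool: false")
        for field in ("process_success", "exit_code"):
            if meta.get(field) is not None:
                problems.append(f"{field} must be null when no process was launched")
    return problems


meta = json.loads(
    '{"mode": "package", "target": "copilot", "launched_by_tool": false,'
    ' "process_success": null, "exit_code": null}'
)
print(check_package_meta(meta))  # prints []
```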
Schema v1 (codex-job format) is auto-migrated with warnings:
$ agent-job validate examples/bugfix.job.yaml
warning: schema v1 is deprecated; migrate to schema v2
valid: JOB-EXAMPLE-BUGFIX
See Migration Guide for details.
The codex-job CLI is the legacy Codex-specific runtime that still handles live Codex execution while agent-job focuses on Copilot/manual workflows.
| Feature | codex-job | agent-job |
|---|---|---|
| Identity | Codex-specific | Executor-neutral |
| Schema | v1 (flat) | v2 (structured) |
| Copilot support | No | Yes (package mode) |
| Manual support | No | Yes (package mode) |
| Provenance | claimed_by_codex | claimed_by_agent |
| Auth requirement | Always Codex | None for package/mock paths |
| Render targets | One (Codex) | Copilot and manual |
- Keep using codex-job for Codex execution
- Use agent-job for new Copilot/manual workflows
- Convert jobs from schema v1 → v2 (or use auto-migration)
- Test with mock: agent-job run job.yaml --executor mock
- Wait on agent-job Codex support until it is actually implemented
See Migration Guide for complete instructions.
The original codex-job CLI remains available:
# Install (if not already installed)
./install_codex_job_skill.sh --scope project
# Use codex-job commands
codex-job validate examples/bugfix.job.yaml
codex-job run examples/bugfix.job.yaml
Requirements: Python 3, PyYAML, Codex CLI (codex login)
See codex-job/ directory for legacy documentation.
The system distinguishes between:
- observed: Runner captured via git/fs/process
- declared_by_job: Job file specified
- claimed_by_agent: Agent reported (Copilot, Codex, mock, etc.)
- claimed_by_executor: Executor wrapper reported
- inferred: Derived from other data
- not_captured: Runner could not capture
- not_run: Not executed
- unknown: Indeterminate
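One way to model these labels is an enumeration attached to each recorded fact, so that claims can be separated from observations during review. This is a sketch under assumptions: the label values come from the list above, while the `Fact` record and the `claims` helper are hypothetical names that do not appear in agent-job.

```python
from dataclasses import dataclass
from enum import Enum


class Provenance(str, Enum):
    """Provenance labels, with values matching the list above."""
    OBSERVED = "observed"
    DECLARED_BY_JOB = "declared_by_job"
    CLAIMED_BY_AGENT = "claimed_by_agent"
    CLAIMED_BY_EXECUTOR = "claimed_by_executor"
    INFERRED = "inferred"
    NOT_CAPTURED = "not_captured"
    NOT_RUN = "not_run"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class Fact:
    """Hypothetical record pairing a reported value with its provenance."""
    name: str
    value: object
    provenance: Provenance


def claims(facts: list[Fact]) -> list[Fact]:
    """Return facts that were reported by an agent or executor, not observed."""
    reported = {Provenance.CLAIMED_BY_AGENT, Provenance.CLAIMED_BY_EXECUTOR}
    return [fact for fact in facts if fact.provenance in reported]
```

A review step could then list `claims(facts)` separately, so a human knows which statements still need checking against the diff.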
Package mode is honest about non-execution:
- launched_by_tool: false
- process_success: null
- No fabricated exit codes
- Strict fail-closed validation
- Absolute repo paths required
- Allowed/forbidden path enforcement (prompt-based)
- No auto-commit, no auto-push
- Human review required for all workflows
- Package mode honest about non-execution
- No shell callbacks or dangerous operations
- No Copilot automation bypass
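Because path enforcement is prompt-based, a reviewer may want an independent check of the changed files against the job's scope. The following is a minimal sketch, not something agent-job provides: `out_of_scope` is a hypothetical helper, and it assumes scope entries are repo-relative directory or file prefixes with forward slashes (not glob patterns) and that paths contain no `..` segments.

```python
from pathlib import PurePosixPath


def _is_under(path: PurePosixPath, root: PurePosixPath) -> bool:
    """True if path equals root or lies somewhere beneath it."""
    return path == root or root in path.parents


def out_of_scope(
    changed_files: list[str],
    allowed_paths: list[str],
    forbidden_paths: list[str],
) -> list[str]:
    """Return changed files that are forbidden or not covered by allowed_paths.

    Assumption: forbidden entries take priority over allowed entries.
    """
    violations = []
    for name in changed_files:
        path = PurePosixPath(name)
        forbidden = any(_is_under(path, PurePosixPath(p)) for p in forbidden_paths)
        allowed = any(_is_under(path, PurePosixPath(p)) for p in allowed_paths)
        if forbidden or not allowed:
            violations.append(name)
    return violations
```

The changed-file list could come from `git diff --name-only`; any file the function returns is worth a closer look before the commit decision.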
agent-job/ # Universal architecture
scripts/
agent-job, agent_job_cli.py, schema.py
executors/
base_executor.py, mock_executor.py, codex_executor.py
renderers/
base_renderer.py, copilot_renderer.py, manual_renderer.py, codex_renderer.py
codex-job/ # Legacy Codex-specific
scripts/codex-job, codex_job_cli.py
examples/
v2/ # Schema v2 examples
copilot-docs.job.yaml
manual-refactor.job.yaml
mock-test.job.yaml
*.job.yaml # Schema v1 examples (legacy)
See Architecture Documentation for details.
Schema v2 examples (recommended):
- examples/v2/copilot-docs.job.yaml - Copilot workflow
- examples/v2/manual-refactor.job.yaml - Manual workflow
- examples/v2/mock-test.job.yaml - Mock testing
Schema v1 examples (legacy, auto-migrated):
- examples/bugfix.job.yaml
- examples/refactor.job.yaml
- examples/docs.job.yaml
- agent-job README - Detailed agent-job documentation
- Migration Guide - Schema v1 → v2 migration
- Architecture - System architecture
- Safety Model - Safety and validation
- Phase A Report - Implementation details
Phase A limitations:
- agent-job Codex execution: Not yet implemented
- Completion ingestion: Not yet implemented
- Claude renderer: Not yet implemented
- Git integration: Partial (from codex-job, not fully migrated)
See Phase A Report for details.
See CONTRIBUTING.md for guidelines.
See SECURITY.md for security policies.
See LICENSE file.
For issues or questions, see the repository issue tracker or documentation.