2 changes: 1 addition & 1 deletion .github/workflows/autopilot-create-issue.yml
@@ -17,7 +17,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Create or update issue
uses: actions/github-script@v7
uses: actions/github-script@f28e40c7f34bde8b3046d885e986cb6290c5673b # v7
with:
script: |
const run = context.payload.workflow_run;
6 changes: 4 additions & 2 deletions .github/workflows/ci.yml
@@ -10,11 +10,13 @@ jobs:
lint:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
with:
python-version: "3.12"
- name: Syntax check
run: python -m py_compile agent/poll_once.py
- name: Import check
run: python -c "import agent.poll_once"
- name: Unit tests
run: python -m unittest discover -v
18 changes: 9 additions & 9 deletions .github/workflows/fixer.yml
@@ -1,4 +1,4 @@
name: CI Autopilot Fixer
name: CI Autopilot Fixer

on:
workflow_dispatch:
@@ -22,6 +22,12 @@ jobs:
GH_TOKEN: ${{ github.token }}
GITHUB_TOKEN: ${{ github.token }}
steps:
- name: Checkout clean workspace
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
clean: true
fetch-depth: 1

- name: Preflight diagnostics
run: |
New-Item -ItemType Directory -Force logs | Out-Null
@@ -36,12 +42,6 @@ jobs:
if (Get-Command gh -ErrorAction SilentlyContinue) { gh --version 2>&1 | Out-File -FilePath logs\preflight.log -Append }
"C:\\src\\ci-autopilot exists: $(Test-Path C:\\src\\ci-autopilot)" | Out-File -FilePath logs\preflight.log -Append

- name: Checkout
uses: actions/checkout@v4
with:
clean: false
fetch-depth: 1

- name: Python diagnostics
run: |
if (Get-Command python -ErrorAction SilentlyContinue) {
@@ -76,7 +76,7 @@ jobs:

- name: Upload logs
if: always()
uses: actions/upload-artifact@v4
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4
with:
name: ci-autopilot-logs
path: logs\*.log
path: logs\*.log
9 changes: 6 additions & 3 deletions .github/workflows/runner-health.yml
@@ -2,6 +2,8 @@ name: Runner Health Monitor

on:
workflow_dispatch:
schedule:
- cron: "*/15 * * * *"
Comment on lines +5 to +6

P2: Gate scheduled alerts on runner-monitor credentials

Adding this schedule activates the monitor every 15 minutes even when `RUNNER_PAT` is absent. In that default configuration, `GH_TOKEN` falls back to `github.token`, the runner-list request can fail due to insufficient runner-administration permission, and lines 26–27 convert that failure to `unknown`; the notification step then opens an issue and adds another comment on every scheduled run under the misleading "Runner offline" title. Gate scheduled notifications on a configured credential, or handle `unknown` as a configuration error without repeated offline alerts.
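
One way to express that gate, as a minimal Python sketch rather than the workflow's actual logic. The helper name `alert_kind`, its parameters, and the three return labels are hypothetical; the choice to still treat `unknown` as offline when a dedicated credential is present is also an assumption.

```python
def alert_kind(runner_status: str, event_name: str, has_runner_pat: bool) -> str:
    """Classify what the notification step should do (hypothetical helper).

    Returns one of "none", "offline", or "config-error".
    """
    if runner_status == "online":
        return "none"
    if runner_status == "unknown" and not has_runner_pat:
        # Without RUNNER_PAT the runner-list request may simply be denied,
        # so "unknown" says nothing about the runner itself. Report it once
        # on a manual run and stay quiet on the schedule.
        return "config-error" if event_name == "workflow_dispatch" else "none"
    # Assumption: any other non-online status is a real offline signal.
    return "offline"
```

With this shape, a scheduled run that lacks the credential produces no alert, while a manual dispatch surfaces the missing configuration instead of a "Runner offline" issue.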


permissions:
actions: read
@@ -14,6 +16,7 @@ jobs:
env:
GH_TOKEN: ${{ secrets.RUNNER_PAT || github.token }}
RUNNER_NAME: MyLocalPC
EMAIL_TO: ${{ secrets.EMAIL_TO }}
steps:
- name: Check runner status
id: status
@@ -59,8 +62,8 @@ jobs:
fi

- name: Email notification
if: steps.status.outputs.runner_status != 'online' && env.SMTP_SERVER != ''
uses: dawidd6/action-send-mail@v3
if: steps.status.outputs.runner_status != 'online' && env.SMTP_SERVER != '' && env.EMAIL_TO != ''
uses: dawidd6/action-send-mail@4226df7daafa6fc901a43789c49bf7ab309066e7 # v3
env:
SMTP_SERVER: ${{ secrets.SMTP_SERVER }}
SMTP_PORT: ${{ secrets.SMTP_PORT }}
@@ -77,5 +80,5 @@ jobs:
Runner: MyLocalPC
Repo: ${{ github.repository }}
Status: ${{ steps.status.outputs.runner_status }}
to: ogeonx@gmail.com
to: ${{ env.EMAIL_TO }}
from: ${{ env.EMAIL_FROM }}
4 changes: 4 additions & 0 deletions .gitignore
@@ -0,0 +1,4 @@
__pycache__/
*.py[cod]
.venv/
logs/
53 changes: 53 additions & 0 deletions .planning/audits/20260610-ci-autopilot-audit-fix.md
@@ -0,0 +1,53 @@
# CI Autopilot Audit-Fix Report

Date: 2026-06-10
Source: gsd-audit-fix (fresh repository audit; no existing UAT phase artifacts)
Arguments: `--severity all --max 8`
Scope: worker/runtime correctness, security boundaries, CI, tests, packaging, reliability, documentation

## Classification

| ID | Finding | Severity | Classification | Result |
|---|---|---|---|---|
| F-01 | Persistent self-hosted checkout retained prior-job files | high | auto-fixable | Fixed; clean checkout moved before log creation |
| F-02 | External GitHub Actions used mutable tags | high | auto-fixable | Fixed; pinned to immutable commits |
| F-03 | Worker accepted malformed repo/API inputs and had no boundary tests | high | auto-fixable | Fixed; validation, error wrapping, response checks, tests |
| F-04 | CI did not execute tests and local tests left generated artifacts | medium | auto-fixable | Fixed; discovery in CI and ignore rules |
| F-05 | Bootstrap installed Python 3.11 while CI/runtime require 3.12 | medium | auto-fixable | Fixed |
| F-06 | Runner health monitor was manual-only despite 15-minute runbook claim | medium | auto-fixable | Fixed; schedule added |
| F-07 | Runner-health email recipient was hard-coded personal data | medium | auto-fixable | Fixed; `EMAIL_TO` secret |
| F-08 | Docs claimed autonomous dispatch not present in implementation | high | auto-fixable | Fixed; read-only contract documented |
| F-09 | Failure-intake shell interpolates `workflow_run` fields directly into Bash | high | auto-fixable | Not attempted; `--max 8` reached |
| F-10 | Two failure-intake workflows overlap with inconsistent deduplication and labels | medium | manual-only | Choose and migrate to one canonical intake contract |
| F-11 | Guarded autonomous repair dispatcher and queue state machine are absent | high | manual-only | Requires trust policy, sandbox, allowlists, PR-only output, authorization |
| F-12 | Persistent runner host hardening and isolation controls are not implemented as code | high | manual-only | Requires deployment architecture and host policy |
| F-13 | Runner identity is hard-coded as `MyLocalPC` | medium | auto-fixable | Not attempted; `--max 8` reached |
| F-14 | Runner-health authentication depends on a manually scoped PAT | medium | manual-only | Prefer GitHub App or centrally governed fine-grained identity |
| F-15 | Scheduled fixer has no concurrency guard | medium | auto-fixable | Not attempted; `--max 8` reached |
| F-16 | Queue inventory is limited to the first API page and displays only five items | low | manual-only | Define inventory/processing pagination contract first |
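
To keep F-02 from regressing, a check along the following lines could flag `uses:` references that are not pinned to a full commit SHA. This is a sketch under stated assumptions, not part of the audited change: the function name `unpinned_actions` is hypothetical, the scan is line-based rather than a real YAML parse, and quoted references are not handled.

```python
import re

# Matches a workflow step line such as "- uses: owner/repo@ref  # comment".
_USES = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s#]+)")
# A reference counts as pinned only when the ref is a full 40-hex commit SHA.
_SHA = re.compile(r"[0-9a-f]{40}")


def unpinned_actions(workflow_text: str) -> list[str]:
    """Return external action references that are not pinned to a commit SHA."""
    found = []
    for line in workflow_text.splitlines():
        match = _USES.match(line)
        if not match:
            continue
        ref = match.group(1)
        if ref.startswith("./"):
            continue  # local actions live in the repository itself
        _, _, version = ref.partition("@")
        if not _SHA.fullmatch(version):
            found.append(ref)
    return found
```

Running it over each file in `.github/workflows/` and failing when the returned list is non-empty would turn the pinning rule into a repeatable check.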

## Fixed Commits

- F-01: `ae2f052`, `dd4cc15`
- F-02: `b622adf`
- F-03: `364f4a2`
- F-04: `53ae712`, `9385ee0`
- F-05: `8a5c893`
- F-06: `aae07d2`
- F-07: `4da6055`
- F-08: `21b8065`, `b129f3a`

## Verification

- `python -m unittest discover -v`: 6 passed
- `python -m py_compile agent/poll_once.py`: passed
- `python -m compileall -q agent tests`: passed
- `git diff --check`: passed
- Existing PR: https://github.com/Coding-Autopilot-System/ci-autopilot/pull/1965

## Manual Gate

Do not grant worker write permissions or execute issue content until F-11 and F-12 have approved designs and testable controls.

## Delivery Status

Local branch tip is ready, but HTTPS push is blocked because the configured OAuth credential lacks the GitHub workflow scope. SSH authentication is not configured. PR #1965 therefore remains at its prior remote tip until credentials are refreshed.
2 changes: 1 addition & 1 deletion Machinesetup.ps1
@@ -17,7 +17,7 @@ Write-Host ("winget: " + ($(if($hasWinget){"OK"}else{"MISSING"})))
# Tools we want
$want = @(
@{name="git"; check="git"; winget="Git.Git"},
@{name="python"; check="python"; winget="Python.Python.3.11"},
@{name="python"; check="python"; winget="Python.Python.3.12"},
@{name="node"; check="node"; winget="OpenJS.NodeJS.LTS"},
@{name="npm"; check="npm"; winget="OpenJS.NodeJS.LTS"},
@{name="gh"; check="gh"; winget="GitHub.cli"}
30 changes: 16 additions & 14 deletions README.md
@@ -1,21 +1,21 @@
# ci-autopilot

AI-powered CI autopilot worker/runtime - detects GitHub Actions failures and dispatches repairs via a Python agent
CI autopilot worker/runtime - detects GitHub Actions failures and inventories queued repair issues via a Python agent

[![CI](https://github.com/Coding-Autopilot-System/ci-autopilot/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/Coding-Autopilot-System/ci-autopilot/actions/workflows/ci.yml)
[![Python 3.12](https://img.shields.io/badge/python-3.12-blue.svg)](https://www.python.org/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)

## Overview

`ci-autopilot` packages the worker/runtime side of the platform. It monitors GitHub Actions workflows, detects failures, triages them via an issue queue, and dispatches autonomous repairs using a Python agent backed by Codex. It runs on a self-hosted runner and provides the execution path from queued repair task to proposed fix.
`ci-autopilot` packages the worker/runtime side of the platform. It monitors GitHub Actions workflows, detects failures, and exposes them through an issue queue. The current Python worker is deliberately read-only: it inventories queued issues on a self-hosted runner. Autonomous repair dispatch and queue state transitions are not implemented yet.

The agent (`agent/poll_once.py`) is a Python 3.12 stdlib-only program that polls the issue queue, picks up queued repair tasks, and dispatches them to the Codex repair pipeline. No external dependencies are required.
The agent (`agent/poll_once.py`) is a Python 3.12 stdlib-only program that polls the issue queue and lists queued repair tasks for operator visibility. No external dependencies are required.

## Repo boundary

- `autopilot-core` is the control plane for org-level scheduling, rollout, and PR governance.
- `ci-autopilot` is the worker/runtime implementation for runner execution, queue polling, and repair dispatch.
- `ci-autopilot` is the worker/runtime implementation for runner execution and read-only queue polling.
- `autopilot-demo` is the demonstration target used to show the runtime and control plane working together.

## Architecture
@@ -25,34 +25,36 @@ flowchart LR
A["GitHub Actions\nfailure detected"] --> B["autopilot-failure-intake\n(intake workflow)"]
B --> C["Issue queue\n(runner-offline label)"]
C --> D["agent/poll_once.py\n(Python 3.12)"]
D --> E["Codex\n(repair dispatch)"]
E --> F["PR / fix\ncommitted"]
D --> E["Operator visibility\n(read-only inventory)"]
E -. "future guarded dispatcher" .-> F["PR-only repair path"]
```

**Core components:**

- **autopilot-failure-intake.yml** - Intake workflow triggered on `workflow_run` failure events; creates a queued issue
- **autopilot-create-issue.yml** - Creates GitHub issues via `actions/github-script` when monitored workflows fail
- **fixer.yml** - Main CI autopilot; runs `agent/poll_once.py` on the self-hosted Windows runner
- **agent/poll_once.py** - Python 3.12 stdlib agent; polls the issue queue and dispatches repairs
- **fixer.yml** - Runs the read-only `agent/poll_once.py` queue inventory on the self-hosted Windows runner
- **agent/poll_once.py** - Python 3.12 stdlib agent; validates repository input and inventories the issue queue
- **runner-smoke-test.yml** - Smoke tests the self-hosted runner on demand
- **runner-health.yml** - Manual runner health check (dispatch only)
- **runner-health.yml** - Scheduled and on-demand runner health check

## Enterprise proof points

- Deliberately small runtime surface: Python 3.12 stdlib-only agent for easier audit and rebuild.
- Clear separation of concerns: issue intake and governance stay in the control plane; repair execution stays on the worker.
- Clear separation of concerns: issue intake and governance stay in the control plane; worker execution and future guarded repair dispatch stay on the worker boundary.
- Self-hosted runner model supports enterprise network boundaries, managed toolchains, and least-privilege token handling.
- Queue-driven processing creates an auditable handoff between CI failure detection and agent action.
- Queue-driven intake creates an auditable handoff between CI failure detection and operator review.

## Quick Start

```bash
# Prerequisites: Python 3.12, GitHub CLI, GH_TOKEN env var
export GH_TOKEN=<your_token>
```pwsh
# Prerequisites: Python 3.12 and authenticated GitHub CLI
$env:GH_TOKEN = gh auth token
python -m agent.poll_once
```

The worker only reads and lists queued issues. It does not execute issue content, mutate repositories, or dispatch Codex.
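
As a rough illustration of what "read-only inventory" means here, the sketch below mirrors the filtering step in `agent/poll_once.py`: it takes an already-fetched issues response, drops pull requests and malformed entries, and prints at most five items. The helper name `queued_issues` and the sample payload are hypothetical.

```python
from typing import Any


def queued_issues(payload: Any) -> list[dict]:
    """Keep only real issues from an issues-list response (no writes involved)."""
    if payload is None:
        return []
    if not isinstance(payload, list):
        raise RuntimeError("GitHub API returned an unexpected issues response.")
    # The issues endpoint also returns pull requests; those entries carry a
    # "pull_request" key and are skipped, as are non-dict items.
    return [it for it in payload if isinstance(it, dict) and "pull_request" not in it]


# Made-up sample data standing in for an API response.
sample = [
    {"number": 12, "title": "CI failed: build"},
    {"number": 13, "title": "Bump deps", "pull_request": {}},
    "garbage",
]
for issue in queued_issues(sample)[:5]:
    print(f"#{issue['number']} {issue['title']}")
```

Nothing in this path posts comments, edits labels, or runs issue content, which is the contract the README describes.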

For full runner registration, service setup, and local development instructions see the [Setup Guide](https://github.com/Coding-Autopilot-System/ci-autopilot/wiki/Setup-Guide) wiki page.

## Runbook path
60 changes: 48 additions & 12 deletions agent/poll_once.py
@@ -1,11 +1,16 @@
import json
import os
import re
import subprocess
import sys
import urllib.error
import urllib.parse
import urllib.request
from typing import Any

_REPO_SEGMENT = re.compile(r"[A-Za-z0-9_.-]+")


def _run(cmd: list[str], timeout: int = 20) -> str:
try:
proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
@@ -25,10 +30,12 @@ def _run(cmd: list[str], timeout: int = 20) -> str:
raise RuntimeError(f"Command failed: {' '.join(cmd)}\n{detail_text}")
return proc.stdout.strip()


def _github_token() -> str | None:
token = os.getenv("GITHUB_TOKEN") or os.getenv("GH_TOKEN")
return token.strip() if token else None


def _gh_api_http(path: str, fields: dict[str, str] | None = None, timeout: int = 20) -> Any:
token = _github_token()
if not token:
@@ -45,7 +52,13 @@ def _gh_api_http(path: str, fields: dict[str, str] | None = None, timeout: int =
except urllib.error.HTTPError as exc:
err_body = exc.read().decode("utf-8") if exc.fp else ""
raise RuntimeError(f"GitHub API error {exc.code}: {err_body}") from exc
return json.loads(body) if body else None
except urllib.error.URLError as exc:
raise RuntimeError(f"GitHub API request failed: {exc.reason}") from exc
try:
return json.loads(body) if body else None
except json.JSONDecodeError as exc:
raise RuntimeError("GitHub API returned invalid JSON.") from exc


def _gh_api_json(path: str, fields: dict[str, str] | None = None, timeout: int = 20) -> Any:
token = _github_token()
@@ -58,21 +71,39 @@ def _gh_api_json(path: str, fields: dict[str, str] | None = None, timeout: int =
for key, value in fields.items():
cmd.extend(["-f", f"{key}={value}"])
output = _run(cmd, timeout=timeout)
return json.loads(output) if output else None
try:
return json.loads(output) if output else None
except json.JSONDecodeError as exc:
raise RuntimeError("gh CLI returned invalid JSON.") from exc


def _validate_repo_segment(value: str, name: str) -> str:
value = value.strip()
if not value or not _REPO_SEGMENT.fullmatch(value):
raise RuntimeError(f"Invalid GitHub {name}: {value!r}")
return value


def _repo_from_env() -> tuple[str, str]:
repo_full = os.getenv("GITHUB_REPOSITORY", "").strip()
if repo_full and "/" in repo_full:
if repo_full:
if repo_full.count("/") != 1:
raise RuntimeError(f"Invalid GitHub repository: {repo_full!r}")
owner, repo = repo_full.split("/", 1)
return owner, repo
owner = os.getenv("GITHUB_OWNER", "Coding-Autopilot-System").strip()
repo = os.getenv("GITHUB_REPO", "ci-autopilot").strip()
return owner, repo
else:
owner = os.getenv("GITHUB_OWNER", "Coding-Autopilot-System")
repo = os.getenv("GITHUB_REPO", "ci-autopilot")
return _validate_repo_segment(owner, "owner"), _validate_repo_segment(repo, "repository")


def main() -> int:
owner, repo = _repo_from_env()
try:
owner, repo = _repo_from_env()
except RuntimeError as exc:
print(f"ERROR: {exc}")
return 1
print(f"CI Autopilot poll_once starting for {owner}/{repo}")
print("Listing queued issues via gh api...")
print("Listing queued issues via GitHub API...")
try:
issues = _gh_api_json(
f"/repos/{owner}/{repo}/issues",
@@ -82,8 +113,12 @@ def main() -> int:
print(f"ERROR: {exc}")
return 1

issues = issues or []
queued = [it for it in issues if "pull_request" not in it]
if issues is None:
issues = []
if not isinstance(issues, list):
print("ERROR: GitHub API returned an unexpected issues response.")
return 1
queued = [it for it in issues if isinstance(it, dict) and "pull_request" not in it]
print(f"Found {len(queued)} queued issues")
for it in queued[:5]:
num = it.get("number")
@@ -95,5 +130,6 @@ def main() -> int:
print("poll_once complete")
return 0


if __name__ == "__main__":
raise SystemExit(main())
raise SystemExit(main())
6 changes: 3 additions & 3 deletions docs/architecture.md
@@ -1,19 +1,19 @@
# Architecture

## Overview
CI Autopilot is a runner-hosted automation layer that coordinates Codex-driven workflows, issue triage, and safe remediation.
CI Autopilot is a runner-hosted automation layer that coordinates failure intake, queued-issue inventory, and operator visibility. Guarded remediation dispatch is a future capability.

## Core components
- Runner host: Windows service running GitHub Actions self-hosted runner
- Workflow layer: GitHub Actions workflows that trigger repair or hygiene pipelines
- Agent runtime: Python agent that executes tasks and reports status
- Agent runtime: read-only Python agent that validates repository context and inventories queued issues
- Control plane: Repository configuration and issue workflow conventions
- Logs and artifacts: Local logs and GitHub Action run outputs

## Data flow (high level)
1) Event triggers a workflow (dispatch, schedule, or issue activity)
2) Workflow schedules a job to the self-hosted runner
3) Runner executes the agent or scripts in a controlled workspace
3) Runner executes the read-only queue poller in a cleaned workspace
4) Artifacts and logs are uploaded for audit and review

## Design goals