fix: enforce mandatory API key in production (fail-closed) by SHAURYASANYAL3 · Pull Request #104 · sreerevanth/AgentWatch

SHAURYASANYAL3 · 2026-05-30T11:10:13Z

Hardens API authentication to ensure a fail-closed posture in production environments when AGENTWATCH_API_KEY is missing. This fix ensures that production deployments refuse unauthorized access by default if misconfigured. Closes #103

Summary by CodeRabbit

Bug Fixes
- Production now enforces a configured API key: protected requests are rejected if the key is missing or incorrect, and the application will refuse to start in misconfigured production deployments.
Documentation
- Clarified env examples and getting-started docs: default environment is development and production uses fail-closed authentication behavior.
Chores
- CI workflow permissions updated to allow workflow steps to create/update PR report comments.

Closes sreerevanth#103

coderabbitai · 2026-05-30T11:10:23Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

This PR implements fail-closed API key authentication for production deployments, adding environment detection and startup/runtime guards in the API server, updating docs and .env.example, and granting issues: write to the test workflow.

Changes

Fail-Closed Authentication in Production

Layer / File(s)	Summary
Fail-closed API authentication enforcement `agentwatch/api/server.py`	Environment detection identifies production mode. The `_require_api_key` dependency rejects protected requests with HTTP 500 when production is detected without `AGENTWATCH_API_KEY`. The application lifespan handler logs a critical message and raises `RuntimeError` on startup when this misconfiguration occurs.
Getting started security documentation `docs/getting-started.md`	Updated the default `AGENTWATCH_ENV` to `development`, added a "Security & Authentication" section documenting fail-closed production behavior and the `500 Server Misconfiguration` response, and removed the local "Optional PagerDuty Webhook" subsection from that location.
Environment variable example `.env.example`	Comment updated to clarify `AGENTWATCH_API_KEY` is mandatory in production (`AGENTWATCH_ENV=production`) and that requests are unauthenticated when the key is empty in development.
CI workflow permissions update `.github/workflows/test-on-pr.yml`	Added `issues: write` to workflow permissions so the Actions step can create/update PR comments for test/reporting.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant FastAPI_App as FastAPI App
  participant AuthDep as _require_api_key
  participant Env as Environment (AGENTWATCH_ENV / AGENTWATCH_API_KEY)
  Client->>FastAPI_App: request to protected endpoint (X-Api-Key header)
  FastAPI_App->>AuthDep: invoke dependency
  AuthDep->>Env: read AGENTWATCH_ENV / AGENTWATCH_API_KEY
  alt production && API key missing
    AuthDep-->>FastAPI_App: raise HTTP 500 (Server misconfiguration)
  else API key configured
    AuthDep-->>FastAPI_App: validate header -> 401 or allow
  end
  note over FastAPI_App,Env: On startup lifespan checks Env and raises RuntimeError if production and key missing

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

sreerevanth/AgentWatch#65: Prior implementation of the _require_api_key guard; this PR hardens it with fail-closed production enforcement.
sreerevanth/AgentWatch#71: Related updates to .env.example and getting-started docs around AGENTWATCH_API_KEY/AGENTWATCH_ENV.

Suggested labels

security, bug, level: advanced, level3, documentation

Poem

🐰 I hop the code and hold the key,
In production mode, no gate swings free.
Startup checks and auths hold tight,
Docs updated for guiding light.
A rabbit's nod — secure tonight.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main objective: enforcing mandatory API key in production with fail-closed behavior, which is precisely what the changeset implements.
Linked Issues check	✅ Passed	The PR implements all core objectives from issue `#103`: detects production via AGENTWATCH_ENV/ENVIRONMENT, prevents startup when API key is missing in production, returns 500 on misconfiguration and 401 on invalid keys, preserves development usability, and updates documentation accordingly.
Out of Scope Changes check	✅ Passed	All changes directly support the fail-closed authentication objective: server.py enforces the policy, .env.example and docs clarify the behavior, and the workflow fix addresses CI permission issues needed to support the PR itself.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

…maintaining production fail-closed requirement

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@agentwatch/api/server.py`:
- Around line 355-360: When running in production, do not merely log a critical
error; fail startup when AGENTWATCH_API_KEY is missing by raising an exception
or exiting the process in the check that uses _IS_PROD and _API_KEY. Modify the
block referencing _IS_PROD, _API_KEY and logger (in server.py) so that after
logging the critical message you raise a RuntimeError (or call sys.exit(1)) with
a clear message like "AGENTWATCH_API_KEY is required in production" to stop the
server from starting and ensure the _require_api_key dependency is not relied on
to surface the misconfiguration later.
- Around line 175-176: The code currently defaults _ENV to "production" when
AGENTWATCH_ENV/ENVIRONMENT is unset which forces prod behavior; change the
default to a non-production value (e.g., "development" or empty string) when
computing _ENV and keep _IS_PROD = _ENV.lower() == "production" so
production-only behavior only occurs if the env var explicitly equals
"production"; update the _ENV/_IS_PROD initialization (references: _ENV and
_IS_PROD) accordingly to avoid implicit production defaults.

In `@docs/getting-started.md`:
- Around line 214-221: The docs show a fail-closed production rule but the
earlier defaults snippet leaves AGENTWATCH_ENV=production without an API key,
causing protected endpoints to 500; update the earlier defaults snippet so it
either sets AGENTWATCH_ENV=development or provides an explicit
AGENTWATCH_API_KEY value (or a placeholder like YOUR_DEV_KEY) to avoid
misconfigured production behavior; ensure the change references the same
environment variables AGENTWATCH_ENV and AGENTWATCH_API_KEY so readers see
consistent quickstart defaults and the Security & Authentication section aligns.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 17666e8d-8218-4f31-b716-f8ac7a589ede

📥 Commits

Reviewing files that changed from the base of the PR and between 55545f0 and 0bfdb39.

📒 Files selected for processing (3)

.env.example
agentwatch/api/server.py
docs/getting-started.md

…API key and ensure docs consistency

SHAURYASANYAL3 · 2026-05-30T11:29:35Z

@sreerevanth
I noticed the workflow is failing with:

403 Resource not accessible by integration

It looks like the tests run, but the action fails when trying to post the PR summary comment. Since this PR comes from a fork, I think the GITHUB_TOKEN may not have issue/comment write permissions. Could you check if the workflow needs special handling for fork-based PRs?

Adds issues: write permission required to post comments on pull requests.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

.github/workflows/test-on-pr.yml (1)
18-19: 💤 Low value

Consider disabling credential persistence.

Static analysis flags that persist-credentials defaults to true, which stores the Git token in .git/config. While this workflow doesn't upload artifacts, explicitly setting persist-credentials: false is a defense-in-depth measure if artifacts are added later.
🔒 Optional hardening
       - name: Checkout
         uses: actions/checkout@v4
+        with:
+          persist-credentials: false
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/test-on-pr.yml around lines 18 - 19, Update the Checkout
step that uses actions/checkout@v4 to explicitly disable credential persistence
by adding persist-credentials: false to the step configuration (the step with
name: Checkout and uses: actions/checkout@v4) so the GitHub token is not stored
in .git/config as a defense-in-depth measure.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/test-on-pr.yml:
- Around line 7-11: The workflow uses on: pull_request and requests issues:
write / pull-requests: write but GitHub makes GITHUB_TOKEN read-only for forked
PRs, causing actions/github-script steps that call
issues.createComment/updateComment to 403; fix by either enabling the repo
setting "Send write tokens to workflows from fork pull requests", or change the
workflow trigger to pull_request_target and ensure you do NOT check out or run
untrusted code from the forked head (keep any checkout/use of PR code guarded),
or restrict the workflow to same-repo PRs only (keep on: pull_request but accept
write actions will fail for forks). Update the permissions block and any
documentation/comments in the workflow accordingly and ensure the
actions/github-script step references are adjusted to the chosen approach.

---

Nitpick comments:
In @.github/workflows/test-on-pr.yml:
- Around line 18-19: Update the Checkout step that uses actions/checkout@v4 to
explicitly disable credential persistence by adding persist-credentials: false
to the step configuration (the step with name: Checkout and uses:
actions/checkout@v4) so the GitHub token is not stored in .git/config as a
defense-in-depth measure.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: e14aac9c-602f-4d62-bb3f-4e92b8ba6e00

📥 Commits

Reviewing files that changed from the base of the PR and between 9a6cce0 and 3044a99.

📒 Files selected for processing (1)

.github/workflows/test-on-pr.yml

This commit switches the PR test workflow to pull_request_target to allow status comments on forked PRs, while adding a security guard to prevent untrusted code from accessing repo secrets. It also includes general style and lint cleanups across the test suite.

github-actions · 2026-05-31T09:33:23Z

🧪 PR Test Results

Check	Result
Tests (`pytest tests/`)	✅ success
Lint (`ruff check .`)	✅ success
Coverage (`agentwatch`)	71.62%

_{Python 3.12 · commit 7ab41b4}

This commit applies missing # noqa suppressions to test files and fixes import sorting/formatting in tests/test_models.py, ensuring the CI lint check passes.

sreerevanth

Thanks for the contribution.

This improves the default security posture by enforcing fail-closed behavior in production while preserving a smooth local development experience. The workflow hardening and documentation updates are also appreciated.

Approved.

fix: enforce mandatory API key in production (fail-closed)

0bfdb39

Closes sreerevanth#103

fix: change default environment to development to fix CI tests while …

55964db

…maintaining production fail-closed requirement

coderabbitai Bot reviewed May 30, 2026

View reviewed changes

Comment thread agentwatch/api/server.py Outdated

Comment thread agentwatch/api/server.py Outdated

Comment thread docs/getting-started.md

SHAURYASANYAL3 added 2 commits May 30, 2026 16:49

fix: address PR review feedback - fail startup in production without …

a28ca3d

…API key and ensure docs consistency

style: fix ruff formatting in server.py

9a6cce0

SHAURYASANYAL3 added 2 commits May 30, 2026 19:39

Merge origin/main to bring in CI fixes

5dc77ec

ci: fix github actions permission for posting PR comments

3044a99

Adds issues: write permission required to post comments on pull requests.

coderabbitai Bot reviewed May 30, 2026

View reviewed changes

Comment thread .github/workflows/test-on-pr.yml

SHAURYASANYAL3 and others added 4 commits May 31, 2026 15:04

Merge branch 'main' into fix/insecure-default-auth

f5dc8a1

chore: resolve remaining linting and formatting issues

a14b46f

This commit applies missing # noqa suppressions to test files and fixes import sorting/formatting in tests/test_models.py, ensuring the CI lint check passes.

chore: resolve merge conflict and fix lint

11b0607

chore: final lint and formatting fixes

d4acd91

sreerevanth approved these changes May 31, 2026

View reviewed changes

sreerevanth merged commit 22bfa2c into sreerevanth:main May 31, 2026
8 checks passed

Conversation

SHAURYASANYAL3 commented May 30, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested labels

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

SHAURYASANYAL3 commented May 30, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

github-actions Bot commented May 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🧪 PR Test Results

Uh oh!

sreerevanth left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

SHAURYASANYAL3 commented May 30, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 30, 2026 •

edited

Loading

github-actions Bot commented May 31, 2026 •

edited

Loading