[DevOps][Advanced] Generated playwright-report/index.html (529 KB) is committed to the repo — gitignore and CI artifact guard missing #264

@MehtabSandhu11

Summary

frontend/playwright-report/index.html (518 KB) is currently tracked and committed in
the repository. A check-artifacts.sh script and a CI workflow exist to prevent exactly
this, but both have a structural blind spot that allowed this to slip in and prevents it
from being caught going forward.

Evidence

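# File is present in the checkout (ls shows existence and size only;
# `git ls-files frontend/playwright-report/index.html` confirms it is tracked)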
$ ls -lh frontend/playwright-report/index.html
-rw-r--r--  518K  frontend/playwright-report/index.html

# The path is NOT in .gitignore
$ grep "playwright-report" .gitignore
(no output)

Why the Existing Guard Didn't Catch It

.github/workflows/check-artifacts.yml runs scripts/check-artifacts.sh on pull
requests. The script works by diffing changed files against the base branch:

CHANGED_FILES=$(git diff --name-only "${BASE_BRANCH}"...HEAD)

This means it only inspects files that a PR adds or modifies, not files already committed to
main. Once a generated artifact lands in the base branch, as has happened here, it
is invisible to the guard unless a later PR happens to touch it. Every other PR passes the
check cleanly, even though the 518 KB blob is tracked on main and is checked out by every
contributor.
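
The blind spot can be reproduced in a throwaway repository. The sketch below is illustrative only: the branch name `feature`, the helper `g`, and the file contents are assumptions made for the demo, and it does not run the project's real check-artifacts.sh.

```shell
set -eu

# Throwaway repository; nothing here touches the real project.
demo=$(mktemp -d)
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the unborn branch "main"

# Hypothetical helper: run git with a fixed identity so commits succeed anywhere.
g() { git -c user.name=demo -c user.email=demo@example.com -c commit.gpgsign=false "$@"; }

# 1. A generated report lands on the base branch.
mkdir -p frontend/playwright-report
echo "<html>report</html>" > frontend/playwright-report/index.html
g add frontend/playwright-report/index.html
g commit -q -m "accidentally commit report"

# 2. A later PR branch changes an unrelated file only.
g checkout -q -b feature
echo "docs" > README.md
g add README.md
g commit -q -m "unrelated change"

# What the diff-based guard sees: only the files this branch changed.
CHANGED_FILES=$(git diff --name-only main...HEAD)
# What is actually tracked under the blocked path.
TRACKED_FILES=$(git ls-files frontend/playwright-report/)

echo "diff sees: $CHANGED_FILES"
echo "tracked:   $TRACKED_FILES"
```

The diff lists only README.md, while `git ls-files` still reports the committed report, which is the gap described above.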

The second gap is the missing .gitignore entry. Without it, nothing stops a contributor
from staging frontend/playwright-report/ locally and including it in a commit. The CI
guard only fires after the PR is opened, not at git add time.

CONTRIBUTING.md is explicit on both counts:

Never commit these auto-generated paths:
frontend/playwright-report/
frontend/test-results/

If CI fails, run: git rm --cached <file>

Impact

  1. Silent repo bloat — 518 KB of generated HTML is added to every fresh clone and
    re-added each time a contributor accidentally stages a new report run.
  2. Guard gives false confidence — CI reports "All clear!" on every PR while the
    offending file sits committed on main.
  3. Potential data leak — Playwright HTML reports embed full request/response traces.
    A report generated against a real scan target and accidentally committed could expose
    sensitive scan output in public git history.

Proposed Fix

Step 1 — Remove the committed file:

git rm --cached frontend/playwright-report/index.html
git commit -m "chore: untrack committed playwright report artifact"
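
As a sanity check on what Step 1 does, the following sketch applies the same command in a temporary repository and confirms that the file is untracked afterwards but still present on disk. The helper `g` and the demo repository are assumptions for illustration, not part of the project.

```shell
set -eu

demo=$(mktemp -d)
cd "$demo"
git init -q

# Hypothetical helper: run git with a fixed identity so commits succeed anywhere.
g() { git -c user.name=demo -c user.email=demo@example.com -c commit.gpgsign=false "$@"; }

mkdir -p frontend/playwright-report
echo "<html>report</html>" > frontend/playwright-report/index.html
g add frontend/playwright-report/index.html
g commit -q -m "accidentally commit report"

# Step 1, applied to the throwaway repo.
g rm -q --cached frontend/playwright-report/index.html
g commit -q -m "chore: untrack committed playwright report artifact"

# --error-unmatch makes ls-files exit non-zero for a path that is not tracked.
if git ls-files --error-unmatch frontend/playwright-report/index.html >/dev/null 2>&1; then
  STILL_TRACKED=yes
else
  STILL_TRACKED=no
fi

# --cached removes the index entry only; the working copy is left alone.
if [ -f frontend/playwright-report/index.html ]; then ON_DISK=yes; else ON_DISK=no; fi

echo "tracked: $STILL_TRACKED, on disk: $ON_DISK"
```

Note that `git rm --cached` only affects the index and future commits; earlier commits that contain the report remain reachable in history.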

Step 2 — Add the missing .gitignore entries so git add rejects these locally
before they ever reach a PR:

# Frontend test artifacts (see CONTRIBUTING.md)
frontend/playwright-report/
frontend/test-results/
frontend/.vite/
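
One way to confirm the entries behave as intended is `git check-ignore`, which exits 0 when a path matches an ignore rule. The sketch below runs in a temporary repository; `frontend/src/app.ts` is a made-up source file used only as a control.

```shell
set -eu

demo=$(mktemp -d)
cd "$demo"
git init -q

# The three entries proposed in Step 2.
printf '%s\n' \
  'frontend/playwright-report/' \
  'frontend/test-results/' \
  'frontend/.vite/' > .gitignore

mkdir -p frontend/playwright-report frontend/src
echo "<html>report</html>" > frontend/playwright-report/index.html
echo "export {}" > frontend/src/app.ts   # hypothetical source file, used as a control

if git check-ignore -q frontend/playwright-report/index.html; then
  REPORT_IGNORED=yes
else
  REPORT_IGNORED=no
fi
if git check-ignore -q frontend/src/app.ts; then
  SOURCE_IGNORED=yes
else
  SOURCE_IGNORED=no
fi

# A blanket add stages everything except ignored paths.
git add .
STAGED_REPORTS=$(git ls-files frontend/playwright-report/)
STAGED_SOURCES=$(git ls-files frontend/src/)

echo "report ignored: $REPORT_IGNORED, source ignored: $SOURCE_IGNORED"
echo "staged sources: $STAGED_SOURCES"
```

Ignore rules apply only to untracked files, so they have no effect on a report that is already committed; that is why Step 1 has to happen as well. A contributor can also still bypass the rules with `git add -f`, which is what the CI guard is for.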

Step 3 — Fix check-artifacts.sh to also check already-tracked files, not just PR
diffs. Add this block before the diff check:

echo "Checking for tracked generated artifacts..."
TRACKED_FOUND=()
for pattern in "${BLOCKED_PATTERNS[@]}"; do
  # Root the pattern at frontend/ (whether or not it already has the prefix),
  # then collect every file under it that git currently tracks.
  while IFS= read -r match; do
    TRACKED_FOUND+=("$match")
  done < <(git ls-files -- "frontend/${pattern#frontend/}" 2>/dev/null || true)
done

# Fail if any blocked path is tracked, whether or not this PR touched it.
if [[ ${#TRACKED_FOUND[@]} -gt 0 ]]; then
  echo "ERROR: Generated artifacts are tracked by git:"
  for f in "${TRACKED_FOUND[@]}"; do echo "  - $f"; done
  echo "Fix: git rm --cached <file> && add to .gitignore"
  exit 1
fi

This makes the guard catch both newly added files (existing diff check) and files that are
already tracked on the branch (new tracked-files check), closing the blind spot for every
path listed in BLOCKED_PATTERNS.
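
The tracked-files check can be exercised on its own before wiring it into CI. This is a minimal sketch under assumptions: `list_tracked_artifacts` is a hypothetical helper that takes the blocked paths as arguments (the real script would pass its own BLOCKED_PATTERNS array, whose contents are not shown in this issue), and the two paths used are the ones named in CONTRIBUTING.md.

```shell
set -eu

# Hypothetical helper: print every tracked file under the given paths, one per line.
list_tracked_artifacts() {
  for pattern in "$@"; do
    git ls-files -- "$pattern"
  done
}

demo=$(mktemp -d)
cd "$demo"
git init -q

# Hypothetical helper: run git with a fixed identity so commits succeed anywhere.
g() { git -c user.name=demo -c user.email=demo@example.com -c commit.gpgsign=false "$@"; }

mkdir -p frontend/playwright-report
echo "<html>report</html>" > frontend/playwright-report/index.html
g add frontend/playwright-report/index.html
g commit -q -m "accidentally commit report"

# With the report committed, the check should report it.
BEFORE=$(list_tracked_artifacts frontend/playwright-report/ frontend/test-results/)

# After the Step 1 fix, the check should come back empty.
g rm -q --cached frontend/playwright-report/index.html
g commit -q -m "chore: untrack committed playwright report artifact"
AFTER=$(list_tracked_artifacts frontend/playwright-report/ frontend/test-results/)

echo "before fix: ${BEFORE:-(none)}"
echo "after fix:  ${AFTER:-(none)}"
```

Because `git ls-files` reads the index rather than a diff, the result does not depend on which files the current PR happened to change.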

Severity

Advanced — the existing CI safety net has a structural gap that silently permits exactly
the class of commits it was designed to block, and the evidence of that gap is already
committed to main.

Please assign this to me under GSSoC