@DavertMik DavertMik commented Feb 6, 2026

User description

Summary

  • Add fast heuristic check to detect error pages (404, 500, 502, 503, 403) before performing full AI research
  • Check page title, h1, h2 for error patterns using regex
  • Detect empty or very small pages (< 500 chars body content)
  • Add AI prompt instruction as fallback to catch custom error pages semantically
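
The heuristic in the summary can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the `PageSnapshot` shape and the names `ERROR_CODE_PATTERN`, `MIN_BODY_LENGTH`, and `looksLikeErrorPage` are assumptions; only the status codes, the title/h1/h2 fields, and the 500-character threshold come from the description.

```typescript
// Hypothetical snapshot of the page parts the heuristic inspects.
interface PageSnapshot {
  title: string;
  h1: string;
  h2: string;
  bodyHtml: string;
}

// Status codes listed in the summary. No global flag, so test() stays stateless.
const ERROR_CODE_PATTERN = /\b(404|500|502|503|403)\b/;

// Threshold from the summary: bodies under 500 characters count as "very small".
const MIN_BODY_LENGTH = 500;

function looksLikeErrorPage(page: PageSnapshot): boolean {
  const headings = [page.title, page.h1, page.h2];
  // An error code in the title, h1, or h2 is enough to flag the page.
  if (headings.some((text) => ERROR_CODE_PATTERN.test(text))) {
    return true;
  }
  // Empty and very small bodies are both covered by the length check.
  return page.bodyHtml.trim().length < MIN_BODY_LENGTH;
}
```

A bare code match like this also flags titles such as "Room 404"; the later commits in this PR tighten the matching for that reason.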

Test plan

  • Test with pages containing "404" in title → should return early with error message
  • Test with pages containing "500" in h1 → should return early
  • Test with empty body pages → should return early
  • Test with very small pages (<500 chars) → should return early
  • Test with valid pages → should proceed to normal research
  • Test AI detection of custom error pages (heuristic passes, AI catches it)

🤖 Generated with Claude Code


CodeAnt-AI Description

Detect error pages and stop research when a page is an error

What Changed

  • Researcher now checks the loaded page for common error signals (404/500/502/503/403 keywords in title/h1/h2, empty or very small body) and immediately stops research when an error page is detected
  • When an error is detected, the researcher returns a short standardized "Error Page Detected" message that includes the error type, URL, and page title instead of running the normal research flow
  • The AI research prompt now instructs the model to recognize custom error pages and respond only with the standardized error output when applicable

Impact

✅ Fewer wasted AI calls on error pages
✅ Faster response when navigating to broken or empty pages
✅ Clearer feedback that research was skipped due to an error page

Skip full AI-powered research when navigating to error pages by detecting
them early. Uses a two-layer approach:

1. Fast heuristic check (zero cost):
   - Checks title, h1, h2 for error patterns (404, 500, 502, 503, 403)
   - Detects empty body HTML
   - Catches very small pages (< 500 chars in body)

2. AI prompt instruction as fallback:
   - Instructs AI to detect custom error pages semantically
   - Returns standardized error format instead of normal research

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
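
The two-layer flow described in this commit message might look something like the sketch below. All names here (`research`, `Heuristic`, `ERROR_HEADER`, `ERROR_INSTRUCTION`) and the prompt wording are hypothetical; the model call is injected as a plain function so the example stays self-contained, and the returned message is simplified to the header plus the URL.

```typescript
// Hypothetical types; none of these names come from the PR itself.
type Heuristic = (html: string) => boolean;

const ERROR_HEADER = "Error Page Detected";

// Layer 2 instruction appended to the research prompt (illustrative wording).
const ERROR_INSTRUCTION =
  `If this is a custom error page, respond ONLY with "${ERROR_HEADER}".`;

// T is whatever the injected model call returns (a string here, or a Promise
// of one if the caller passes an async function).
function research<T>(
  url: string,
  html: string,
  isErrorPage: Heuristic,
  askModel: (prompt: string) => T,
): T | string {
  // Layer 1: cheap heuristic, no model call at all.
  if (isErrorPage(html)) {
    return `${ERROR_HEADER}: ${url}`;
  }
  // Layer 2: normal research, with the fallback instruction in the prompt.
  return askModel(`Research the page at ${url}.\n${ERROR_INSTRUCTION}\n\n${html}`);
}
```

The point of the ordering is that the model is only reached when the heuristic passes, so error pages caught by layer 1 cost nothing.
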
codeant-ai bot commented Feb 6, 2026

CodeAnt AI is reviewing your PR.

@codeant-ai codeant-ai bot added the size:M This PR changes 30-99 lines, ignoring generated files label Feb 6, 2026
codeant-ai bot commented Feb 6, 2026

Sequence Diagram

The PR adds a fast heuristic error-page check before running full AI research and includes an AI prompt fallback to detect custom error pages. The diagram shows the short-circuiting behavior when an error is detected and the normal research path when it is not.

sequenceDiagram
    participant Client
    participant Researcher
    participant ErrorDetector
    participant AI

    Client->>Researcher: Request research for URL / state
    Researcher->>ErrorDetector: Heuristic check (title, h1, h2, body size/empty)
    alt Heuristic detects error
        ErrorDetector-->>Researcher: isError (e.g., 404/500/empty)
        Researcher-->>Client: "Error Page Detected" (skip AI research)
    else Heuristic passes
        ErrorDetector-->>Researcher: not an error
        Researcher->>AI: Start conversation with research prompt (includes error_detection fallback)
        AI-->>Researcher: Research report (or semantic error if AI identifies it)
        Researcher-->>Client: Return research result
    end
Generated by CodeAnt AI

codeant-ai bot commented Feb 6, 2026

Nitpicks 🔍

🔒 No security issues identified
⚡ Recommended areas for review

  • AI instruction reliability
    The newly added in-prompt instruction asks the model to "respond ONLY" with a short header when an error page is detected. LLM responses are not guaranteed to strictly follow this; the current prompt does not enforce a machine-parseable format (e.g., JSON) or provide fallback parsing guidance, which may lead to inconsistent detection handling downstream.

  • Limited error pattern coverage
    ERROR_PATTERNS only matches numeric codes (e.g., "404") and not common textual indicators ("not found", "server error", "forbidden"). The regexes also lack a case-insensitive flag for textual patterns. This may miss custom error pages that don't show numeric codes in title/h1/h2.

  • Body length measured on raw HTML
    The code measures page size using the raw HTML inside the body element (including tags). This can overcount small-but-meaningful pages (many tags) or undercount text-only error pages. Use text-only length (strip tags) when deciding "very small" pages.

  • Early return side-effects
    The function returns early on detection and skips subsequent logic (caching, experienceTracker updates, telemetry or persisted logs). Consider whether some minimal telemetry or caching should still occur for detected error states so they're visible in diagnostics and history.

  • Missing error context
    The early-return message only includes the detected error type but not what triggered the detection (regex match, title/h1/h2, empty body, size threshold, or AI fallback). Without the detection reason it's harder to debug false positives and to surface actionable information to users or telemetry.
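
Two of the points above (text-only length and missing detection context) could be addressed together along these lines. This is a sketch under stated assumptions: the `Detection` shape, the reason labels, and the function names are hypothetical, and the regex-based tag stripping is a crude stand-in for a real DOM parser.

```typescript
// Hypothetical result shape that records why a page was flagged.
interface Detection {
  isError: boolean;
  reason: "none" | "heading-match" | "empty-body" | "small-body";
  detail: string;
}

// Crude tag stripping for illustration only; a DOM parser would be more robust.
function visibleTextLength(bodyHtml: string): number {
  return bodyHtml
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1>/gi, " ") // drop script/style content
    .replace(/<[^>]*>/g, " ")                                // drop remaining tags
    .replace(/\s+/g, " ")                                    // collapse whitespace
    .trim().length;
}

function detect(headings: string[], bodyHtml: string, minTextLength = 500): Detection {
  const hit = headings.filter((h) => /\b(403|404|500|502|503)\b/.test(h))[0];
  if (hit !== undefined) {
    return { isError: true, reason: "heading-match", detail: hit };
  }
  const length = visibleTextLength(bodyHtml);
  if (length === 0) {
    return { isError: true, reason: "empty-body", detail: "0 characters of text" };
  }
  if (length < minTextLength) {
    return { isError: true, reason: "small-body", detail: `${length} characters of text` };
  }
  return { isError: false, reason: "none", detail: "" };
}
```

Carrying `reason` and `detail` through to the early-return message or to telemetry would make false positives easier to trace back to the rule that fired.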

codeant-ai bot commented Feb 6, 2026

CodeAnt AI finished reviewing your PR.

DavertMik and others added 2 commits February 6, 2026 06:03
- Simplify return type to boolean (no type field needed)
- Require error context for numeric codes (e.g., "404 error", "error 500")
  to prevent false positives like "Room 404" or "Order #500"
- Add text-based patterns: "Page Not Found", "Server Error", "Access Denied"
- Add comprehensive unit tests (41 tests covering detection and false positives)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
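
As a rough illustration of the "require error context" idea in this commit, numeric codes can be matched only when they sit next to the word "error", with a few text-based patterns alongside. The constant and function names below are assumptions, and the exact patterns in the PR may differ.

```typescript
// Codes named in the PR description.
const CODES = "(?:403|404|500|502|503)";

// Hypothetical pattern list: a code only counts with "error" beside it.
const CONTEXT_PATTERNS: RegExp[] = [
  new RegExp(`\\b${CODES}\\s+error\\b`, "i"), // e.g. "404 error"
  new RegExp(`\\berror\\s+${CODES}\\b`, "i"), // e.g. "error 500"
  /\bpage not found\b/i,
  /\bserver error\b/i,
  /\baccess denied\b/i,
];

// Boolean result, matching the simplified return type in the commit message.
function isErrorText(text: string): boolean {
  return CONTEXT_PATTERNS.some((pattern) => pattern.test(text));
}
```

With these patterns, "Room 404" and "Order #500" no longer match, while "404 error", "Error 500", and "Page Not Found" still do.
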
Just match standard HTTP error strings like "404 Not Found",
"500 Internal Server Error" - no regex guessing.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
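
The simpler approach in this last commit, matching standard HTTP error strings directly, could be sketched as below. Only "404 Not Found" and "500 Internal Server Error" are named in the commit message; the other entries are the standard reason phrases for the codes in the PR description, and the case-insensitive substring comparison is an assumption, as are the names.

```typescript
// Standard status-line strings for the codes mentioned in the PR.
const HTTP_ERROR_STRINGS: string[] = [
  "403 Forbidden",
  "404 Not Found",
  "500 Internal Server Error",
  "502 Bad Gateway",
  "503 Service Unavailable",
];

// Plain substring check, no regex. Case-insensitivity is an assumption here.
function containsHttpErrorString(text: string): boolean {
  const haystack = text.toLowerCase();
  return HTTP_ERROR_STRINGS.some(
    (phrase) => haystack.indexOf(phrase.toLowerCase()) !== -1,
  );
}
```

This trades coverage for precision: custom wording such as "Page Not Found" is left to the AI prompt fallback rather than guessed at by the heuristic.
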
@DavertMik DavertMik merged commit 4268f8b into main Feb 7, 2026
0 of 4 checks passed
@DavertMik DavertMik deleted the feature/error-page-detection branch February 7, 2026 20:40