
Parser fix #31

Open
aakash-barthwal wants to merge 1 commit into main from new_fix


@aakash-barthwal
Contributor

No description provided.

@aakash-barthwal changed the title from "sig" to "Parser fix" on Apr 30, 2026

@cursor (Bot) left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit d0fed4a.

# If the suite timed out or crashed (no summary), the run is incomplete —
# mark everything FAILED so new tests added by the golden patch show as F2P.
if "tests summary: ok:" not in text_output:
    results = {k: "FAILED" for k in results}

Override marks passing tests as FAILED, inflating F2P

Medium Severity

When individual test lines are parsed but the output lacks "tests summary: ok:", the override results = {k: "FAILED" for k in results} marks tests that actually passed as "FAILED". The downstream F2P calculation in evaluator.py then counts these tests as F2P (fail-to-pass) instead of P2P (pass-to-pass), inflating the F2P metric and deflating P2P. The comment's stated rationale ("new tests added by the golden patch show as F2P") doesn't require this override: new tests either don't appear in the pre-patch output (and are caught by the new_passing logic) or already fail without the golden patch.
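
The miscount can be reproduced with a minimal sketch. Everything here is an assumption made for illustration: the real logic in evaluator.py is not shown in this PR, so `classify` and `apply_override` are hypothetical helpers, and the test names and log lines are made up. Only the override condition and the dict comprehension come from the diff.

```python
# Hypothetical illustration; the real evaluator.py logic is not shown in this PR.

def apply_override(results, text_output):
    """Mirror the override under review: no summary line means all FAILED."""
    if "tests summary: ok:" not in text_output:
        return {name: "FAILED" for name in results}
    return dict(results)


def classify(before, after):
    """Assumed F2P/P2P split: compare pre-patch and post-patch statuses."""
    f2p, p2p = [], []
    for name, status in after.items():
        if status != "PASSED":
            continue
        # Tests absent from the pre-patch run count as newly passing (F2P).
        if before.get(name) == "PASSED":
            p2p.append(name)
        else:
            f2p.append(name)
    return {"F2P": sorted(f2p), "P2P": sorted(p2p)}


# Pre-patch run crashed after two tests reported, so no summary line exists.
pre_parsed = {"test_a": "PASSED", "test_b": "FAILED"}
pre_output = "test_a ... ok\ntest_b ... FAILED\n"  # made-up log format
# Post-patch run completed; test_new was added by the golden patch.
post = {"test_a": "PASSED", "test_b": "PASSED", "test_new": "PASSED"}

with_override = classify(apply_override(pre_parsed, pre_output), post)
without_override = classify(pre_parsed, post)

print(with_override)     # {'F2P': ['test_a', 'test_b', 'test_new'], 'P2P': []}
print(without_override)  # {'F2P': ['test_b', 'test_new'], 'P2P': ['test_a']}
```

In this sketch test_a passes in both runs, yet it lands in F2P only when the override is applied. test_new is classified as F2P either way, which matches the point that the override is not needed for new tests.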

