feat: fix \N escape handling (bare \N, \N{U+HHHH}, quantifier) #33

Draft
toddr-bot wants to merge 1 commit into cpan-authors:main from toddr-bot:koan.toddr.bot/fix-N-escape-handling

toddr-bot commented Mar 31, 2026

What

Fix three bugs in \N escape sequence handling to match modern Perl behavior.

Why

  • Bare \N (meaning "not newline", Perl 5.12+) was rejected with an error instead of being parsed
  • \N{U+0041} produced "Argument isn't numeric in chr" warnings because charnames::vianame() returns the character itself (not its code point) for the U+HHHH format, and that character was then handed to chr()
  • \N{3,5} was misinterpreted as a named character lookup instead of \N + quantifier {3,5}
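
For reference, the three forms behave as follows in Perl itself (5.12 or later). This is a small illustration of the target behavior and does not touch the parser. The variable names are only for the example.

```perl
use strict;
use warnings;

# Bare \N matches any single character except a newline.
my @bare = "a\nb" =~ /(\N)/g;

# \N followed by a legal quantifier means "that many non-newlines".
my ($run) = "abcdef\n" =~ /^(\N{3,5})/;

# \N{U+HHHH} names a code point; U+0041 is "A".
my $is_a = "A" =~ /^\N{U+0041}$/ ? 1 : 0;

print join(",", @bare), " $run $is_a\n";   # prints "a,b abcde 1"
```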

How

  • New nonnewline node type for bare \N, modeled after lnbreak
  • nchar() detects U+HHHH format and uses chr(hex()) directly
  • \N handler checks if braced content matches quantifier pattern before consuming it as a name
  • Bare \N inside character classes still errors, matching Perl's behavior
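
The decision order described above can be sketched roughly as below. This is an illustration only, not the code in this PR: `classify_n_escape` and its return values are hypothetical, the quantifier check assumes the classic `{n}`, `{n,}` and `{n,m}` forms, and the character-class case is left out.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's API: classify what follows "\N".
# $rest is the pattern text immediately after the two characters "\N".
sub classify_n_escape {
    my ($rest) = @_;

    # No opening brace: bare \N, meaning "not newline".
    return ('nonnewline') unless $rest =~ /^\{/;

    # Braces that look like a quantifier: \N plus that quantifier.
    return ('nonnewline', $1) if $rest =~ /^(\{\d+(?:,\d*)?\})/;

    # \N{U+HHHH}: decode the hex digits directly with chr(hex()).
    return ('nchar', chr(hex($1))) if $rest =~ /^\{U\+([0-9A-Fa-f]+)\}/;

    # Any other braced content is treated as a character name.
    return ('named', $1) if $rest =~ /^\{([^}]+)\}/;

    # Unterminated or empty braces.
    return ('error');
}

for my $rest ('', '{3,5}', '{U+0041}', '{LATIN SMALL LETTER A}') {
    my ($kind, $payload) = classify_n_escape($rest);
    print "\\N$rest => $kind", (defined $payload ? " ($payload)" : ''), "\n";
}
```

The quantifier test runs before the name lookup, so `{3,5}` never reaches the named-character branch. The U+HHHH branch goes straight to `chr(hex())` and never calls `charnames::vianame()`.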

Testing

All 1203 tests pass (46 new). New t/18nonnewline.t covers bare \N, \N{NAME}, \N{U+HHHH}, quantifier disambiguation, round-trips, and error cases. Existing round-trip and error tests updated.

🤖 Generated with Claude Code


Quality Report

Changes: 7 files changed, 251 insertions(+), 9 deletions(-)

Code scan: clean

Tests: passed (OK)

Branch hygiene: clean

Generated by Kōan post-mission quality pipeline

…iguation

Three bugs fixed in \N handling:

1. Bare \N now parses as "not newline" (Perl 5.12+) instead of erroring.
   Creates a nonnewline node type. Still errors inside character classes,
   matching Perl's behavior.

2. \N{U+HHHH} no longer produces "isn't numeric in chr" warnings.
   The nchar() method now detects U+HHHH format and uses chr(hex())
   directly instead of passing through charnames::vianame() which
   returns the character (not code point) for this format.

3. \N{3,5} is now correctly parsed as \N + quantifier {3,5}, not as
   a named character lookup for "3,5". The handler checks if braced
   content looks like a quantifier pattern before consuming it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>