fix: reject empty \p{} and \P{} Unicode property names #47

Draft
toddr-bot wants to merge 1 commit into cpan-authors:main from toddr-bot:koan.toddr.bot/fix-empty-property

@toddr-bot (Collaborator)

What

Parser now rejects \p{}, \P{}, \p{^}, and \p{ } with the existing RPe_EMPTYB error, matching Perl's behavior.

Why

Perl rejects these at compile time with "Empty \p{}", but the parser accepted them silently, letting invalid patterns pass through the validation layer.

How

Added a post-extraction check in both the \p and \P handlers: after the braced content {...} is parsed, the property name must be non-empty, not whitespace-only, and not just a ^ prefix. Failures are reported with the existing RPe_EMPTYB error code. The check applies both in regular context and inside [...] character classes.
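
The rule above can be illustrated with a small, language-neutral sketch. This is not the code from this PR (the actual change lives in the parser's own source, which is not shown here); it is written in Python purely for illustration. The names `EmptyPropertyError` and `validate_property_name` are hypothetical, and treating a caret surrounded by whitespace (such as `{ ^ }`) as empty is an assumption that the description does not confirm.

```python
class EmptyPropertyError(ValueError):
    """Hypothetical stand-in for the parser's RPe_EMPTYB error."""


def validate_property_name(braced: str) -> str:
    """Check the raw text found between '{' and '}' after \\p or \\P.

    Returns the trimmed content when it names a property, and raises
    EmptyPropertyError when it is empty, whitespace-only, or only '^'.
    """
    stripped = braced.strip()
    # Assumption: surrounding whitespace is ignored before the check,
    # so " ^ " is treated the same as "^".
    if stripped in ("", "^"):
        raise EmptyPropertyError("empty Unicode property name")
    return stripped


if __name__ == "__main__":
    # Contents of \p{}, \p{ }, and \p{^} are rejected; Lu and ^Lu pass.
    for content in ["", " ", "^", "Lu", "^Lu"]:
        try:
            validate_property_name(content)
            print(repr(content), "accepted")
        except EmptyPropertyError:
            print(repr(content), "rejected")
```

Because the check runs on the extracted brace content rather than on the surrounding pattern, the same function would serve both the regular and the character class contexts, which matches how the description says the check is shared.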

Testing

  • 12 new assertions in t/11errors.t covering \p{}, \P{}, \p{^}, \p{ }, and character class variants
  • All 1173 existing tests pass
  • Valid properties like \p{Lu} and \p{^Lu} continue to work

🤖 Generated with Claude Code

Perl rejects \p{}, \P{}, \p{^}, and \p{ } as "Empty \p{}" at compile
time, but the parser was accepting these patterns silently. Added
validation after braced content extraction to reject empty, caret-only,
and whitespace-only property names using the existing RPe_EMPTYB error
code. Works in both regular and character class contexts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>