
Implement parser support #51

Merged: titzer merged 14 commits into main from parser_impl on Mar 3, 2026


Contributor

@titzer titzer commented Dec 8, 2025

This implements parser support and (I think) the intended semantics: custom page sizes that are 0 or not powers of two are malformed, and powers of two other than 1 or 65536 are invalid.
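As a rough illustration of that three-way split, here is a minimal OCaml sketch. The `verdict` type, `classify`, and this local `is_power_of_two` are hypothetical names for illustration only, not the code in this PR, and the sketch works on the raw size rather than any particular AST representation.

```ocaml
(* Hypothetical sketch of the semantics described above; these names are
   assumptions for illustration and do not come from the PR itself. *)
type verdict =
  | Malformed  (* rejected while parsing *)
  | Invalid    (* parses, but rejected by validation *)
  | Ok_size    (* accepted *)

(* A positive integer is a power of two iff exactly one bit is set. *)
let is_power_of_two (n : int) : bool = n > 0 && n land (n - 1) = 0

let classify (size : int) : verdict =
  if not (is_power_of_two size) then Malformed  (* covers 0 and non-powers of two *)
  else if size = 1 || size = 65536 then Ok_size
  else Invalid                                  (* any other power of two, e.g. 4096 *)
```

Under these assumptions, `classify 0` and `classify 3` are malformed, `classify 4096` is invalid, and only `classify 1` and `classify 65536` are accepted.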

@rossberg PTAL. There is a shift/reduce conflict here due to the factoring of the pagetype/memorytype, which has to do with the syntactic sugar for data segments. I think it requires left-factoring the grammar and rearranging it a little; I wasn't sure if you had a convention you were following for how that's done.

Member

@rossberg rossberg left a comment


Yeah, this sort of parser conflict arises whenever you allow a parenthesised phrase (like pagetype here) to be empty that can be followed by another parenthesised clause. Because of the paren token, the parser would need a look-ahead of 2 to know what case it's in.

The only way I know to resolve this in an LALR(1) grammar is by not using an empty production but instead duplicating the use sites of the symbol, once with and once without the extra phrase. That's what I did in other such cases anyway.
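To make the look-ahead point concrete, here is a small hand-written OCaml sketch over a hypothetical, heavily simplified token type (not the interpreter's real lexer or grammar). Deciding whether an optional `(pagesize n)` is present means inspecting both the opening paren and the keyword after it, which is exactly the second token a one-token look-ahead parser cannot see.

```ocaml
(* Hypothetical, simplified token set; the real parser.mly differs. *)
type token = LPAR | RPAR | PAGESIZE | DATA | NAT of int | STRING of string

(* Parse an optional "(pagesize n)" prefix.
   The decision needs two tokens: LPAR alone is ambiguous, because the
   next clause (here a data clause) also starts with LPAR. *)
let parse_opt_pagesize (toks : token list) : int option * token list =
  match toks with
  | LPAR :: PAGESIZE :: NAT n :: RPAR :: rest -> (Some n, rest)
  | _ -> (None, toks)  (* empty case: consume nothing *)
```

In an LALR(1) grammar the `None` branch would correspond to the empty production that causes the conflict; writing each use site twice, once with and once without the phrase, removes the need for that early decision.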

A couple of high-level comments:

  • Given that the binary format for pagetype is in logarithmic representation, the AST should probably reflect that. That's what we do for alignment annotations as well. Then some weird cases (like PageT 0) are avoided by construction.

  • Please expand tabs.
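A minimal sketch of what a logarithmic representation could look like, assuming a hypothetical `pagetype` constructor that carries log2 of the page size as an `int32` (a stand-in for the interpreter's own nat32 conventions; the helper names are illustrative and not taken from this PR):

```ocaml
(* Hypothetical AST node: the payload is log2 of the page size, so every
   value denotes a power of two and a page size of 0 cannot be expressed. *)
type pagetype = PageT of int32

(* Recover the size in bytes; assumes the stored exponent is small
   enough for the native int shift (true for 0 and 16). *)
let page_size (PageT log2) : int = 1 lsl Int32.to_int log2

(* Convert a textual size into log form; None if it is not a power of two. *)
let of_size (size : int) : pagetype option =
  if size <= 0 || size land (size - 1) <> 0 then None
  else
    let rec log2 n acc = if n = 1 then acc else log2 (n lsr 1) (acc + 1) in
    Some (PageT (Int32.of_int (log2 size 0)))
```

With this shape, `PageT 0l` means a 1-byte page and `PageT 16l` a 65536-byte page, so the check for non-powers of two happens once, at conversion time, rather than on every use of the AST.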

Comment thread interpreter/valid/valid.ml Outdated
Comment thread interpreter/text/parser.mly Outdated
Comment thread interpreter/text/parser.mly Outdated
Comment thread interpreter/text/parser.mly Outdated

pagetype :
| LPAR PAGESIZE NAT RPAR
{ let v' =
Member


You can simplify this using the nat32 helper and Lib.Int.is_power_of_two, cf. the action for the align production.

Contributor Author


Using power_of_two now, will rework for nat32.

Contributor Author

titzer commented Dec 9, 2025

> Yeah, this sort of parser conflict arises whenever you allow a parenthesised phrase (like pagetype here) to be empty that can be followed by another parenthesised clause. Because of the paren token, the parser would need a look-ahead of 2 to know what case it's in.
>
> The only way I know to resolve this in an LALR(1) grammar is by not using an empty production but instead duplicating the use sites of the symbol, once with and once without the extra phrase. That's what I did in other such cases anyway.

Thanks, there was a quick fix.

> A couple of high-level comments:
>
> * Given that the binary format for pagetype is in logarithmic representation, the AST should probably reflect that. That's what we do for alignment annotations as well. Then some weird cases (like PageT 0) are avoided by construction.

I had considered this at first but started off using the size as an int. I'll rework it to use a nat32 of the logarithm.

> * Please expand tabs.

Done.

Contributor Author

titzer commented Dec 25, 2025

I've rebased the parser implementation on the logarithmic representation in #53

Contributor Author

titzer commented Jan 6, 2026

@rossberg PTAL

Member

@rossberg rossberg left a comment


Sorry for the delay.

Comment thread interpreter/runtime/memory.ml Outdated
Comment thread interpreter/runtime/memory.ml Outdated
Comment thread interpreter/text/parser.mly Outdated
Comment thread interpreter/text/parser.mly Outdated
Comment thread interpreter/text/parser.mly
Comment thread interpreter/text/parser.mly Outdated
Member

@rossberg rossberg left a comment


Btw, please never force-push to a PR branch, since that breaks central functionality of GH's review UI, such as New Changes view or comment history.

Comment thread interpreter/runtime/memory.ml Outdated
Comment thread interpreter/util/lib.ml
Comment thread interpreter/text/parser.mly Outdated
Comment thread interpreter/valid/valid.ml Outdated
Contributor Author

titzer commented Mar 2, 2026

@rossberg PTAL, tests pass now without parser regression.

Comment thread interpreter/text/parser.mly
Comment thread interpreter/util/lib.ml Outdated
Comment thread interpreter/text/parser.mly
@titzer titzer merged commit b0de0e9 into main Mar 3, 2026
1 check passed
@titzer titzer deleted the parser_impl branch March 3, 2026 13:25

2 participants