Tokenizer - sub-parsers declared as Parser in RegexParsers consume whitespace and match through to the next token #1

# Problem

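Inside a parser class that extends `RegexParsers`, every production rule declared as `Parser[_]` automatically "ignores" whitespace. More precisely, each string or regex literal that is implicitly converted to a parser skips leading whitespace before it matches, as long as `skipWhitespace` is true (the default).

Reference: [StackOverflow question](https://stackoverflow.com/questions/20793058/how-to-skip-whitespace-but-use-it-as-a-token-delimeter-in-a-parser-combinator)

So if every sub-production is declared as a `Parser`, following section 2 (Lexical analysis) of the Python spec, strings that should be recognized as separate tokens because they are separated by whitespace are instead matched as a single token, with the whitespace ignored.

Example: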
```scala
// Tokenizer.scala
...
lazy val digitPart: Parser[String] = ... // matches digits
lazy val exponent: Parser[String] = ...  // matches "e" ~ digits
lazy val exponentFloat: Parser[String] = (pointFloat | digitPart) ~ exponent ^^ ... // matches scientific notation such as 3e10
```
In this case the string `3e 10` should be tokenized as three tokens, `NUM(3), NAME(e), NUM(10)`, but the implementation above tokenizes it as the single token `NUM(3e10)`.
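The behavior can be reproduced in isolation. The following is a minimal sketch, not the project's actual `Tokenizer.scala`: the object name `NaiveTokenizer` and the regex bodies are assumptions filled in for illustration, and `pointFloat` is left out to keep it short.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical reduced tokenizer: every sub-rule is a Parser.
object NaiveTokenizer extends RegexParsers {
  // Each string/regex literal is implicitly converted to a Parser[String],
  // and that conversion skips leading whitespace before matching.
  lazy val digitPart: Parser[String] = """[0-9]+""".r
  lazy val exponent: Parser[String] =
    "e" ~ digitPart ^^ { case e ~ d => e + d }
  lazy val exponentFloat: Parser[String] =
    digitPart ~ exponent ^^ { case d ~ e => d + e }
}

object NaiveTokenizerDemo {
  def main(args: Array[String]): Unit = {
    // The space between "e" and "10" is skipped by digitPart inside exponent,
    // so the whole input is accepted as one exponent float.
    val result = NaiveTokenizer.parseAll(NaiveTokenizer.exponentFloat, "3e 10")
    println(result) // a successful parse whose value is "3e10"
  }
}
```

Because whitespace is skipped at every literal, not just between top-level tokens, the input `3 e 10` is accepted by this sketch in the same way.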

# Solution

Instead of declaring sub-production rules as `Parser`s as in the example above, write only the regex for each such rule, and declare only top-level rules such as `floatNumber` as `Parser`s.
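A minimal sketch of this approach, under stated assumptions: the `Token`, `Num`, and `Name` types, the object name `RegexTokenizer`, and the simplified regex bodies are hypothetical and are not taken from the project. Sub-rules are plain regex source strings; only `number` and `name` are `Parser`s, so whitespace is skipped only between whole tokens.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical token types, for illustration only.
sealed trait Token
case class Num(text: String) extends Token
case class Name(text: String) extends Token

object RegexTokenizer extends RegexParsers {
  // Sub-rules are regex fragments (Strings), not Parsers,
  // so no whitespace can be skipped inside a token.
  private val digitPart = "[0-9]+"
  private val exponent = "[eE][+-]?" + digitPart
  private val exponentFloat = digitPart + exponent

  // Only top-level rules are Parsers.
  lazy val number: Parser[Token] =
    (regex(exponentFloat.r) | regex(digitPart.r)) ^^ (s => Num(s))
  lazy val name: Parser[Token] =
    regex("[A-Za-z_][A-Za-z0-9_]*".r) ^^ (s => Name(s))
  lazy val tokens: Parser[List[Token]] = rep(number | name)
}

object RegexTokenizerDemo {
  def main(args: Array[String]): Unit = {
    // "3e 10" now yields three tokens; "3e10" still yields one.
    println(RegexTokenizer.parseAll(RegexTokenizer.tokens, "3e 10"))
    println(RegexTokenizer.parseAll(RegexTokenizer.tokens, "3e10"))
  }
}
```

The longer alternative (`exponentFloat`) is tried before `digitPart` so that `3e10` is not split into a number followed by a name.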

# Fix
- [x] op, delim: no change needed
- [x] id, keyword
- [x] IntegerLiteral
- [x] FloatLiteral
- [x] ImagLiteral
- [x] StringLiteral
- [ ] Refactoring: reuse a helper function for `"(" ~ sth ~ ")"` when building new regexes
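
One possible reading of the remaining refactoring item is a small set of helpers that wrap regex fragments in parentheses so they can be composed safely. The sketch below is an assumption about what such helpers might look like: `group`, `alt`, and `opt` are hypothetical names, and the grammar is a simplified version of the float rules in the Python spec (underscores inside digit runs are not handled).

```scala
// Hypothetical helpers for composing regex fragments as plain Strings.
object RegexFragments {
  // Wrap a fragment in a non-capturing group: "(" ~ sth ~ ")".
  def group(fragment: String): String = s"(?:$fragment)"
  // Alternation of several fragments, grouped as a whole.
  def alt(fragments: String*): String = group(fragments.mkString("|"))
  // Optional fragment.
  def opt(fragment: String): String = group(fragment) + "?"

  val digitPart = "[0-9]+"
  val fraction = "[.]" + digitPart
  val pointFloat = alt(opt(digitPart) + fraction, digitPart + "[.]")
  val exponent = "[eE][+-]?" + digitPart
  val exponentFloat = alt(pointFloat, digitPart) + exponent
  // Source string for the only rule that would become a Parser.
  val floatNumber = alt(exponentFloat, pointFloat)
}

object RegexFragmentsDemo {
  def main(args: Array[String]): Unit = {
    // String.matches checks the whole input against the regex.
    println("3.14e-10".matches(RegexFragments.floatNumber))
    println("3e 10".matches(RegexFragments.floatNumber))
  }
}
```

Grouping every alternation keeps `|` from leaking into the surrounding fragment when the strings are concatenated.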
