🚧 Update parser to support look-ahead of two (or more) tokens #9
base: master
NU5118 error: File 'FILE1' is not added because the package already contains file 'FILE2'
@springcomp Below are my replies to your review/comments:
First of all, thanks for the quick turnaround on the feedback.
You found an incomplete implementation; this PR is still in draft. I think the overall direction is there, but it needs some polish and tests. It would also help to get your feedback on whether it covers your case. I wanted to optimise for the common case of look-aheads of one or two tokens. For the less common case of three or more, I plan to add a strategy based on a regular, heap-allocated stack.
Glad you like it!
It's not limited. I am just not done. Again, you discovered my PR that was still work-in-progress.
I think I will only optimise for peeking up to two tokens. For more complex and rare grammars, I'll probably just go with a regular stack but that's a decision that can be changed with time. The point is that the design is adaptive.
Yes, exactly! You pay for the rare cases, but not for the common ones. The only new cost for the common cases is a virtual dispatch for the stack operations.
Codecov Report
@@           Coverage Diff            @@
##           master      #9     +/-   ##
========================================
+ Coverage   88.76%  92.96%  +4.20%
========================================
  Files           1       1
  Lines          89     199    +110
========================================
+ Hits           79     185    +106
- Misses         10      14      +4
It totally does! In fact, I revisited my parser to make sure it did not require more than two tokens this time, and I managed to do it. I suspect we will never need more than two, but you answered my question: Gratt will support anything. You answered all my questions with the PR summary and follow-up text. Using an adaptive strategy is a great design.
I’m still getting used to C#’s pattern matching feature, which has evolved a lot across versions of the language. That and the heavily generic nature of the library sometimes make it hard to comprehend exactly what’s going on. But I was able to debug step by step and see for myself the switch from a
I’m looking forward to consuming the future NuGet package once released.
On another note, after extensive work on redesigning the original JMESPath parser using Gratt and a hand-crafted, automaton-based regex lexer, I could find no improvements in performance 😞. I also started a
Anyway, I learned so much by doing this that it will not have been for nothing.
@springcomp See 25abe36 for the resolution, where I added a stack-backed strategy for peeking into 3+ tokens.
I can sympathize with that! I like to solve problems once, so it naturally leads to a more general design and generic code. In the case of Gratt, the parsing is mostly algorithmic, so it can be made independent of the types. Since there are many types involved in the data model (precedence, token, token kind, result, etc.), the definitions do become long to read. What I've found is that aliases can help: Gratt/tests/CSharpPreprocessorExpression.cs Lines 24 to 26 in 65b8da4
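To illustrate the general idea of aliases (this is not the content of the referenced lines, just a minimal sketch with hypothetical names such as `BinaryOp`, `OperatorTable` and `AliasDemo`), a C# `using` alias can give a long closed generic type a short local name:

```csharp
using System;
// Alias directives need fully qualified type names and cannot refer to each other.
using BinaryOp = System.Func<int, int, int>;
using OperatorTable = System.Collections.Generic.Dictionary<string, System.Func<int, int, int>>;

// Hypothetical example type; nothing here comes from Gratt itself.
static class AliasDemo
{
    // Without the alias this field would spell out the full Dictionary<...> type.
    static readonly OperatorTable Operators = new OperatorTable
    {
        ["+"] = (a, b) => a + b,
        ["*"] = (a, b) => a * b,
    };

    // Looks up the operator by its symbol and applies it to the operands.
    public static int Apply(string op, int left, int right)
    {
        BinaryOp f = Operators[op];
        return f(left, right);
    }
}
```

The aliases only shorten what has to be read and typed in this file; the underlying types are unchanged.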
Sorry to hear about that. Unfortunately, I do not have the cycles right now to help there. I hope it's something simple and you find it soon. I can, however, commit to finishing this PR and releasing a new version of Gratt in the coming weeks, as I find small bits of time.
This PR addresses the feature described in issue #7. It updates the parser to use an adaptive strategy depending on whether the parser is asked to look ahead one, two or more tokens. This is done by abstracting over the storage used for the stack operations that push and pop tokens, and then using an optimal storage for grammars that require a look-ahead of just one or two tokens. This avoids using a single, general storage class such as an array, as well as paying the penalty of a heap allocation for the most common cases. As it stands now, for a single peek, the parser only uses a single field that's an optional token (or optuple). For peeking up to two tokens, it uses a triplet of a count and two tokens. Beyond that, it moves to using a Stack<>.
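The adaptive idea described above can be sketched roughly as follows. This is not Gratt's actual code: the type names (`TokenStore<T>`, `OneSlot<T>`, `TwoSlots<T>`, `StackStore<T>`, `Demo`) are hypothetical, and the sketch uses small classes for brevity, so it only illustrates the switch between storages behind virtual push/pop operations, not the allocation savings the PR aims for.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical abstraction over the storage behind push/pop.
abstract class TokenStore<T>
{
    public abstract int Count { get; }
    public abstract T Pop();
    // Returns the store to keep using; it may be a roomier one.
    public abstract TokenStore<T> Push(T token);
}

// Look-ahead of one: a single optional slot.
sealed class OneSlot<T> : TokenStore<T>
{
    private bool _full;
    private T _token = default!;
    public override int Count => _full ? 1 : 0;
    public override TokenStore<T> Push(T token)
    {
        // Outgrown: carry the held token over to the two-slot store.
        if (_full) return new TwoSlots<T>().Push(_token).Push(token);
        _full = true;
        _token = token;
        return this;
    }
    public override T Pop()
    {
        if (!_full) throw new InvalidOperationException("Store is empty.");
        _full = false;
        return _token;
    }
}

// Look-ahead of two: a count plus two slots.
sealed class TwoSlots<T> : TokenStore<T>
{
    private int _count;
    private T _first = default!, _second = default!;
    public override int Count => _count;
    public override TokenStore<T> Push(T token)
    {
        switch (_count)
        {
            case 0: _first = token; break;
            case 1: _second = token; break;
            default:
                // Outgrown again: fall back to a general stack.
                var roomy = new StackStore<T>();
                roomy.Push(_first);
                roomy.Push(_second);
                return roomy.Push(token);
        }
        _count++;
        return this;
    }
    public override T Pop()
    {
        switch (_count)
        {
            case 2: _count = 1; return _second;
            case 1: _count = 0; return _first;
            default: throw new InvalidOperationException("Store is empty.");
        }
    }
}

// Three or more: a regular heap-allocated stack.
sealed class StackStore<T> : TokenStore<T>
{
    private readonly Stack<T> _stack = new Stack<T>();
    public override int Count => _stack.Count;
    public override TokenStore<T> Push(T token) { _stack.Push(token); return this; }
    public override T Pop() => _stack.Pop();
}

static class Demo
{
    // Pushes three tokens, letting the store upgrade itself twice.
    public static TokenStore<string> PushThree()
    {
        TokenStore<string> store = new OneSlot<string>();
        store = store.Push("a"); // fits in the single slot
        store = store.Push("b"); // moves to the two-slot store
        store = store.Push("c"); // moves to the stack-backed store
        return store;
    }
}
```

A caller always keeps the store returned by `Push`, so grammars that never push more than one or two tokens never touch the `Stack<>`-backed path.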