Add 'alwaysMatchEndPattern' option to end patterns #90
This option, when set to true, will force the end pattern to match, even when it is not the current rule.
@50Wliu Status update? |
Probably not going to merge. This was always experimental but then I realized merging this would also break compatibility with Linguist and VSCode which expect TextMate-compatible grammars. |
So how does that relate to #83? |
@50Wliu Is it possible to make the change on Atom's end instead? Or does Atom just use the exact code from here? VS Code seems to be able to do it, but I haven't looked at their implementation. AFAIK, they parse with their own system anyway. Compatibility is all well and good, but it would really help to have something that unblocks all those issues. There will be many language packages that don't adopt Tree-sitter, so this is still a valid concern. Also, what compatibility would be broken? Existing languages will not have the introduced property, and the behaviour should not change if it is not present. So wouldn't it be backwards compatible? |
@Aerijo Atom's grammars are also used outside Atom, examples being VS Code and GitHub, so by introducing new constructs, we are breaking compatibility with these two, although admittedly both can be adapted to work with it. Ideally we'd want to stay compatible with the classical TextMate implementation so that anyone who follows that could benefit from the grammars that the community produces. It is interesting that you have gotten it to work in VS Code. I'd be interested in how you laid out your grammar to get it working there. |
@Ingramz I didn't do anything, that's just what happens in VS Code to a grammar that would break in Atom. They appear to use a different engine(?) to apply the grammar, as there have been differences in behaviour before (to much confusion). Like I said, I haven't actually looked at what they do internally, I've just seen what it does in the editor. As for the compatibility, I know other things use this. That's why I pointed out the change is backwards compatible. My preference would really be to emulate VS Code, because that seems like the desired default behaviour, but I was worried this might break the C / C++ / C# stuff. |
@Aerijo they do indeed use a different engine. Historically VS Code's implementation of the TextMate grammars has been more accurate than Atom's first-mate, but there might be some differences in It looks like the VSCode grammar for markdown is different. If that grammar works the intended way also in TextMate but not in Atom, then first-mate needs correcting again. |
@Ingramz Unfortunately, it seems TextMate leaks too. I suppose the VS Code engine authors saw this as a bug, not a feature of TextMate grammars. |
@Aerijo if that is the case, then we need to find how it has been worked around. Because if they haven't documented how their implementation differs from TextMate, then I would consider it a bug in VS Code's implementation. |
I am struggling with this exact same grammar limitation. I am trying to write a grammar that matches JavaScript embedded in HTML via

    <% if (encoding === 'utf8') { %>
    <meta charset="utf-8">
    <% } %>

I have tried the following rule (and variations thereof), but because of the unclosed

    {
      'begin': '<%'
      'end': '%>'
      'name': 'meta.embedded.foo'
      'contentName': 'source.js.embedded.foo'
      'patterns': [
        'include': 'source.js'
      ]
    }

Is there any way to work around this limitation with existing First Mate grammar semantics (at least in Atom)? Do Tree Sitter grammars provide a solution to this problem? |
Is there a status update on this? |
The status update was given to you when you first asked: #90 (comment) Details as to the why behind that reason have been given in further comments. The correct solution here is to continue work on moving remaining TextMate grammars to a Tree-sitter implementation that doesn't suffer from this limitation. |
I stopped working on this a while back as it became clear that tree-sitter would not have this issue and that it would eventually become the preferred way to write grammars for Atom. Additionally, this would break compatibility with other TextMate-like engines (for example, Linguist) which would be less-than-desirable. For those reasons, I'm going to be closing this pull request as I don't anticipate completing it. |
@50Wliu Thanks for the heads up. @Arcanemagus I asked in case things changed in the 14 months since (that's not a short amount of time in the world of software development), since the PR was still open. 14 months ago when that comment was made, tree-sitter didn't even exist, at least publicly. Please don't assume I'm just trying to non-constructively spam the thread with that question - I was literally just wanting to know whether this route was still to be taken, since it would guide my approach to suggestions on issues related to it. |
I know this is probably irrelevant at this point. But for anyone who is curious, the VS Code markdown tmLanguage uses a The while loop still has its limitations, which VS Code is running up against right now and I'm actually looking into creating this kind of a PR for an 'alwaysMatchEndPattern' in VS Code. The Tree Sitter is certainly a better general solution. |
🚨 WIP 🚨
This option, when set to true, will force the end pattern to match, even when it is not the current rule. This is incredibly useful for grammars that rely on includes, yet do not necessarily need to wait for the include to finish matching before ending the include. Some examples of such grammars include: language-gfm's code blocks, language-html and language-xml including language-javascript and language-coffee-script, and language-python allowing SQL queries in strings.
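To make the described behaviour concrete, here is a minimal Python sketch of a toy begin/end matcher. It is not first-mate's implementation and not the code in this PR: the `Rule` class, the `always_match_end` field, the `scope_spans` function, and the rule names are all hypothetical, and the matcher works on a single string rather than line by line. It only illustrates the idea that an outer rule's end pattern can be tried while an included rule is still open, using a shortened version of the embedded-template case from the conversation.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Rule:
    name: str
    begin: str
    end: str
    patterns: list = field(default_factory=list)
    always_match_end: bool = False  # hypothetical flag modelled on this PR


def scope_spans(text, root_patterns):
    """Return (rule name, start, end) spans found by a toy begin/end matcher."""
    stack = []  # (rule, start offset) pairs, innermost last
    spans = []
    pos = 0
    while pos < len(text):
        candidates = []  # (match start, kind, tie-break, depth, rule, match)
        for depth, (rule, _) in enumerate(stack):
            is_current = depth == len(stack) - 1
            # Normally only the current rule may end; the flag lets outer rules end too.
            if is_current or rule.always_match_end:
                m = re.compile(rule.end).search(text, pos)
                if m:
                    candidates.append((m.start(), 0, -depth, depth, rule, m))
        children = stack[-1][0].patterns if stack else root_patterns
        for rule in children:
            m = re.compile(rule.begin).search(text, pos)
            if m:
                candidates.append((m.start(), 1, 0, None, rule, m))
        if not candidates:
            break
        # Earliest match wins; on ties, end patterns first, innermost rule first.
        _, kind, _, depth, rule, m = min(candidates, key=lambda c: c[:3])
        if kind == 1:
            stack.append((rule, m.start()))
        else:
            # Close any rules nested inside the one whose end pattern matched.
            while len(stack) > depth + 1:
                inner, start = stack.pop()
                spans.append((inner.name, start, m.start()))
            _, start = stack.pop()
            spans.append((rule.name, start, m.end()))
        pos = m.end() if m.end() > pos else pos + 1
    while stack:  # rules still open at the end of the text
        rule, start = stack.pop()
        spans.append((rule.name, start, len(text)))
    return spans


def make_template(always_match_end):
    block = Rule("js.block", r"\{", r"\}")
    return Rule("template", r"<%", r"%>", [block], always_match_end)


TEXT = "<% if (x) { %><p><% } %>"
without = scope_spans(TEXT, [make_template(False)])
with_flag = scope_spans(TEXT, [make_template(True)])
print(without)    # [('js.block', 10, 21), ('template', 0, 24)]
print(with_flag)  # [('js.block', 10, 12), ('template', 0, 14), ('template', 17, 24)]
```

Without the flag, the unclosed `{` keeps the inner rule current, so the first `%>` is skipped and the template scope swallows the `<p>` markup. With the flag, the outer end pattern closes both rules at the first `%>`, and `<p>` falls outside any template span.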
TODO:
getEndPatternScanner
Fixes #83
and unblocks a whole lot of issues:
atom/language-php#187
atom/language-shellscript#60
atom/language-gfm#171
atom/language-gfm#21
atom/language-yaml#79
atom/language-html#90
atom/language-python#110
atom/language-python#55
atom/language-python#39
atom/language-python#143
and more...