Add Civet #873

Open
bbrk24 opened this issue Nov 27, 2024 · 13 comments
@bbrk24

bbrk24 commented Nov 27, 2024

If you want an unsupported language added, provide:

  • language name: Civet
  • file extension(s): .civet, .cvt, .cvtx
  • method(s) of commenting text: Civet is a bit complicated here. It has two comment modes, one of which is triggered by a string directive at the top of the file, much like JS "use strict";.
    By default, // starts a line comment and /* ... */ delimits a block comment, as in JS. /// may start a line comment only at the top level of the file; inside an indented block, it serves as a multiline regex delimiter.
    In coffeeComment mode, which is triggered by "civet coffeeComment" or "civet coffeeCompat", the above comment styles don't work, and line comments are instead marked with #. Note that JS-style comments and "use strict" may occur before this directive (but no other code can).
    In both modes, ### ... ### is a valid block comment.

If it weren't for the CoffeeScript compatibility mode, this would be fairly easy for me to add myself.
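A minimal sketch of the mode detection described above, in Python rather than cloc's Perl. The function name `uses_coffee_comments` is hypothetical, and treating the directive body as a space-separated option list is an assumption. Only whole-line comments ahead of the directive are handled; a directive that shares a line with a comment is missed.

```python
import re

# Matches a quoted string directive such as 'civet coffeeCompat'.
# Assumption: the text after "civet" is a space-separated option list.
_DIRECTIVE = re.compile(r"""^\s*(['"])civet\b([^'"]*)\1""")
_USE_STRICT = re.compile(r"""^(['"])use strict\1;?$""")


def uses_coffee_comments(source: str) -> bool:
    """Return True if the file switches to coffeeComment mode.

    Only JS-style comments, blank lines, and "use strict" may precede
    the directive; any other code ends the search.
    """
    in_block = False
    for line in source.splitlines():
        stripped = line.strip()
        if in_block:
            # Still inside a /* ... */ comment that began on an earlier line.
            if "*/" in stripped:
                in_block = False
            continue
        if not stripped or stripped.startswith("//"):
            continue
        if stripped.startswith("/*"):
            in_block = "*/" not in stripped[2:]
            continue
        if _USE_STRICT.match(stripped):
            continue
        match = _DIRECTIVE.match(line)
        if not match:
            return False  # real code came first, so no directive applies
        options = match.group(2).split()
        return "coffeeComment" in options or "coffeeCompat" in options
    return False
```

Because nothing ends the mode once it is on, a single boolean per file is enough state for the rest of the comment stripping.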

@AlDanial
Owner

AlDanial commented Dec 1, 2024

Since both comment styles may exist in the same file, what, if anything, ends coffeeComment mode?

@bbrk24
Author

bbrk24 commented Dec 1, 2024

coffeeComment mode lasts until the end of the file. There is no way to turn it back off.

@AlDanial
Owner

I'll need to write a custom parser for Civet. Please point me to some representative Civet files that I can work with.

@bbrk24
Author

bbrk24 commented Dec 14, 2024

The Civet compiler itself is written in Civet, but I don't know how representative that is. I also have a few Civet projects, including Template Qdeql and my MarioKart build optimizer, but neither of those has coffeeCompat mode enabled.

For a representative coffeeCompat file, you can probably take any CoffeeScript file and add

'civet coffeeCompat'

or

// @ts-nocheck
'civet coffeeCompat'

to the top of it.
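For generating such test inputs in bulk, a small helper along these lines might do. The name `to_coffee_compat` and its `ts_nocheck` flag are hypothetical; the two header forms are the ones shown above.

```python
def to_coffee_compat(coffee_source: str, ts_nocheck: bool = False) -> str:
    """Prepend the coffeeCompat directive to CoffeeScript source text.

    With ts_nocheck=True, a leading // @ts-nocheck comment is added,
    which is allowed to appear before the directive.
    """
    header = "'civet coffeeCompat'\n"
    if ts_nocheck:
        header = "// @ts-nocheck\n" + header
    return header + coffee_source
```

For example, `to_coffee_compat("x = 1 # one\n")` yields a two-line file whose first line is the directive.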

@AlDanial
Owner

One last question: must the triple slashes and triple pounds exist as a distinct set of three? In other words if a file contains ############# or ///////////// will I need to do mod 3 math to figure out what comment state I'm in? Or are the markers strictly ### and /// bounded by other characters?

@bbrk24
Author

bbrk24 commented Dec 17, 2024

/// as the first three characters in a line is a regular // comment that happens to start with a slash, and so you can have as many slashes as you want at the start of a line. Triple slashes anywhere else are regex literal markers, but they can't be empty -- x := ////// fails to parse.

### block comments also cannot be empty -- ###### is this.length.length.length.length.length.length -- but they can be on the same line, so ### ### is a valid block comment. An unpaired ### outside of coffeeComment mode is technically legal and does not indicate a comment at all, but given that it compiles to this.length.length.length I don't think it'll ever be seen in practice.
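As a rough illustration of the slash rules for the default (non-coffeeComment) mode, here is a Python sketch of a whole-line classifier. The function name `is_line_comment` is hypothetical, and treating every indented line that starts with three or more slashes as a regex delimiter is an assumption drawn from the description above. Trailing comments after code are not considered.

```python
def is_line_comment(line: str) -> bool:
    """Classify a whole line in Civet's default comment mode.

    - Two or more slashes at column 0 start a line comment, no matter
      how many further slashes follow.
    - On an indented line, // starts a comment, but /// is taken to be
      a multiline regex delimiter (assumption for runs longer than 3).
    """
    if line.startswith("//"):
        return True
    stripped = line.lstrip()
    return stripped.startswith("//") and not stripped.startswith("///")
```

This mirrors the two patterns `^///` and `^\s*//[^/]`, except that an indented line consisting of only `//` also counts as a comment here.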

AlDanial added a commit that referenced this issue Dec 20, 2024
@AlDanial
Owner

Give 0563ada a try. Many corner cases seem possible; I didn't test them all.

@bbrk24
Author

bbrk24 commented Dec 20, 2024

    if ($coffeeComment) {
        @step_4 = remove_matches($ra_lines, '^\s*#');
    } else {
        @step_1 = call_regexp_common($ra_lines, 'C');
        @step_2 = remove_matches(        \@step_1, '^///');
        @step_3 = remove_matches(        \@step_2, '^\s*//[^/]');
        @step_4 = remove_between_general(\@step_3, '###', '###');
    }

This looks like it'll only detect ### ... ### outside of coffeeComment mode, correct? ### comments are available in both modes. However, outside of coffeeComment mode, /* ... */ is also a valid block comment. If interleaved, whichever comes first wins, so /* ### */ is a comment containing ###, and ### /* ### is a comment containing /*. (### */ ### is also a valid block comment containing */.)

@AlDanial
Owner

Right, I added ### ... ### block comment handling to coffeeComment mode. A pair of good test files, one for each mode, would go a long way to shake out the logic. I doubt my files tests/inputs/parser_[12].civet are sufficient.

@bbrk24
Author

bbrk24 commented Dec 26, 2024

Here are some edge cases for block comments. All of the following files should parse successfully; all instances of foo are commented out, and all instances of bar are not commented:

###
foo
/*
foo
###
bar

bar
###
bar

/*
foo
###
foo
*/
bar
###
foo
###
bar

###
foo
/*
foo
###
bar
###
foo
*/
foo
###
bar

/*
foo
/*
foo
*/
bar

These will all also behave the same if you replace any line break with a space. However, if you remove some whitespace, things change:

bar###bar###
bar ###foo###
bar/*foo*/
bar###bar/*foo###foo*/

@AlDanial
Owner

The 3rd and 4th examples,

/*
foo
###
foo
*/
bar
###
foo
###
bar

and

###
foo
/*
foo
###
bar
###
foo
*/
foo
###
bar

contradict each other: if ### pairs are handled first we end up with

/*
foo
foo
###
bar

for the 3rd example, but if /* ... */ pairs are removed first, the 4th example loses its middle bar.
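The conflict can be reproduced with two sequential regex passes. This is a Python sketch, not cloc's Perl; the helper names are hypothetical, and the two strings are the 3rd and 4th examples quoted above.

```python
import re

HASH_BLOCK = re.compile(r"###.*?###", re.DOTALL)   # ### ... ###
SLASH_BLOCK = re.compile(r"/\*.*?\*/", re.DOTALL)  # /* ... */


def hash_first(src: str) -> str:
    """Remove ### pairs, then /* ... */ pairs."""
    return SLASH_BLOCK.sub("", HASH_BLOCK.sub("", src))


def slash_first(src: str) -> str:
    """Remove /* ... */ pairs, then ### pairs."""
    return HASH_BLOCK.sub("", SLASH_BLOCK.sub("", src))


EXAMPLE_3 = "/*\nfoo\n###\nfoo\n*/\nbar\n###\nfoo\n###\nbar\n"
EXAMPLE_4 = "###\nfoo\n/*\nfoo\n###\nbar\n###\nfoo\n*/\nfoo\n###\nbar\n"

# Expected survivors in both examples: ['bar', 'bar']
print(hash_first(EXAMPLE_3).split())   # ['/*', 'foo', 'foo', '###', 'bar']
print(slash_first(EXAMPLE_3).split())  # ['bar', 'bar']
print(hash_first(EXAMPLE_4).split())   # ['bar', 'bar']
print(slash_first(EXAMPLE_4).split())  # ['bar']
```

Each ordering gets one example right and the other wrong, so no fixed ordering of the two passes handles both files.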

@bbrk24
Author

bbrk24 commented Dec 28, 2024

Yep, they aren't handled one after the other, they're handled simultaneously in source order.
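In practice, "simultaneously in source order" amounts to one left-to-right scan in which whichever opener appears first claims everything up to its own closer. Below is a minimal Python sketch of that idea, not Civet's actual lexer. The name `strip_block_comments` is hypothetical. Three points are assumptions inferred from the examples in this thread: an opening ### must sit at the start of the input or after whitespace, it must not be followed by a fourth #, and an unterminated /* is left as code. String literals, regex literals, and line comments are ignored.

```python
def strip_block_comments(src: str, coffee_comment: bool = False) -> str:
    """Remove /* ... */ and ### ... ### comments in a single pass.

    In coffeeComment mode, /* ... */ is not a comment style, so only
    ### ... ### is recognized.
    """
    out = []
    i = 0
    while i < len(src):
        if not coffee_comment and src.startswith("/*", i):
            end = src.find("*/", i + 2)
            if end != -1:
                i = end + 2  # skip the comment, including any ### inside it
                continue
        elif (
            src.startswith("###", i)
            and (i == 0 or src[i - 1].isspace())  # assumption: needs a boundary
            and src[i + 3:i + 4] != "#"           # ###### is not a comment
        ):
            end = src.find("###", i + 3)
            if end != -1:  # an unpaired ### is ordinary code
                i = end + 3  # skip the comment, including any /* inside it
                continue
        out.append(src[i])
        i += 1
    return "".join(out)
```

On the one-line cases from earlier in the thread, this leaves `bar###bar###` untouched, reduces `bar/*foo*/` to `bar`, and reduces `bar###bar/*foo###foo*/` to `bar###bar`.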

@AlDanial
Owner

"Simultaneously in source order..." that's a neat trick, but not one I'm going to try to implement. I'm hoping the existing implementation meets the "good enough" bar so I can close this out.
