error: unpaired UTF-8 bidirectional control characters detected #51

Closed
20urc3 opened this issue Sep 12, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@20urc3
Contributor

20urc3 commented Sep 12, 2024

It seems we have the same issue with encoding UTF-8 characters in f1_c_fuzz.c (when trying to compile a JS grammar).

f1_c_fuzz.c: In function ‘gen_node_NAUGHTYSTRING’:
f1_c_fuzz.c:13366:53: error: unpaired UTF-8 bidirectional control characters detected [-Werror=bidi-chars=]
13366 |     subnode = node_create_with_val(NODE_TERM__, "<U+202B>test<U+202B>", 6);
      |                                                  ~~~~~~~~    ~~~~~~~~
      |                                                  |           |       |
      |                                                  |           |       end of bidirectional context
      |                                                  |           U+202B (RIGHT-TO-LEFT EMBEDDING)
      |                                                  U+202B (RIGHT-TO-LEFT EMBEDDING)
f1_c_fuzz.c:13376:53: error: unpaired UTF-8 bidirectional control characters detected [-Werror=bidi-chars=]
13376 |     subnode = node_create_with_val(NODE_TERM__, "<U+2066>test<U+2067>", 6);
      |                                                  ~~~~~~~~    ~~~~~~~~
      |                                                  |           |       |
      |                                                  |           |       end of bidirectional context
      |                                                  |           U+2067 (RIGHT-TO-LEFT ISOLATE)
      |                                                  U+2066 (LEFT-TO-RIGHT ISOLATE)
f1_c_fuzz.c:13416:53: error: unpaired UTF-8 bidirectional control characters detected [-Werror=bidi-chars=]
13416 |     subnode = node_create_with_val(NODE_TERM__, "<U+202A><U+202A>test<U+202A>", 7);
      |                                                  ~~~~~~~~~~~~~~~~    ~~~~~~~~
      |                                                  |       |           |       |
      |                                                  |       |           |       end of bidirectional context
      |                                                  |       |           U+202A (LEFT-TO-RIGHT EMBEDDING)
      |                                                  |       U+202A (LEFT-TO-RIGHT EMBEDDING)
      |                                                  U+202A (LEFT-TO-RIGHT EMBEDDING)
f1_c_fuzz.c:13556:57: error: unpaired UTF-8 bidirectional control character detected [-Werror=bidi-chars=]
13556 |     subnode = node_create_with_val(NODE_TERM__, "test<U+2060>test<U+202B>", 10);
      |                                                                  ~~~~~~~~
      |                                                                  |       |
      |                                                                  |       end of bidirectional context
      |                                                                  U+202B (RIGHT-TO-LEFT EMBEDDING)
f1_c_fuzz.c:14116:59: error: unpaired UTF-8 bidirectional control characters detected [-Werror=bidi-chars=]
14116 |   <U+E006E><U+E006F><U+E0070><U+E0071><U+E0072><U+E0073><U+E0074><U+E0075><U+E0076><U+E0077><U+E0078><U+E0079><U+E007A><U+E007B><U+E007C><U+E007D><U+E007E><U+E007F>", 150);
      |
      |                                                                                                                                                                        |
      |                                                                                                                                                                        end of bidirectional context

cc1: all warnings being treated as errors
@h1994st
Collaborator

h1994st commented Sep 12, 2024

Hi @20urc3, would you mind attaching the JS grammar file so that I can test it locally?

@h1994st h1994st added the bug Something isn't working label Sep 12, 2024
@20urc3
Contributor Author

20urc3 commented Sep 12, 2024

js.json

@20urc3
Contributor Author

20urc3 commented Sep 12, 2024

This was generated with the Nautilus-to-JSON Python script. Note that the default javascript.json produces the same error.

@20urc3
Contributor Author

20urc3 commented Sep 12, 2024

To resolve these issues, you have a few options:

1. Pair the bidirectional control characters properly. For example, change:
   - "<U+202B>test<U+202B>" to "<U+202B>test<U+202C>" (U+202C is PDF, POP DIRECTIONAL FORMATTING)
   - "<U+2066>test<U+2067>" to "<U+2066>test<U+2069>" (U+2069 is PDI, POP DIRECTIONAL ISOLATE)
2. If these unpaired characters are intentional for testing purposes, disable the specific warning by adding a compiler flag: -Wno-error=bidi-chars (the warning is still printed but no longer fails the build) or -Wno-bidi-chars (the warning is turned off entirely).
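To make "paired" a bit more concrete, here is a rough sketch of a balance check over a UTF-8 string. It is a simplification I wrote for illustration, not GCC's actual algorithm and not code from Grammar-Mutator; the names `next_code_point` and `bidi_controls_balanced` and the depth limit of 64 are my own assumptions. It treats LRE/RLE/LRO/RLO as closed by PDF (U+202C) and LRI/RLI/FSI as closed by PDI (U+2069).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Decodes one UTF-8 sequence starting at s[*i] and advances *i.
 * Assumes well-formed input; a stray continuation byte is returned as-is. */
static uint32_t next_code_point(const unsigned char *s, size_t len, size_t *i) {
    uint32_t c = s[(*i)++];
    int extra = c >= 0xF0 ? 3 : c >= 0xE0 ? 2 : c >= 0xC0 ? 1 : 0;
    if (extra == 0) return c;
    c &= 0x3Fu >> extra; /* keep only the payload bits of the lead byte */
    while (extra-- > 0 && *i < len) c = (c << 6) | (s[(*i)++] & 0x3Fu);
    return c;
}

/* Hypothetical helper: returns true if every embedding/override/isolate
 * opened in the string is closed again before the string ends. */
bool bidi_controls_balanced(const char *utf8, size_t len) {
    enum { MAX_DEPTH = 64 }; /* arbitrary limit chosen for this sketch */
    char stack[MAX_DEPTH];
    size_t depth = 0;
    const unsigned char *s = (const unsigned char *)utf8;

    for (size_t i = 0; i < len;) {
        uint32_t cp = next_code_point(s, len, &i);
        bool embed = cp == 0x202A || cp == 0x202B || cp == 0x202D || cp == 0x202E;
        bool isolate = cp >= 0x2066 && cp <= 0x2068;
        if (embed || isolate) {
            if (depth == MAX_DEPTH) return false; /* too deep for this sketch */
            stack[depth++] = embed ? 'E' : 'I';
        } else if (cp == 0x202C) {
            /* PDF closes the innermost embedding, if that is what is open. */
            if (depth > 0 && stack[depth - 1] == 'E') depth--;
        } else if (cp == 0x2069) {
            /* PDI closes the innermost isolate and any embeddings inside it. */
            size_t d = depth;
            while (d > 0 && stack[d - 1] != 'I') d--;
            if (d > 0) depth = d - 1;
        }
    }
    return depth == 0;
}
```

With this sketch, the byte string for "<U+202B>test<U+202B>" is reported as unbalanced, while "<U+202B>test<U+202C>" is reported as balanced. Stray PDF or PDI characters with nothing to close are simply ignored.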

I just added the compiler flag and I'm now able to compile libgrammarmutator-js.so.
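Another route that would keep the unpaired characters in the test data without touching warning flags would be to have the generated C file spell non-ASCII bytes as escapes, so the source itself contains only ASCII. This is only a sketch of that idea and not something from this thread or from the generator; `escape_non_ascii` is a hypothetical helper name. It uses three-digit octal escapes rather than `\x` escapes, because a `\x` escape would swallow any hex-digit characters that follow it.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of Grammar-Mutator: writes `src` into `dst`
 * as the body of a C string literal. Control and non-ASCII bytes become
 * three-digit octal escapes; '"' and '\\' get a backslash prefix.
 * Returns the number of characters written (excluding the terminating NUL),
 * or -1 if `dst` is too small. */
int escape_non_ascii(const char *src, char *dst, size_t dst_size) {
    size_t out = 0;
    if (dst_size == 0) return -1;

    for (const unsigned char *p = (const unsigned char *)src; *p != '\0'; ++p) {
        unsigned char c = *p;
        if (c < 0x20 || c >= 0x7F) {
            /* Need 4 characters plus room for the final NUL. */
            if (out + 4 >= dst_size) return -1;
            snprintf(dst + out, 5, "\\%03o", (unsigned)c);
            out += 4;
        } else {
            size_t need = (c == '"' || c == '\\') ? 2 : 1;
            if (out + need >= dst_size) return -1;
            if (need == 2) dst[out++] = '\\';
            dst[out++] = (char)c;
        }
    }
    dst[out] = '\0';
    return (int)out;
}
```

For example, the bytes of "<U+202B>test<U+202B>" would come out as `\342\200\253test\342\200\253`, which the compiler turns back into the same bytes at build time.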

@20urc3 20urc3 mentioned this issue Sep 15, 2024
@20urc3
Contributor Author

20urc3 commented Sep 15, 2024

Fixed with #52

@20urc3 20urc3 closed this as completed Sep 15, 2024
@github-project-automation github-project-automation bot moved this from To do to Done in Grammar mutator Sep 15, 2024