-
-
Notifications
You must be signed in to change notification settings - Fork 398
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Protect included code snippets from "recursive" highlighting #1441
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for contributing! I added a few inline comments; in short, I'm all for this fix but there are a few blockers in the current approach in the form of a breaking change and regression.
Those would need to be resolved before getting this through.
@@ -637,20 +637,25 @@ def parse_lang_for_codeblock(source) | |||
# @param [String] html the html to search for code in | |||
# @return [String] highlighted html | |||
# @see #html_syntax_highlight | |||
def parse_codeblocks(html) | |||
def parse_codeblocks(html, internal=false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit pick here but this should be styled as internal = false
with spaces.
@@ -54,7 +54,7 @@ def urlencode(text) | |||
# @param [Symbol] markup examples are +:markdown+, +:textile+, +:rdoc+. | |||
# To add a custom markup type, see {MarkupHelper} | |||
# @return [String] the HTML | |||
def htmlify(text, markup = options.markup) | |||
def htmlify(text, markup = options.markup, internal: false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately using keyword args limits us to Ruby 3.x and would be a breaking change that could not be merged.
The good news is this could easily be changed to a positional arg.
language ||= object.source_type | ||
|
||
if options.highlight | ||
pre_attrs = Array %(class="code #{language}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an odd syntax. Did you simply mean pre_attrs = %(class="code #{language}")
?
@@ -39,7 +39,7 @@ def html_syntax_highlight_ruby_ripper(source) | |||
end | |||
output | |||
rescue Parser::ParserSyntaxError | |||
source =~ /^<span\s+class=/ ? source : h(source) | |||
h(source) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a subtle but problematic change that I think will cause a regression. This change was explicitly added to permit pre-formatted HTML extra-files to be added via: yard - OTHER.html
where OTHER.html might include:
<pre class="code ruby"><code class="ruby"><span class='kw'>return</span></code></pre>
The output of this section would generate double encoded output if you ran it through this block:
If removing this HTML detection causes a failure in your implementation, you may have to rethink part of your approach.
8eab01f
to
9273d42
Compare
9273d42
to
5d3e04f
Compare
5d3e04f
to
a9babdb
Compare
Thanks for taking a look! I fixed the syntax and style issues you mentioned, and restored the bypass behavior for pre-formatted blocks. I'd misinterpreted what that check was for and thought it was to account for the same recursive/nested case that this branch considers. I brought the check back, but lifted it out of the rescue block and into and into yard/lib/yard/templates/helpers/html_helper.rb Lines 206 to 208 in a9babdb
yard/spec/templates/helpers/html_helper_spec.rb Lines 675 to 679 in a9babdb
Does this look alright? It seems slightly better to not feed non-ruby input to the parser and catch the exception, but the rest of my changes should be compatible either way. |
Description
This PR aims to fix an issue with how yard handles syntax highlighted snippets from
{include:file:...}
source.When an included file contains code blocks, they'll be fed through #parse_codeblocks more than once, as the file and then the parent are processed depth first. After the first highlighting pass, the code block contains additional
<span>
s and escapes (<
etc), and later passes are unable to tell that these weren't part of the original source.yard's built-in ruby highlighter does work in these cases -- its tokenizer tends to raise an exception when the source looks like HTML instead of Ruby, and by inspecting the source it can tell that the block was already highlighted:
Plugins, on the other hand, are unaware that the code snippets they're given in some cases are already marked up. This leads to the final document showing a mess of escaped html within the
<pre>
.[1]In this branch, data-highlighted=true is added to code snippets when they're processed through
{include:...}
, allowing them to be skipped during subsequent passes through #parse_codeblocks. The attribute is then removed by the outermost #htmlify call, so it's only an internal annotation and shouldn't make it into the final output (unless a theme calls #insert_include directly).[1] This can be seen in the output of
Completed Tasks
bundle exec rake
locally (if code is attached to PR).