parsing_block
The block rules are responsible for parsing Markdown syntax encompassing a full or several full lines at once and emitting the tokens required to represent these markdown blocks. For example, block code, blockquotes, headers, hr, etc.
A block rule is a function expecting the following arguments:
- state: an instance of StateBlock
- startLine: the index of the current line
- endLine: the index of the last available line
- checkMode: a flag indicating whether we should simply check if the current line marks the beginning of the syntax we are trying to parse
Both startLine and endLine are line indices; they are used to look up the per-line information stored in state.
The checkMode flag is there for optimization: if it is set to true, we can return true as soon as we are sure the present line marks the beginning of a block we are trying to parse. For example, in the case of a fenced code block, the check would be: is the current line some indentation, the "```" marker and, optionally, a language name?
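The check-mode shortcut described above can be sketched as follows. This is a minimal illustration, not the library's fence rule: makeMockState is a hypothetical stand-in that fills in only src, bMarks, eMarks and tShift, and the non-check path is deliberately left unimplemented.

```javascript
// Three backticks, built at runtime so the example is easy to embed in Markdown.
const FENCE = '`'.repeat(3);

// Hypothetical stand-in for StateBlock: computes only the per-line data used below.
function makeMockState(src) {
  const state = { src, bMarks: [], eMarks: [], tShift: [] };
  let start = 0;
  for (const line of src.split('\n')) {
    state.bMarks.push(start);                                  // line start in src
    state.eMarks.push(start + line.length);                    // line end in src
    state.tShift.push(line.length - line.trimStart().length);  // leading blanks
    start += line.length + 1;                                  // skip the line feed
  }
  return state;
}

// Sketch of a rule that only implements the check-mode path.
function fenceRule(state, startLine, endLine, checkMode) {
  const pos = state.bMarks[startLine] + state.tShift[startLine];
  const max = state.eMarks[startLine];
  if (!state.src.slice(pos, max).startsWith(FENCE)) { return false; }
  if (checkMode) { return true; } // cheap answer: yes, a fence starts here
  // Full parsing is left out of this sketch; returning false means
  // "not handled" and leaves state untouched.
  return false;
}
```

In check mode the rule answers after inspecting a single line, without searching for the closing marker or emitting any token.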
The StateBlock prototype is defined in state_block.js, and its data consists of:
- src: the complete string the parser is currently working on
- parser: the current block parser (here to make nested calls easier)
- env: a namespaced key-value data store that allows core rules to exchange data
- tokens: the tokens generated by the parser up to now; you emit new tokens by calling push(newToken) on this
- bMarks: a collection marking, for each line, the position of its start in src
- eMarks: a collection marking, for each line, the position of its end in src
- tShift: a collection marking, for each line, how many spaces were used to indent it
- blkIndent: how many spaces of indentation were required by the parent block
- level: the nesting level of the current block
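To illustrate how the per-line collections fit together, here is a small sketch that extracts the unindented text of one line. The lineContent helper and the hand-built state object are hypothetical; the sketch assumes tShift counts characters from the start of the line, as the description above suggests.

```javascript
// Returns the text of a line without its leading indentation.
function lineContent(state, line) {
  const start = state.bMarks[line] + state.tShift[line]; // first non-blank character
  const end = state.eMarks[line];                        // position of the line end
  return state.src.slice(start, end);
}

// Hand-built stand-in for a StateBlock holding two lines: "abc" and "  de".
const state = {
  src: 'abc\n  de',
  bMarks: [0, 4],
  eMarks: [3, 8],
  tShift: [0, 2],
};

console.log(lineContent(state, 0)); // abc
console.log(lineContent(state, 1)); // de
```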
The most important methods are:
- isEmpty(line): checks whether the line at index line is empty (or consists solely of blank space)
- skipEmptyLines(from): returns the index of the first non-empty line after from
- skipSpaces(pos): returns the next non-blank position at or after pos
- skipChars(pos, code): returns the next position, at or after pos, of a character different from code
- skipCharsBack(pos, code, min): returns the previous position, at or before pos but after min, of a character different from code
- getLines(begin, end, indent, keepLastLF): returns the text content of the block of lines from begin (included) to end (excluded). For each line, the initial blank spaces will be skipped (up to indent blank spaces per line). If keepLastLF is set to true, the last character of the excerpt will be a line feed.
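As an illustration of how the forward skipping helpers are typically combined, the sketch below measures the length of a block marker. These are stand-alone re-implementations written for this example, not the library's code: they take the source string as an explicit first argument, and they assume code is a character code as returned by charCodeAt.

```javascript
// Returns the first position at or after pos whose character differs from code.
function skipChars(src, pos, code) {
  while (pos < src.length && src.charCodeAt(pos) === code) { pos++; }
  return pos;
}

// Returns the first non-space position at or after pos.
function skipSpaces(src, pos) {
  return skipChars(src, pos, 0x20 /* space */);
}

const line = '   ~~~~ ruby';
const markerStart = skipSpaces(line, 0);                           // skip the indentation
const markerEnd = skipChars(line, markerStart, 0x7E /* tilde */);  // skip the marker run
const markerLength = markerEnd - markerStart;

console.log(markerStart, markerEnd, markerLength); // 3 7 4
```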
If checkMode is set to true, simply return a boolean indicating whether the current line should be considered as the first line of your block. Otherwise, proceed with the complete parsing.
NB: It is your responsibility to make sure you have not reached the maximum nesting level allowed, by comparing state.level and state.options.maxNesting.
NB: If, for any reason, the block you are trying to parse is incorrectly formatted and you are unable to parse it, you must abort and return false without modifying state in any way.
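One way to honor both notes is to validate everything first and mutate state only once the block is known to be well formed. The sketch below assumes a hypothetical ":::" container syntax and a mock state whose lines array stands in for the real per-line data; the 'box' token shape is also an assumption made for illustration.

```javascript
function strictBoxRule(state, startLine, endLine, checkMode) {
  const isMarker = (i) => state.lines[i].trim() === ':::';

  // Nesting guard: refuse to open a block once the limit is reached.
  if (state.level >= state.options.maxNesting) { return false; }
  if (!isMarker(startLine)) { return false; }
  if (checkMode) { return true; }

  // Look for the closing marker before touching state.
  let closeLine = -1;
  for (let i = startLine + 1; i <= endLine; i++) {
    if (isMarker(i)) { closeLine = i; break; }
  }
  if (closeLine === -1) { return false; } // malformed: abort, nothing was modified

  // Only now is it safe to emit tokens and advance.
  state.tokens.push({ type: 'box', lines: [startLine, closeLine], level: state.level });
  state.line = closeLine + 1;
  return true;
}
```

Because no token is pushed and state.line is not changed until the closing marker has been found, a failed attempt leaves state exactly as the next rule expects to find it.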
To completely parse a block, you will need to emit additional tokens in state.tokens.
Once you are sure the current line marks the beginning of one of "your" blocks, you should push an open tag token corresponding to the beginning of your block. You will also need to find its end. Use the state methods to help you with this.
Your next decision should be whether you wish to allow other blocks to be nested in your content or not. If you do, you will need to invoke state.parser.tokenize(state, startLine, endLine, true), where state has been updated to allow the next batch of rules to run smoothly, and startLine and endLine are, respectively, the first and last line of content of your block.
If you do not wish other blocks to be nested, simply push a new inline token with the content of the block. You could use getLines(begin, end, indent, keepLastLF) to help you with that.
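The non-nested case might look like the sketch below. The mock state and its simplified getLines are stand-ins written for this example, and the fields on the inline token (content, level, children) are assumptions about the token shape rather than a documented contract.

```javascript
// Hypothetical stand-in for StateBlock with a simplified getLines.
function makeState(src) {
  const lines = src.split('\n');
  return {
    tokens: [],
    level: 0,
    getLines(begin, end, indent, keepLastLF) {
      const text = lines.slice(begin, end).map((l) => {
        let i = 0;
        while (i < indent && i < l.length && l[i] === ' ') { i++; } // strip up to indent blanks
        return l.slice(i);
      }).join('\n');
      return keepLastLF ? text + '\n' : text;
    },
  };
}

// Pushes the raw content of lines [begin, end) as a single inline token.
function pushLeafContent(state, begin, end, indent) {
  state.tokens.push({
    type: 'inline',
    content: state.getLines(begin, end, indent, false),
    level: state.level + 1,
    children: [],
  });
}

const leafState = makeState('::: note\n  hello\n  world\n:::');
pushLeafContent(leafState, 1, 3, 2);
console.log(JSON.stringify(leafState.tokens[0].content)); // "hello\nworld"
```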
If your block needs to be divided further, you may push whatever combination of intervening tokens you deem necessary.
The last token you will need to emit is the end tag token of your block.
Finally, you will need to update state to reflect that the part of src covering your block has been taken care of. This means setting state.line to the index of the first line following your block, and then returning true.
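Putting the steps together, a complete rule for a block that allows nesting could be sketched as follows. Everything outside the rule's signature is an assumption made for illustration: the ":::" syntax, the box_open and box_close token names, the lines array on the mock state, and the mock parser that simply emits one paragraph token per line.

```javascript
// Hypothetical mock: `lines` and the simplistic `parser` are stand-ins.
function makeMockState(lines) {
  return {
    lines,
    tokens: [],
    line: 0,
    level: 0,
    options: { maxNesting: 20 },
    parser: {
      // Stand-in for the block parser: one paragraph token per content line.
      tokenize(st, startLine, endLine) {
        for (let i = startLine; i <= endLine; i++) {
          st.tokens.push({ type: 'paragraph', content: st.lines[i], level: st.level });
        }
      },
    },
  };
}

function boxRule(state, startLine, endLine, checkMode) {
  const isMarker = (i) => state.lines[i].trim() === ':::';

  if (state.level >= state.options.maxNesting) { return false; }
  if (!isMarker(startLine)) { return false; }
  if (checkMode) { return true; }

  // Find the end of the block before modifying anything.
  let closeLine = startLine + 1;
  while (closeLine <= endLine && !isMarker(closeLine)) { closeLine++; }
  if (closeLine > endLine) { return false; }

  // Open tag, nested content, close tag.
  state.tokens.push({ type: 'box_open', level: state.level++ });
  if (closeLine - 1 >= startLine + 1) {
    state.parser.tokenize(state, startLine + 1, closeLine - 1, true);
  }
  state.tokens.push({ type: 'box_close', level: --state.level });

  // Mark the block as consumed.
  state.line = closeLine + 1;
  return true;
}

const demo = makeMockState([':::', 'a', 'b', ':::', 'after']);
boxRule(demo, 0, 4, false);
console.log(demo.tokens.map((t) => t.type).join(' ')); // box_open paragraph paragraph box_close
console.log(demo.line);                                // 4
```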