Commit

Fix misbehaviour of pcre2_match() and pcre2_dfa_match() when PCRE2_FIRSTLINE was set for an anchored pattern.
PhilipHazel committed Nov 11, 2023
1 parent 88b1c47 commit 52041d8
Showing 11 changed files with 78 additions and 13 deletions.
4 changes: 4 additions & 0 deletions ChangeLog
@@ -142,6 +142,10 @@ above because \b and \B are defined in terms of \w.
option, and (?aP) also sets (?aT) so that (?-aP) disables all ASCII
restrictions on POSIX classes.

37. If PCRE2_FIRSTLINE was set on an anchored pattern, pcre2_match() and
pcre2_dfa_match() misbehaved. PCRE2_FIRSTLINE is now ignored for anchored
patterns.


Version 10.42 11-December-2022
------------------------------
6 changes: 3 additions & 3 deletions doc/html/pcre2api.html
@@ -1686,7 +1686,7 @@ <h1>pcre2api man page</h1>
PCRE2_USE_OFFSET_LIMIT, which provides a more general limiting facility. If
PCRE2_FIRSTLINE is set with an offset limit, a match must occur in the first
line and also within the offset limit. In other words, whichever limit comes
-first is used.
+first is used. This option has no effect for anchored patterns.
<pre>
PCRE2_LITERAL
</pre>
@@ -2021,7 +2021,7 @@ <h1>pcre2api man page</h1>
</pre>
This option forces all the POSIX character classes, including [:digit:] and
[:xdigit:], to match only ASCII characters, even when PCRE2_UCP is set. It can
-be changed within a pattern by means of the (?aP) option setting, but note that
+be changed within a pattern by means of the (?aP) option setting, but note that
this also sets PCRE2_EXTRA_ASCII_DIGIT in order to ensure that (?-aP) unsets
all ASCII restrictions for POSIX classes.
<pre>
@@ -4140,7 +4140,7 @@ <h1>pcre2api man page</h1>
</P>
<br><a name="SEC43" href="#TOC1">REVISION</a><br>
<P>
-Last updated: 12 October 2023
+Last updated: 11 November 2023
<br>
Copyright &copy; 1997-2023 University of Cambridge.
<br>
7 changes: 4 additions & 3 deletions doc/pcre2.txt
@@ -1653,7 +1653,8 @@ COMPILING A PATTERN
greater than 3. See also PCRE2_USE_OFFSET_LIMIT, which provides a more
general limiting facility. If PCRE2_FIRSTLINE is set with an offset
limit, a match must occur in the first line and also within the offset
-limit. In other words, whichever limit comes first is used.
+limit. In other words, whichever limit comes first is used. This option
+has no effect for anchored patterns.

PCRE2_LITERAL

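The "whichever limit comes first" rule in the paragraph above can be sketched in C. This is an illustrative model only, not PCRE2 source: `demo_start_limit` and `DEMO_UNSET` are hypothetical names, `"\n"` is assumed to be the only newline sequence, and the sketch assumes that a first-line match may begin at the newline itself as well as before it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sentinel meaning "no offset limit". It stands in for the
   library's unset value and is not taken from pcre2.h. */
#define DEMO_UNSET ((size_t)-1)

/* Returns the largest offset at which an unanchored match may start.
   Assumption: "\n" is the only newline, and a match may start at it. */
static size_t demo_start_limit(const char *subject, size_t length,
                               int firstline, size_t offset_limit)
{
  size_t limit = length;
  assert(subject != NULL);

  /* Apply the offset limit first, if one was set. */
  if (offset_limit != DEMO_UNSET && offset_limit < limit)
    limit = offset_limit;

  /* A newline before the current limit tightens it further. */
  if (firstline) {
    for (size_t i = 0; i < limit; i++) {
      if (subject[i] == '\n') { limit = i; break; }
    }
  }
  return limit;
}
```

For the subject `abc\ndef`, this model gives a limit of 3 with only the first-line restriction and 2 when an offset limit of 2 is also set, so the smaller bound wins in both cases.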
@@ -3975,11 +3976,11 @@ AUTHOR

REVISION

-Last updated: 12 October 2023
+Last updated: 11 November 2023
Copyright (c) 1997-2023 University of Cambridge.


-PCRE2 10.43 12 October 2023 PCRE2API(3)
+PCRE2 10.43 11 November 2023 PCRE2API(3)
------------------------------------------------------------------------------


8 changes: 4 additions & 4 deletions doc/pcre2api.3
@@ -1,4 +1,4 @@
-.TH PCRE2API 3 "12 October 2023" "PCRE2 10.43"
+.TH PCRE2API 3 "11 November 2023" "PCRE2 10.43"
.SH NAME
PCRE2 - Perl-compatible regular expressions (revised API)
.sp
@@ -1628,7 +1628,7 @@ PCRE2_FIRSTLINE if \fIstartoffset\fP is greater than 3. See also
PCRE2_USE_OFFSET_LIMIT, which provides a more general limiting facility. If
PCRE2_FIRSTLINE is set with an offset limit, a match must occur in the first
line and also within the offset limit. In other words, whichever limit comes
-first is used.
+first is used. This option has no effect for anchored patterns.
.sp
PCRE2_LITERAL
.sp
@@ -1979,7 +1979,7 @@ a pattern by means of the (?aT) option setting.
.sp
This option forces all the POSIX character classes, including [:digit:] and
[:xdigit:], to match only ASCII characters, even when PCRE2_UCP is set. It can
-be changed within a pattern by means of the (?aP) option setting, but note that
+be changed within a pattern by means of the (?aP) option setting, but note that
this also sets PCRE2_EXTRA_ASCII_DIGIT in order to ensure that (?-aP) unsets
all ASCII restrictions for POSIX classes.
.sp
@@ -4148,6 +4148,6 @@ Cambridge, England.
.rs
.sp
.nf
-Last updated: 12 October 2023
+Last updated: 11 November 2023
Copyright (c) 1997-2023 University of Cambridge.
.fi
2 changes: 1 addition & 1 deletion doc/pcre2demo.3
@@ -1,4 +1,4 @@
-.TH PCRE2DEMO 3 "12 October 2023" "PCRE2 10.43-DEV"
+.TH PCRE2DEMO 3 "11 November 2023" "PCRE2 10.43-DEV"
.\"AUTOMATICALLY GENERATED BY PrepareRelease - do not EDIT!
.SH NAME
// - A demonstration C program for PCRE2 - //
2 changes: 1 addition & 1 deletion src/pcre2_dfa_match.c
@@ -3443,7 +3443,7 @@ anchored = (options & (PCRE2_ANCHORED|PCRE2_DFA_RESTART)) != 0 ||
where to start. */

startline = (re->flags & PCRE2_STARTLINE) != 0;
-firstline = (re->overall_options & PCRE2_FIRSTLINE) != 0;
+firstline = !anchored && (re->overall_options & PCRE2_FIRSTLINE) != 0;
bumpalong_limit = end_subject;

/* Initialize and set up the fixed fields in the callout block, with a pointer
2 changes: 1 addition & 1 deletion src/pcre2_match.c
@@ -6836,7 +6836,7 @@ if (mcontext == NULL)
else mb->memctl = mcontext->memctl;

anchored = ((re->overall_options | options) & PCRE2_ANCHORED) != 0;
-firstline = (re->overall_options & PCRE2_FIRSTLINE) != 0;
+firstline = !anchored && (re->overall_options & PCRE2_FIRSTLINE) != 0;
startline = (re->flags & PCRE2_STARTLINE) != 0;
bumpalong_limit = (mcontext->offset_limit == PCRE2_UNSET)?
true_end_subject : subject + mcontext->offset_limit;
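The effect of the one-line change in the pcre2_match.c hunk above can be modelled in a few lines of standalone C. This is a sketch under stated assumptions, not library code: `DEMO_ANCHORED`, `DEMO_FIRSTLINE`, and `demo_firstline_active` are hypothetical names, and the bit values are chosen for illustration rather than copied from pcre2.h. It mirrors only the pcre2_match() expressions shown in the hunk, where anchoring can come from either the compiled pattern or the match-time options.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical option bits for illustration only. These are NOT the
   values defined in pcre2.h. */
#define DEMO_ANCHORED  0x1u
#define DEMO_FIRSTLINE 0x2u

static_assert((DEMO_ANCHORED & DEMO_FIRSTLINE) == 0,
              "demo option bits must not overlap");

/* Models the patched logic: the first-line restriction is honoured only
   when the match is not anchored. Anchoring may be requested at compile
   time or at match time; FIRSTLINE is read from the compile-time options. */
static bool demo_firstline_active(uint32_t compile_options,
                                  uint32_t match_options)
{
  bool anchored = ((compile_options | match_options) & DEMO_ANCHORED) != 0;
  return !anchored && (compile_options & DEMO_FIRSTLINE) != 0;
}
```

In this model, `demo_firstline_active(DEMO_FIRSTLINE, 0)` is true, while adding `DEMO_ANCHORED` on either side makes it false, which corresponds to the ChangeLog statement that PCRE2_FIRSTLINE is now ignored for anchored patterns.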
13 changes: 13 additions & 0 deletions testdata/testinput2
@@ -6028,4 +6028,17 @@ a)"xI

# --------

/
/anchored, firstline
\x0a

/
/anchored,firstline,no_start_optimize
\x0a

/
/firstline
\x0a
abc\x0adef

# End of testinput2
13 changes: 13 additions & 0 deletions testdata/testinput6
@@ -5026,4 +5026,17 @@
/c*+/
ab\=ph,offset=2

/
/anchored, firstline
\x0a

/
/anchored,firstline,no_start_optimize
\x0a

/
/firstline
\x0a
abc\x0adef

# End of testinput6
17 changes: 17 additions & 0 deletions testdata/testoutput2
@@ -17901,6 +17901,23 @@ No match

# --------

/
/anchored, firstline
\x0a
0: \x0a

/
/anchored,firstline,no_start_optimize
\x0a
0: \x0a

/
/firstline
\x0a
0: \x0a
abc\x0adef
0: \x0a

# End of testinput2
Error -70: PCRE2_ERROR_BADDATA (unknown error number)
Error -62: bad serialized data
17 changes: 17 additions & 0 deletions testdata/testoutput6
@@ -7895,4 +7895,21 @@ Partial match:
ab\=ph,offset=2
Partial match:

/
/anchored, firstline
\x0a
0: \x0a

/
/anchored,firstline,no_start_optimize
\x0a
0: \x0a

/
/firstline
\x0a
0: \x0a
abc\x0adef
0: \x0a

# End of testinput6
