pcre2_compile: avoid 1 byte buffer overread parsing VERBs
As reported recently by ef218fb (Guard against out-of-bounds memory
access when parsing LIMIT_HEAP et al (PCRE2Project#463), 2024-09-07), a malformed
pattern could result in reading 1 byte past its end.

Fix a similar issue that affects all VERBs and add test cases to
ensure the original bug and all its siblings are no longer an issue.

While at it, fix the wording of the related documentation.
carenas committed Sep 22, 2024
1 parent cd4c0e3 commit bd0080e
Showing 5 changed files with 30 additions and 13 deletions.
8 changes: 5 additions & 3 deletions ChangeLog
@@ -82,21 +82,23 @@ pattern.
 14. Item 43 of 10.43 was incomplete because it addressed only \z and not \Z,
 which was still misbehaving when matching fragments inside invalid UTF strings.

-15. Octal escapes of the form \045 or \111 were not being recognized in
+15. Octal escapes of the form \045 or \111 were not being recognized in
 substitution strings, and if encountered gave an error, though the \o{...} form
 was recognized. This bug is now fixed.

-16. Merged PR475, which implements title casing in substitution strings a la
+16. Merged PR475, which implements title casing in substitution strings a la
 Perl.

 17. Merged PR478, which disallows \x if not followed by { or a hex digit.

 18. Merged PR473, which implements Python-style backrefs in substitutions.

-19. Merged PR483, which adding \g<n> and $<name> to replacement strings.
+19. Merged PR483, which is adding \g<n> and $<name> to replacement strings.

 20. Merged PR470, which adds PCRE2_EXTRA_NO_BS0 and PCRE2_EXTRA_PYTHON_OCTAL.

+21. Prevent 1 byte overread when parsing malformed patterns with early VERBs.
+

Version 10.44 07-June-2024
--------------------------
4 changes: 2 additions & 2 deletions doc/pcre2syntax.3
@@ -408,8 +408,8 @@ only one hyphen. Setting (but no unsetting) is allowed after (?^ for example
 example (?i:...).
 .P
 The following are recognized only at the very start of a pattern or after one
-of the newline or \eR options with similar syntax. More than one of them may
-appear. For the first three, d is a decimal number.
+of the newline or \eR sequences or options with similar syntax. More than one
+of them may appear. For the first three, d is a decimal number.
 .sp
 (*LIMIT_DEPTH=d) set the backtracking limit to d
 (*LIMIT_HEAP=d) set the heap size limit to d * 1024 bytes
11 changes: 3 additions & 8 deletions src/pcre2_compile.c
@@ -10404,12 +10404,13 @@ if ((options & PCRE2_LITERAL) == 0)
 {
 for (i = 0; i < sizeof(pso_list)/sizeof(pso); i++)
 {
-uint32_t c, pp;
 const pso *p = pso_list + i;

 if (patlen - skipatstart - 2 >= p->length &&
 PRIV(strncmp_c8)(ptr + skipatstart + 2, p->name, p->length) == 0)
 {
+uint32_t c, pp;
+
 skipatstart += p->length + 2;
 switch(p->type)
 {
@@ -10436,18 +10437,12 @@ if ((options & PCRE2_LITERAL) == 0)
 case PSO_LIMH:
 c = 0;
 pp = skipatstart;
-if (!IS_DIGIT(ptr[pp]))
-{
-errorcode = ERR60;
-ptr += pp;
-goto HAD_EARLY_ERROR;
-}
 while (pp < patlen && IS_DIGIT(ptr[pp]))
 {
 if (c > UINT32_MAX / 10 - 1) break; /* Integer overflow */
 c = c*10 + (ptr[pp++] - CHAR_0);
 }
-if (pp >= patlen || ptr[pp] != CHAR_RIGHT_PARENTHESIS)
+if (pp >= patlen || pp == skipatstart || ptr[pp] != CHAR_RIGHT_PARENTHESIS)
 {
 errorcode = ERR60;
 ptr += pp;
8 changes: 8 additions & 0 deletions testdata/testinput2
@@ -5261,6 +5261,14 @@ a)"xI

 /(*LIMIT_HEAP=0)xxx/I

+/(*LIMIT_HEAP=123/use_length
+
+/(*LIMIT_MATCH=/use_length
+
+/(*CRLF)(*LIMIT_DEPTH=/use_length
+
+/(*CRLF)(*LIMIT_RECURSION=1)(*BOGUS/use_length
+
 /\d{0,3}(*:abc)(?C1)xxx/callout_info

 # ----------------------------------------------------------------------
12 changes: 12 additions & 0 deletions testdata/testoutput2
@@ -16220,6 +16220,18 @@ First code unit = 'x'
 Last code unit = 'x'
 Subject length lower bound = 3

+/(*LIMIT_HEAP=123/use_length
+Failed: error 160 at offset 16: (*VERB) not recognized or malformed
+
+/(*LIMIT_MATCH=/use_length
+Failed: error 160 at offset 14: (*VERB) not recognized or malformed
+
+/(*CRLF)(*LIMIT_DEPTH=/use_length
+Failed: error 160 at offset 21: (*VERB) not recognized or malformed
+
+/(*CRLF)(*LIMIT_RECURSION=1)(*BOGUS/use_length
+Failed: error 160 at offset 34: (*VERB) not recognized or malformed
+
 /\d{0,3}(*:abc)(?C1)xxx/callout_info
 Callout 1 x
