EatsAtLeastFromLoopEntry is incorrect for quantifiers containing lookahead assertions
Categories
(Core :: JavaScript Engine: JIT, defect, P2)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr78 | --- | wontfix |
firefox84 | --- | wontfix |
firefox85 | --- | wontfix |
firefox86 | --- | fix-optional |
People
(Reporter: gkw, Unassigned)
References
(Regression)
Details
(Keywords: regression, testcase)
print("zzzzz".match(RegExp("(z(?=\u0100|[^]){1}){2}", "g")));
AR=ar sh ./configure --enable-debug --with-ccache --enable-gczeal --enable-debug-symbols --disable-tests
Tested on m-c rev 266e441e2f16.
$ ./js-dbg-64-linux-x86_64-266e441e2f16 --fuzzing-safe --no-threads --baseline-eager --no-ion testcase.js
zz,zz
$ ./js-dbg-64-linux-x86_64-266e441e2f16 --fuzzing-safe --no-threads --no-baseline --no-ion testcase.js
zz
The first bad revision is:
changeset: https://hg.mozilla.org/mozilla-central/rev/e365cbeabc4f
user: Iain Ireland
date: Wed May 13 16:24:04 2020 +0000
summary: Bug 1634135: Turn new regexp engine on by default in Nightly r=mgaudet
Unsure how bad this is, so marking s-s and setting needinfo? from :iain as a start.
Comment 1•3 years ago
|
||
I can reproduce the bug in my local (very out-of-date) copy of V8, so it's a bug in irregexp, not in our embedding of it.
Looks like a mismatch between the regex interpreter and the regex compiler. Probably just a correctness issue, not a security issue, but I haven't looked at it closely yet.
Updated•3 years ago
|
Comment 2•3 years ago
|
||
irregexp tracks the minimum number of characters needed to match a regexp node, to enable early failure. (For example, /(a|bb){6}/
must match at least 6 characters, so we can fail immediately if the input string is 5 characters or fewer.) There is a bug in the calculation when a quantifier applies to a lookahead: z(?=.)
needs two characters to match, but (z(?=.)){2}
only needs 3, because the first character for the second iteration can be the lookahead character from the first. In EatsAtLeastFromLoopEntry, we instead multiply eats_at_least for the loop body by the minimum number of iterations, which causes us to fail too early.
This is a correctness issue in both V8 and SM, but is not a security bug.
Opened a bug against V8: https://bugs.chromium.org/p/v8/issues/detail?id=11290
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Reporter | ||
Comment 3•2 months ago
|
||
This issue seems fixed in latest m-c rev 5c6884a349b6.
Iain, I'm guessing this was fixed in the irregexp update in bug 1703740?
Comment 4•2 months ago
|
||
Yep!
Description
•