Closed Bug 1684347 Opened 3 years ago Closed 2 months ago

EatsAtLeastFromLoopEntry is incorrect for quantifiers containing lookahead assertions

Categories

(Core :: JavaScript Engine: JIT, defect, P2)

All
Linux
defect

Tracking

()

RESOLVED DUPLICATE of bug 1703740
Tracking Status
firefox-esr78 --- wontfix
firefox84 --- wontfix
firefox85 --- wontfix
firefox86 --- fix-optional

People

(Reporter: gkw, Unassigned)

References

(Regression)

Details

(Keywords: regression, testcase)

print("zzzzz".match(RegExp("(z(?=\u0100|[^]){1}){2}", "g")));

AR=ar sh ./configure --enable-debug --with-ccache --enable-gczeal --enable-debug-symbols --disable-tests

Tested on m-c rev 266e441e2f16.

$ ./js-dbg-64-linux-x86_64-266e441e2f16 --fuzzing-safe --no-threads --baseline-eager --no-ion testcase.js 
zz,zz

$ ./js-dbg-64-linux-x86_64-266e441e2f16 --fuzzing-safe --no-threads --no-baseline --no-ion testcase.js 
zz
The first bad revision is:
changeset:   https://hg.mozilla.org/mozilla-central/rev/e365cbeabc4f
user:        Iain Ireland
date:        Wed May 13 16:24:04 2020 +0000
summary:     Bug 1634135: Turn new regexp engine on by default in Nightly r=mgaudet

Unsure how bad this is, so marking s-s and setting needinfo? from :iain as a start.

Flags: sec-bounty?
Flags: needinfo?(iireland)

I can reproduce the bug in my local (very out-of-date) copy of V8, so it's a bug in irregexp, not in our embedding of it.

Looks like a mismatch between the regex interpreter and the regex compiler. Probably just a correctness issue, not a security issue, but I haven't looked at it closely yet.

Flags: needinfo?(iireland)
Group: core-security → javascript-core-security

irregexp tracks the minimum number of characters needed to match a regexp node, to enable early failure. (For example, /(a|bb){6}/ must match at least 6 characters, so we can fail immediately if the input string is 5 characters or fewer.) There is a bug in the calculation when a quantifier applies to a lookahead: z(?=.) needs two characters to match, but (z(?=.)){2} only needs 3, because the first character for the second iteration can be the lookahead character from the first. In EatsAtLeastFromLoopEntry, we instead multiply eats_at_least for the loop body by the minimum number of iterations, which causes us to fail too early.

This is a correctness issue in both V8 and SM, but is not a security bug.

Opened a bug against V8: https://bugs.chromium.org/p/v8/issues/detail?id=11290

Group: javascript-core-security
Summary: Issue involving RegExp and baseline JIT → EatsAtLeastFromLoopEntry is incorrect for quantifiers containing lookahead assertions
Severity: -- → S3
Priority: -- → P2
Flags: sec-bounty? → sec-bounty-
Has Regression Range: --- → yes

This issue seems fixed in latest m-c rev 5c6884a349b6.

Iain, I'm guessing this was fixed in the irregexp update in bug 1703740?

Flags: needinfo?(iireland)

Yep!

Status: NEW → RESOLVED
Closed: 2 months ago
Duplicate of bug: 1703740
Flags: needinfo?(iireland)
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.