Closed Bug 1627356 Opened 4 years ago Closed 4 years ago

Remove MatchOnly regexps

Categories

(Core :: JavaScript Engine, task, P1)

task

Tracking

()

RESOLVED FIXED
mozilla77
Tracking Status
firefox77 --- fixed

People

(Reporter: iain, Assigned: iain)

Details

Attachments

(1 file)

Regexp.prototype.test only returns a boolean, and (usually) does not require match results. Our old implementation of Irregexp has a "MatchOnly" compilation mode (for which we generate a separate piece of JitCode) that attempts to take advantage in two ways:

  1. We can avoid returning the match results at all. In the distant past, it appears this was quite valuable, because our pre-Irregexp engine (YARR) allocated and returned an array. Irregexp fills in a (generally stack-allocated) array with offsets into the input string, so we gain very little by not returning those results. (We have to track the offsets of captured groups anyways, because of back-references.)

  2. Unless a regexp starts with ^ or ends with $, we have to insert a .*? at the beginning/end to allow it to match anywhere in the string. If a regexp already starts with a .*, this can be very slow. We would like to strip the starting .*, but in general this is incorrect (greedy vs non-greedy). For test specifically, though, we can mostly get away with it, which is what we did in bug 1055402. (Here's the key bit of code.) However, there are a bunch of non-standard properties on the RegExp constructor (input, lastParen, $1, etc) that cache the results of the last executed regexp. To handle those, when we test a regexp, we cache the input and pattern strings in the RegExpStatics. If we need to produce the details about a previous match, we compile a non-cheating version of the regular expression from scratch and re-run it.

This is pretty gross. It does make a significant performance difference on /.*b.*/.test("a long string of a's"), but only in the sense that we slow down to roughly the same speed as V8. (Depending on the exact details of the microbenchmark; it seems like they might be slightly better at hoisting constant regexps out of loops.) The new import of irregexp doesn't support the special compilation mode, so if we want to preserve our old behaviour we have to implement it from scratch outside of irregexp. I think it's better to just get rid of it.

Unfortunately, we also use the lazy updating of RegExpStatics when calling regular expressions from Ion, to avoid having to copy the match information after every execution. For now this patch just removes MatchOnly from the RegExpObject/RegExpShared code, and leaves lazy updating in RegExpStatics. (I'm also not bothering to clean up the old irregexp, since it's about to go away.)

In the future, we can get rid of lazy updating of statics by having irregexp write the output directly into the RegExpStatics match buffer, which we probably should have been doing all along. It will be easier to do that after I've landed the new irregexp code.

MatchOnly mode is a hack. Instead of rewriting it for our new irregexp import, I'm just removing it.

Priority: -- → P1
Backout by shindli@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/6d27b732d2db
Backed out changeset 42159f58c6d5 for causing mass failures in RegExpRunStatus js CLOSED TREE

Ugh. Accidentally dropped force from this line in compileIfNecessary, which means that if we are interrupted during regexp execution and try to fall back to the interpreter, we don't actually generate any bytecode.

Running a quick try build to verify that was the problem, then I'll reland this.

Flags: needinfo?(iireland)
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla77
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: