Closed Bug 1697077 Opened 3 years ago Closed 3 years ago

Add testing function for interrupting RegExp execution

Categories

(Core :: JavaScript Engine: JIT, task, P3)

task

Tracking

()

RESOLVED FIXED
91 Branch
Tracking Status
firefox91 --- fixed

People

(Reporter: iain, Assigned: sploitem, Mentored)

Details

(Keywords: good-first-bug)

Attachments

(1 file)

Background

Irregexp, the regular expression engine used inside SpiderMonkey, supports two different modes of execution: a bytecode interpreter, which uses less memory but is slower, and native JIT compilation, which is faster but uses more memory.

To prevent slow scripts from hanging a browser tab, SpiderMonkey supports interrupts, which can be used as a last resort to force currently executing JS code to stop what it's doing and execute a callback. When an interrupt is requested, we set a flag in the JSContext. That flag is checked in C++ code using CheckForInterrupt. It also has to be checked in JIT-compiled code: for example, compiled regexps check the flag when backtracking.

If an interrupt happens while a regular expression is executing, we abandon the execution and return to C++ code to handle the interrupt. (Doing it this way means that, if the interrupt callback triggers a garbage-collection and moves the string we're matching against, we don't have to worry about updating the regexp code's internal state.) After we've handled the interrupt, we can try again.

This Bug

In bug 1691184, we ran into an issue where our retry code failed. The problem was that the interrupt triggered a major GC that threw away JIT code to reduce memory consumption. Normally this is not a problem, because the regular expression can fall back to the bytecode interpreter. However, we have a separate optimization that jumps straight to compiling regexps if the input string is long enough. After the GC threw away the JIT code, the regexp had no way to execute and failed.

The fix was easy enough: we recompile if necessary after handling the interrupt. However, adding a testcase is somewhat tricky, because we don't have great infrastructure for testing interrupts in compiled code. (Interrupts in C++ can be tested with the InterruptTest testing function.)

The goal in this bug is to add better test infrastructure for interrupting compiled regexps. This has several steps:

  1. Add a debug-only (#ifdef DEBUG) flag word to the Isolate, to indicate whether we want to simulate an interrupt. (In V8, an Isolate is the equivalent of a JSContext in SM. In SM, we implement just the parts of the Isolate interface that are used by irregexp. For the purposes of this bug, it is safe to think of the Isolate as a ~global object that holds data about regexps.)

  2. Add a debug-only testing function in builtin/TestingFunctions.cpp to set the flag word. (Here is an example of a patch that adds a testing function.)

  3. Add debug-only code in SMRegexpMacroAssembler::Backtrack to check the flag and bail out of the regexp if it is set. This will involve using the MacroAssembler interface (masm) to generate native code. The code to check the new flag should look similar to the existing code in that function. There is a brief guide to masm here that may be helpful. If that's not enough, please feel free to ask for more help, either here or in #spidermonkey.

  4. At this point, if you call your testing function to set the flag, then try to execute a regexp that will backtrack (say /a(bc|bd)/.exec("abe")), your code should execute and return early. (You will have to put the regexp in a loop or run with --baseline-eager to ensure that the code is compiled instead of interpreted.) However, the handling code here (https://searchfox.org/mozilla-central/rev/9bf82ef9c097ee6af0e34a1d21c073b2616cc438/js/src/vm/RegExpObject.cpp#653-678) doesn't know that we are trying to simulate an interrupt, and will instead probably hit ReportOverRecursed and throw a stack-overflow error. You will therefore have to add some code here to check whether the flag on the isolate is set, and request an interrupt if it is (using cx->requestInterrupt(InterruptReason::CallbackUrgent)).

  5. Using your code, write a testcase for bug 1691184. The testcase should execute a regular expression with a long (>1000 characters) input string. Before doing so, the following code should set things up so that the interrupt callback will trigger a compacting GC:

var interrupted = false;
gczeal(0);
setInterruptCallback(() => { 
  interrupted = true;
  startgc(7,'shrinking');
  return true;
});

You can use printf debugging / add assertions to verify that the testcase is hitting your code. Once you have a working testcase, you can add it to the test suite by adding it to the jit-test/tests/regexp/ directory. (To make sure that it works in non-debug builds, you may have to add // |jit-test| skip-if: !('<Name of your testing function>' in this) as the first line of the testcase.

When you're done, you should be able to run the test suite (./mach jit-test) without any failures. (You will want to build SpiderMonkey with --enable-optimize while running the tests.)

Prerequisites

Before getting started, you'll want to ensure:

Getting help

Leave comments on this bug for questions, or ask in #spidermonkey on chat.mozilla.org.

Note: this is a more difficult bug than some of our other bugs labelled good-first-bug. Don't feel bad about asking for help!

I would like to work on this bug

Great! This is a more complicated bug, so don't hesitate to ask if you need any clarification.

(I am assuming you are "liacds" in #spidermonkey; correct me if I'm wrong.)

Is anyone contributing on this bug

Lia is working on this. See comment 1.

(In reply to Iain Ireland [:iain] from comment #2)

Great! This is a more complicated bug, so don't hesitate to ask if you need any clarification.

(I am assuming you are "liacds" in #spidermonkey; correct me if I'm wrong.)

Okay, thank you! liacds is me

Assignee: nobody → itisaliya

Is anyone still working on this bug

Lia, are you still working on this?

Flags: needinfo?(itisaliya)

Do I make up a new flag word in the first step?

Flags: needinfo?(itisaliya) → needinfo?(iireland)

By "flag word" here I mean something like this (in the definition of the Isolate class):

#ifdef DEBUG
  bool shouldSimulateInterrupt = false;
#endif
Flags: needinfo?(iireland)
Assignee: itisaliya → nobody

Can i take this one?

Assignee: nobody → sploitem

Added flag shouldSimulateInterrupt to Isolate class;
added bailout code in SMRegExpMacroAssembler::Backtrack;
added a check of shouldSimulateInterrupt in RegExpShared::execute;
added InterruptRegexp to js shell that sets flags shouldSimulateInterrupt and serviceInterrupt, and executes regexp.

Pushed by iireland@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/022a63e31d10
Add testing function for interrupting RegExp execution. r=iain
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 91 Branch
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: