Closed Bug 1630383 Opened 5 years ago Closed 5 years ago

Irregexp: Miscellaneous fixes

Categories

(Core :: JavaScript Engine, task, P1)

task

Tracking

()

RESOLVED FIXED
mozilla77
Tracking Status
firefox77 --- fixed

People

(Reporter: iain, Assigned: iain)

References

Details

Attachments

(10 files)

When bug 1630090 is finished, all the major pieces of the new import will be in place. There are a few small things left to do before we get a green try build, none of which are big enough to deserve their own bug.

While getting my patches organized for review and addressing comments, I managed to accidentally break a few build configurations on try.

  1. I wasn't building binast locally, so it was just broken.
  2. I missed an underscore in ARM-specific masm code.
  3. The new regexp-error.h file that I imported needs to #include regexp-shim.h to avoid breaking non-unified builds.

I had this right in my original patch stack, but 'fixed' it before putting the patches up for review. The old engine only needs to initialize the regexp stack for the main thread, because it's only used when executing regexps, but we need the isolate for off-thread parsing, so it needs to be initialized even for helper threads.

Depends on D71352

A couple of cleanups that have been floating around my patch stack for a while.

Depends on D71353

For a variety of reasons, some better than others, we call into irregexp with unrooted GC pointers. (For example, the input string is unrooted, at least in part because rooting it would complicate Ion.) Everything is properly guarded with AutoCheckCannotGC, which means that we need to convince the hazard analysis that nothing untoward is going on.

There are three obstacles to this:

  1. ICU uses a pile of function pointers internally. To keep things interesting, they change the names of their functions each release (so ICU 66 has "icu_66::UnicodeSet::applyFilter()" and "uprv_malloc_66", but a newer version would update the number.) I handled these in annotations.js by adding an isICU check and throwing regexps at the problem.

  2. The call to the generated code happens via function pointer, so we need to get the hazard analysis to ignore it. I used the solution from the old import (adding an AutoSuppressGCAnalysis). (See ExecuteRaw in RegExpAPI.cpp.)

  3. The RegExpCompileData struct that V8 uses to communicate information between the parser and the compiler has one member that is an Object (~= JS::Value) and another than is a Handle. Because of this, the hazard analysis also wants to make sure that we don't GC during compilation. This is mostly not a problem, except for one place where we call a merge sort with a function pointer as comparator. I added SortConsecutiveAtoms to the whitelist to fix this. (See https://searchfox.org/mozilla-central/rev/97cb0a90bd053de87cd1ab7646d5565809166bb1/js/src/new-regexp/regexp-compiler-tonode.cc#498)

Depends on D71354

The V8 code in the new engine uses implicit constructors with wild abandon, and it doesn't seem like a good use of time to try upstreaming a patch to remove them.

Depends on D71355

We can compile regexps that we know ahead of time will not match. (For example, compiling a regexp containing a two-byte character to match latin-1 strings.)

Depends on D71356

When changing from inputStartMinusOne to inputStart, I got the boundary condition wrong for backwards-looking CheckPosition. This broke some test262 regexp tests.

Depends on D71357

In each iteration of the inner loop, we compare two characters. If they are not the same, but the first character is a letter, then we convert both characters to lower case and see if they are the same.

At least, that's what we intend to do. We are accidentally loading a second copy of the first letter instead. This means that any two letters will compare equal in a case-insensitive backreference.

Computers are hard.

Depends on D71358

In preparation for the next patch, which changes the internal representation of v8::Object, this patch removes all direct access (in derived classes) to the value_ member of v8::Object. Instead, we now use accessor methods.

This patch also changes a few places that were using JS::Value as a parameter to take const JS::Value& instead.

Depends on D71359

The natural way to represent v8::Object is as a thin wrapper around a JS::Value. However, Value is MOZ_NON_PARAM and alignas(8), and irregexp uses objects as parameters. There are two plausible solutions: stop marking Value as MOZ_NON_PARAM, or stop storing a Value inside Object. I wrote a set of patches to do the former, but nbp pointed out in the bug I created that "the intent of alignas(8) was to ensure that 'doubles' are well aligned when loaded/stored as floating points values."

If we want to keep the annotations on Value, then we have to change Object instead. I tried rewriting it to store raw bits, and the patches seem alright, so I'm going with that unless somebody has a good reason not to.

Depends on D71360

Pushed by iireland@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/fe9af68999c4 Fix broken builds r=mgaudet https://hg.mozilla.org/integration/autoland/rev/4c0cbbdaa240 Fix off-thread parsing r=mgaudet https://hg.mozilla.org/integration/autoland/rev/026a6f7e37ef Shim simplifications r=mgaudet https://hg.mozilla.org/integration/autoland/rev/0c7019227612 Appease hazard analysis r=sfink https://hg.mozilla.org/integration/autoland/rev/22127e9ef811 Add irregexp to ignore list for implicit constructor static analysis r=sfink https://hg.mozilla.org/integration/autoland/rev/82240b022085 Make successHandler optional r=jandem https://hg.mozilla.org/integration/autoland/rev/a468a05ee7b5 Fix off-by-one error in CheckPosition r=jandem https://hg.mozilla.org/integration/autoland/rev/b4bd410fbcf6 Fix CheckNotBackReferenceIgnoreCase r=jandem https://hg.mozilla.org/integration/autoland/rev/ba9f11886936 Remove direct access to wrapped Value in v8::Object r=tcampbell https://hg.mozilla.org/integration/autoland/rev/abd746d74377 Store Value inside Object as raw bits r=tcampbell
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: