Closed Bug 1779849 Opened 2 years ago Closed 2 years ago

Update irregexp (Jul 2022)

Categories

(Core :: JavaScript Engine, task, P1)

task

Tracking

()

RESOLVED FIXED
105 Branch
Tracking Status
firefox105 --- fixed

People

(Reporter: iain, Assigned: iain)

References

Details

Attachments

(15 files)

48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review

It has been a while since we've pulled in a new version of irregexp, and several useful patches have landed upstream, including one that should fix bug 1775005.

This patch was generated by running import-irregexp.py.

Depends on D152901

V8 added an enum class instead of using raw chars.

Drive-by: Mark CanReadUnaligned as const.

Depends on D152902

Preparing for the next patch

Depends on D152903

IsCharacterInRangeArray is translated from the version in imported/regexp-macro-assembler.h, with more comments and adjustments to make callWithABI work.

Depends on D152904

While adding support for syntax-parsing / early errors, V8 pushed the allocation of the named capture array one level outward, which lets us skip the creation of the FixedArray completely.

The awkward sorting requirement is also present in V8: compare https://github.com/v8/v8/blob/main/src/regexp/regexp.cc#L507-L508.

Depends on D152905

While refactoring this code, upstream V8 removed their own need to have Isolate as a friend class of RegExpStack. We need to be able to reach inside RegExpStack for a couple of reasons, so we use ExternalReference, which is still a friend class (to support ExternalReference::TopOfRegExpStack, which is used in V8's macroassembler code).

Since ExternalReference already exists, I'm also using it for the memory accounting code.

Depends on D152906

Depends on D152907

Another change made to support early errors in V8.

Depends on D152908

A bunch of code was moved upstream from the v8::internal namespace into the v8::base namespace. Adjust accordingly.

Depends on D152909

Depends on D152910

Zone is the V8 equivalent of LifoAlloc. A zone-allocated SmallVec is used in regexp-parser.cc.

Depends on D152911

V8 rewrote some stack overflow code to fix their equivalent of bug 1775005. These changes support that fix.

Depends on D152912

Depends on D152913

Backed out 14 changesets (Bug 1779849) for causing bustages on regexp-parser.cc.
Backout link
Push with failures <--> Bbc
Failure Log

Flags: needinfo?(iireland)

In some but not all clang builds, for reasons I haven't been able to fully pin down, the existing patch stack fails to build with an error about trying to access a private constructor. An over-simplified version of the code:

In regexp-parser.h (V8 code):

// A class with a templated static method.
class RegExpParser {
  template <class CharT>
  static bool VerifyRegExpSyntax(..., const CharT* input, ...);
};

In regexp-parser.cc (V8 code):

// A templated class
template <typename CharT>
class RegExpParserImpl {
 private:
   RegExpParserImpl(const CharT* ...);

 // A friend declaration for the templated static method
 friend bool RegExpParser::VerifyRegExpSyntax<CharT>(..., const CharT*, ...);
}

// Two instantiations of the templated class
template class RexExpParserImpl<uint8_t>
template class RexExpParserImpl<base::uc16>

// The implementation of the templated static method
template <class CharT>
bool RegExpParser::VerifyRegExpSyntax(..., const CharT*, ...) {
  // A call to the private constructor, which should be allowed
  // because of the friend declaration
  RexExpParserImpl( ...)
}

// Two instantiations of the templated static method
template bool RegExpParser::VerifyRegExpSyntax<uint8_t>(...);
template bool RegExpParser::VerifyRegExpSyntax<base::uc16>(...);

In RegExpAPI.cpp (our code), we call RegExpParser::VerifyRegExpSyntax. This triggers a compiler error in the build-linux64-base-toolchains-clang/debug job, but not in any other job. I can't figure out what the difference is. I've tried adding explicit casts to guarantee that RegExpAPI.cpp calls VerifyRegExpSyntax with an already instantiated template parameter (uint8_t or base::uc16), but that didn't fix anything.

After several failed iterations, I've fallen back on the approach of changing the friend declaration to friend class v8::internal::RegExpParser, which solves the problem. I haven't upstreamed this patch yet; it's small, hacky, and unsatisfying enough that I'm willing to patch our local copy and see if I can get to the bottom of this mystery next time I update irregexp.

Pushed by iireland@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/51cfbf07529b
Update import-irregexp.py r=mgaudet
https://hg.mozilla.org/integration/autoland/rev/0e09bbc94f04
Re-import irregexp r=mgaudet
https://hg.mozilla.org/integration/autoland/rev/f390b51c4e67
Use enum class in CheckSpecialCharacterClass r=mgaudet
https://hg.mozilla.org/integration/autoland/rev/72e453021744
Refactor ByteArray r=mgaudet
https://hg.mozilla.org/integration/autoland/rev/a33a389e957a
Implement CheckCharacterInRangeArray r=mgaudet
https://hg.mozilla.org/integration/autoland/rev/3f5b77469ad0
Refactor InitializeNamedCaptures r=mgaudet
https://hg.mozilla.org/integration/autoland/rev/58a6369ecf67
Update RegExpStack code r=mgaudet
https://hg.mozilla.org/integration/autoland/rev/282f8562675c
Update RegExpFlags r=mgaudet
https://hg.mozilla.org/integration/autoland/rev/a1954be3f4f9
Remove FlatStringReader r=mgaudet
https://hg.mozilla.org/integration/autoland/rev/3b1c86a28842
Move definitions into base namespace r=mgaudet
https://hg.mozilla.org/integration/autoland/rev/77243ba8b574
Miscellaneous shim changes r=mgaudet
https://hg.mozilla.org/integration/autoland/rev/13be26ada116
Support zone allocation for SmallVec r=mgaudet
https://hg.mozilla.org/integration/autoland/rev/b622378d5a70
Update stack overflow detection r=mgaudet
https://hg.mozilla.org/integration/autoland/rev/9812c5b56a78
Remove dead shim code r=mgaudet
https://hg.mozilla.org/integration/autoland/rev/72cd1237f80d
Fix build issue r=mgaudet
Flags: needinfo?(iireland)
Regressions: 1783555
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: