Update irregexp (Feb 2021)
Categories
(Core :: JavaScript Engine, task, P1)
Tracking
| | Tracking | Status |
|---|---|---|
| firefox88 | --- | fixed |
People
(Reporter: iain, Assigned: iain)
References
(Blocks 1 open bug)
Details
Attachments
(5 files)
In the most recent TC39 meeting, the capture indices proposal for regular expressions was updated. The update addressed our concerns about performance by gating the creation of capture indices behind a flag.
Now that TC39 has approved a fix for the performance concerns that prevented us from implementing capture indices, I've started implementing that proposal. Capture indices are implemented outside of irregexp itself, so an update is not strictly required, but this seems like as good a time as any to pull in the latest version of irregexp.
The set of noteworthy changes is short. Most of the work in irregexp since our last update has been to implement the experimental non-backtracking engine, which is not yet mature enough for us to import. Aside from some whole-engine refactoring, the main changes have been a smattering of small patches to be more explicit about integer conversions.
Comment 1 (Assignee) • 4 years ago
V8 added a new metadata file that we have no need to import.
Comment 2 (Assignee) • 4 years ago
This patch is the result of running import-irregexp.py.
Depends on D106963
Comment 3 (Assignee) • 4 years ago
Zone is the V8 equivalent of SM's LifoAlloc. The API was changed to enable data collection about allocations. We don't need the data collection, but we have to update our Zone shim.
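As a rough illustration of what such a shim can look like, here is a minimal sketch. It assumes the upstream API routes allocations through templated methods, so the allocated type is visible to the allocator. `BumpAlloc` is a hypothetical stand-in for a LIFO allocator, not the real LifoAlloc interface, and the `New`/`NewArray` names are assumptions about the shape of the API rather than a copy of the actual shim.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <utility>
#include <vector>

// Hypothetical stand-in for a LIFO (bump) allocator. Memory is handed out
// sequentially and released all at once when the allocator is destroyed.
class BumpAlloc {
 public:
  void* alloc(size_t bytes) {
    // Round up so that every allocation stays suitably aligned.
    constexpr size_t kAlign = alignof(std::max_align_t);
    bytes = (bytes + kAlign - 1) & ~(kAlign - 1);
    if (bytes > kChunkSize) {
      return nullptr;  // Sketch only: oversized requests are not supported.
    }
    if (chunks_.empty() || used_ + bytes > kChunkSize) {
      chunks_.push_back(std::make_unique<Chunk>());
      used_ = 0;
    }
    void* result = chunks_.back()->data + used_;
    used_ += bytes;
    return result;
  }

 private:
  static constexpr size_t kChunkSize = 4096;
  struct Chunk {
    alignas(std::max_align_t) unsigned char data[kChunkSize];
  };
  std::vector<std::unique_ptr<Chunk>> chunks_;
  size_t used_ = 0;
};

// Shim exposing a Zone-like interface on top of the bump allocator.
class Zone {
 public:
  explicit Zone(BumpAlloc& alloc) : alloc_(alloc) {}

  // The template parameter makes the allocated type visible here. An engine
  // that wants allocation statistics could record it; this shim ignores it.
  template <typename T, typename... Args>
  T* New(Args&&... args) {
    void* memory = alloc_.alloc(sizeof(T));
    if (!memory) {
      return nullptr;
    }
    return new (memory) T(std::forward<Args>(args)...);
  }

  // Uninitialized storage for `length` objects of trivially constructible T.
  template <typename T>
  T* NewArray(size_t length) {
    return static_cast<T*>(alloc_.alloc(length * sizeof(T)));
  }

 private:
  BumpAlloc& alloc_;
};
```

A caller would write something like `zone.New<Node>(3, 4)` instead of a placement-new expression. Objects are never freed individually, so types allocated this way should not rely on their destructors running.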
Depends on D106964
Comment 4 (Assignee) • 4 years ago
A variety of small updates. Most notable:
- V8 added support for a safepoint mechanism where concurrent threads can pause for GC. This means that garbage collection can be triggered without heap allocation, so `DisallowHeapAllocation` was replaced with `DisallowGarbageCollection` in most places (including all the ones we care about).
- CompareCharsEqual was added to make string comparison more efficient in code that is only testing for equality and doesn't have to worry about memcmp giving the wrong ordering for two-byte chars on little-endian systems. The implementation is copy-pasted directly from V8.
- Some code was rewritten upstream to tighten up integer conversions. As part of that change, uc32 (which represents a Unicode char) is now unsigned. (The maximum valid codepoint in Unicode is 0x10FFFF, so signed vs unsigned doesn't generally matter in practice.)
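To make the equality-only comparison concrete, here is a minimal sketch of the idea. The function name `CharsEqual` and its signature are illustrative assumptions, not the code imported from V8. When both strings use the same character width, a bytewise `memcmp` is enough, because only equality matters and byte order cannot affect the answer. When the widths differ, the characters are compared one by one.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Equality-only comparison of two character buffers of `length` characters.
// Assumes both character types are unsigned (for example uint8_t for Latin-1
// and char16_t for two-byte strings).
template <typename LChar, typename RChar>
bool CharsEqual(const LChar* lhs, const RChar* rhs, size_t length) {
  if (sizeof(LChar) == sizeof(RChar)) {
    // Same width: identical strings have identical bytes. The sign of the
    // memcmp result would be unreliable for ordering two-byte characters on
    // little-endian systems, but testing for zero is always safe.
    return std::memcmp(lhs, rhs, length * sizeof(LChar)) == 0;
  }
  // Different widths: compare character by character.
  for (size_t i = 0; i < length; i++) {
    if (lhs[i] != rhs[i]) {
      return false;
    }
  }
  return true;
}
```

The ordering problem mentioned above is easy to see with the two-byte characters 0x0100 and 0x00FF: on a little-endian machine the first is stored as bytes `00 01` and the second as `FF 00`, so a bytewise comparison ranks 0x0100 below 0x00FF even though it is the larger code unit.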
Depends on D106965
Comment 5 (Assignee) • 4 years ago
-Werror=type-limits emits an error if a comparison is vacuously true. Upstream irregexp does not compile with -Werror=type-limits, so occasionally irregexp will, for example, assert that an unsigned variable is >= 0. It's not worth upstreaming a patch to remove the trivial assertion every time a case sneaks in; instead, we can just suppress the error in irregexp/moz.build.
Depends on D106966
Comment 7 (bugherder) • 4 years ago
https://hg.mozilla.org/mozilla-central/rev/5c49a1fc3ed9
https://hg.mozilla.org/mozilla-central/rev/391b05257d66
https://hg.mozilla.org/mozilla-central/rev/6e6fbe10aa30
https://hg.mozilla.org/mozilla-central/rev/271750e0dd2e
https://hg.mozilla.org/mozilla-central/rev/23f91d432267