[meta] Sync irregexp with upstream
Categories
(Core :: JavaScript Engine, enhancement, P1)
Tracking
()
People
(Reporter: shu, Unassigned)
References
(Blocks 1 open bug)
Details
(Keywords: meta)
Attachments
(1 file)
370.92 KB,
application/x-zip-compressed
|
Details |
Updated•8 years ago
|
Comment 1•8 years ago
|
||
Comment 2•8 years ago
|
||
Comment 3•8 years ago
|
||
Comment 4•8 years ago
|
||
Comment 5•8 years ago
|
||
Comment 6•8 years ago
|
||
Comment 7•8 years ago
|
||
![]() |
||
Updated•8 years ago
|
![]() |
||
Updated•8 years ago
|
Updated•7 years ago
|
![]() |
||
Comment 8•7 years ago
|
||
![]() |
||
Updated•7 years ago
|
Comment 9•7 years ago
|
||
![]() |
||
Updated•7 years ago
|
![]() |
||
Updated•7 years ago
|
Updated•7 years ago
|
Comment 11•6 years ago
|
||
I assume the patches are outdated again. What blocked you from looking into them?
Sebastian
Comment 12•6 years ago
|
||
(In reply to Sebastian Zartner [:sebo] from comment #11)
I assume the patches are outdated again. What blocked you from looking into them?
Other tasks which are more pleasantly to work at compared to this one. :-)
Comment 13•6 years ago
•
|
||
Actually, the last code merge to irregexp was on nov/2017, so I believe this patch may still apply...
Comment 14•6 years ago
|
||
This is the only bug blocking the last missing 4 ECMAScript 2018 features. It would be nice to have it for Firefox 68, which is an ESR version.
Comment 15•6 years ago
|
||
Given it's a week til the branch-off for 68 and we're into a soft freeze and we don't have a full patch for this regardless, I'm pretty confident this isn't going to make 68, nice as it might be if it did.
Comment 16•6 years ago
|
||
How about 69?
Comment 17•6 years ago
|
||
We already have 70, so no.
Comment 18•6 years ago
|
||
Could we at least undo the formatting so that diffing can be easier?
Comment 19•6 years ago
|
||
Something needs to happen here, we can't just pretend this isn't important. Steven any ideas?
Comment 20•6 years ago
|
||
evilpie, thanks for bringing this up again, it is important. We are revisiting where this fits into our priorities and looking to how to possible resource this.
Comment 21•6 years ago
|
||
I am working on a fresh import of V8's regexp code into SM. I'll commandeer this bug as a meta bug for tracking that work.
To reduce the risk of getting caught in the same position in the future, I'm making an effort to keep the V8 source as close to upstream as possible. Ideally, we could avoid making any non-trivial changes at all. (Example of a trivial change: removing #includes that point to V8 header files.) To make this work, I've been creating new regexp-shim.h and regexp-masm-shim.h headers, which define all the V8-isms that are used inside the regexp implementation.
Many of the shims still need implementations, but I've reached the point where the entire regexp engine compiles in SM and then fails when linking in missing implementations. Specifically, everything in v8/src/regexp, with the exception of:
- regexp.cc and regexp-utils.cc, which implement the interface between the regexp implementation and the rest of the engine. They are more tightly coupled to the rest of V8, and it is more effective to just rewrite them from scratch.
- s390/* and ppc/* contain macroassembler code for platforms we don't support
- I have only finished stubbing out the masm code for x64. x86, arm, and (eventually) MIPS will come later.
- Additional work is necessary to enable V8_INTL_SUPPORT (V8's equivalent to ENABLE_INTL_API)
As partial validation of this approach, I recently pulled in a month of V8 changes, including the addition of an entire peephole optimization pass, and it took less than an hour of additional shim work. Obviously nothing is certain until we have it working, but I am hopeful that this will be a more sustainable approach than the status quo.
Comment 22•6 years ago
|
||
Assuming that SM will also get RegExp Match Indices (Bug 1519483) as part of this effort.
Comment 23•6 years ago
|
||
iain, did you consider the opposite approach of creating a stronger separation between irregexp and V8 upstream? Irregexp would ideally be V8-agnostic. The embedder (in this case SpiderMonkey) would of course need to implement all required macro-assembler calls.
mconca, match indices are regexp builtins work and do not require irregexp changes.
Comment 24•6 years ago
|
||
Jakob: When Luke pinged the V8 team about this, the answer he got was that V8 is understandably not very interested in moving their regexp code into a separate repo, but that refactoring APIs between irregexp and the JSVM / code generator would be possible.
In some cases it will be faster to make upstream changes than to implement things ourselves: for example, a few changes to NativeRegExpMacroAssembler::Match and NativeRegExpMacroAssembler::StringCharacterPosition could significantly reduce the string API that SpiderMonkey has to shim.
In other cases, the necessary refactoring is unlikely to be feasible in a reasonable timeframe, so we're better off with a shim. For example, it's faster for us to provide APIs that mimic V8's memory allocation than it would be to refactor all the memory allocation code inside irregexp.
I have a pile of notes on this stuff. At some point in the near future, once my ducks are better aligned, I'll reach out to the V8 team to talk it over. If V8 is interested in making irregexp V8-agnostic, then -- good news! I have a list of the places where it currently isn't! :)
Comment 25•6 years ago
|
||
I think this is something we should pursue. Ideally, updating irregexp in SM and other embedders should be as easy as plopping a current copy of irregexp source code into their project without extra effort.
I agree that e.g. replacing V8's Zone allocations with a normal malloc/free model (or something else) is not something we want, but it should be doable with reasonable effort to abstract the Zone model into an allocator interface (with just New, no Delete) plus a set of data structures based on this allocator.
This work would not be trivial by any means, but I assume it's preferable for you to do this once and then have a fairly small and well-defined set of functionality to implement within SM, rather than having to shim a large part of V8 that you perpetually need to update. I'll try to reach out to you next week, maybe we can figure something out :)
Comment 26•6 years ago
|
||
(In reply to Jakob Gruber from comment #23)
mconca, match indices are regexp builtins work and do not require irregexp changes.
Thanks, Jakob.
Comment 27•6 years ago
|
||
Jakob: yes, let's talk this week. I am on Toronto time.
Comment hidden (obsolete) |
Comment 29•6 years ago
|
||
I may have underestimate the changes required, sorry. There's no way to delete postings here (I've requested it in the Bugzilla forum).
Comment 30•6 years ago
|
||
Iain is actively landing code changes for this right now, see bug 1592302 and bug 1592307 for example. I know it can be frustrating to wait, but we're doing what we can to get this fixed as soon as possible.
Comment 31•5 years ago
|
||
All depending on/blocking bugs are now fixed.
Is it time to close?
Updated•5 years ago
|
Comment 32•5 years ago
|
||
Would I be correct in assuming that bugs such as https://bugs.chromium.org/p/v8/issues/detail?id=11616 will get automatically merged into Firefox once they are fixed upstream?
Comment 33•5 years ago
|
||
We merge upstream fixes manually, because they may require updates to our shim code, but the process is mostly automated. See import-irregexp.py.
I will keep an eye on that bug. Thanks for the pointer!
Description
•