Open Bug 1813706 Opened 2 years ago Updated 16 days ago

Investigate if ExecuteRegExp could avoid flattening the rope in some cases

Categories

(Core :: JavaScript Engine, task, P2)

task

Tracking

()

People

(Reporter: smaug, Unassigned)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [sp3])

The profile is from jQuery-TodoMVC
https://share.firefox.dev/3Jt3aKc
Flatting is from https://searchfox.org/mozilla-central/rev/7a9e3bbab8f81c2cbc72a394047f948da9cfef9a/js/src/builtin/RegExp.cpp#1081
Lots of the time is under encoding_*, so maybe avoiding flattenInternal wouldn't help too much, but worth to investigate some more.

720 ropes, of which 401 latin1, and all those latin1 ones are relatively small, length <= 1024. Maybe zone or something could have a pre-allocated 1k buffer to temporary flatten the string? Or maybe something totally different should be done here?

But I haven't even checked in which case we might need to keep the string alive for longer time.

I'll note that once we get into the regex engine, we really do need a linear input string. The engine (which we share with V8) is designed to work on a contiguous array of latin1 or utf16 characters. It would be very hard to change that, and I think it's likely that the increased cost of accessing characters would eat up any wins from not flattening.

Given that the input needs to be flat, maybe it's better to try to avoid flattening a given string more than once. Are we flattening new ropes every time, or the same rope repeatedly? If the latter, maybe we can (somehow?) be smarter about eagerly flattening strings that are passed to ExecuteRegExp.

It looks like this is also slow because we're inflating latin1 to char16_t. Worth checking if the rope actually contains two-byte chars or if we end up with a two-byte child/rope where we could use latin1.

Yeah, there are things to investigate here.
The patch in bug 1808673 is an example where avoiding flattening helps lot, but perhaps that is not the case here. Needs some testing.
Different frameworks may trigger quite different cases. (Like, bug 1808673 helps only React, since that one ends up passing ropes to atomization. Interestingly enough, jQuery passes only linear strings, unlike in this bug, where it is jquery which passes ropes to regex)

Severity: -- → N/A
Priority: -- → P2
Whiteboard: [sp3]
See Also: → 1926474
You need to log in before you can comment on or make changes to this bug.