Optimize charCodeAt/charAt with out-of-bounds indices
Categories
(Core :: JavaScript Engine: JIT, enhancement, P1)
Tracking | Status | |
---|---|---|
firefox109 | --- | fixed |
People
(Reporter: anba, Assigned: anba)
References
(Blocks 2 open bugs)
Details
Attachments
(1 file)
charCodeAt/charAt are called with out-of-bounds indices in some web-tooling benchmarks in JetStream. Supporting this case saves the following number of VM calls when running JetStream3 (cli version):
- acorn-wtb
  - String.prototype.charCodeAt 1'150'000
- jshint-wtb
  - String.prototype.charAt 2'450'000
  - String.prototype.charCodeAt 200'000
- uglify-js-wtb
  - String.prototype.charAt 450'000
In addition to these benchmarks, I've also observed calls to charCodeAt/charAt with out-of-bounds indices in the wild, for example on reddit or twitch.
I've tracked calls to native functions which end up here. (Note: The absolute numbers are kind of irrelevant, because they depend on how long the web site is visited. So it's probably better to only compare the relative numbers.)
For example when visiting https://www.twitch.tv/directory/ and then scrolling down a bit, I got the following results:
Top 10 native calls from Baseline:
Name | Number of calls |
---|---|
Object.assign | 300'000 |
Object.keys | 160'000 |
String.prototype.charCodeAt | 100'000 |
Set.prototype.add | 90'000 |
Set | 60'000 |
String.prototype.charAt | 40'000 |
Array.prototype.indexOf | 40'000 |
Number | 40'000 |
String.prototype.trim | 40'000 |
JSON.stringify | 40'000 |
Top 10 native calls from Ion:
Name | Number of calls |
---|---|
String.prototype.charCodeAt | 79'140'000 |
String.fromCharCode | 3'440'000 |
Array.prototype.indexOf | 2'850'000 |
Object.assign | 2'610'000 |
Object.keys | 2'040'000 |
Array.prototype.push | 910'000 |
Number.prototype.toString | 660'000 |
Set.prototype.add | 310'000 |
WeakMap.prototype.get | 300'000 |
Boolean | 190'000 |
charAt/charCodeAt weren't inlined, because either the inputs were nested ropes (bug 1669942) or the indices were out-of-bounds.
Comment 1•2 years ago (Assignee)
After the patch for bug 1669942 there are still some non-inlined calls to String.prototype.charAt and String.prototype.charCodeAt in the web-tooling benchmarks (WTB):
- acorn-wtb
  - String.prototype.charCodeAt 1'150'000
- jshint-wtb
  - String.prototype.charAt 2'450'000
  - String.prototype.charCodeAt 200'000
- uglify-js-wtb
  - String.prototype.charAt 450'000
- "acorn-wtb" calls String.prototype.charCodeAt with an index above the string length.
- "jshint-wtb" calls both String.prototype.charCodeAt and String.prototype.charAt with an index larger than the string length.
- "uglify-js-wtb" calls String.prototype.charAt with a too large index and also negative values.
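For reference, the language itself defines what these out-of-bounds calls return: charCodeAt yields NaN and charAt yields the empty string for any index outside the string. The following is a minimal sketch in plain JavaScript. The sample string and the countLeadingDigits helper are hypothetical illustrations and are not taken from any of the benchmarks.

```javascript
const s = "abc";

// Index at or above the string length.
const codePastEnd = s.charCodeAt(s.length); // NaN
const charPastEnd = s.charAt(s.length + 5); // ""

// Negative index.
const codeNegative = s.charCodeAt(-1); // NaN
const charNegative = s.charAt(-1);     // ""

// Hypothetical scanner loop: it deliberately reads one position past the
// end of the input and relies on NaN failing the range comparison.
function countLeadingDigits(input) {
  let pos = 0;
  for (;;) {
    const code = input.charCodeAt(pos); // NaN once pos === input.length
    if (!(code >= 48 && code <= 57)) break; // not in '0'..'9'
    pos++;
  }
  return pos;
}
```

A loop of this shape makes exactly one out-of-bounds call per scanned input whenever the input ends in a digit, which is the kind of call that previously fell back to a VM call.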
CacheIR compilation can be easily modified to support out-of-bounds cases. But use separate MIR nodes for Warp transpilation:
- MCharCodeAt is typed to return an Int32, whereas out-of-bounds accesses can return either an Int32 or NaN.
- The LoadStringCharResult CacheIR op is transpiled using MCharCodeAt and MFromCharCode. But for out-of-bounds accesses we have to return the empty string. We can't easily modify the existing transpilation approach to return an empty string, so instead add a separate MCharAtMaybeOutOfBounds instruction.
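The two typing constraints can be modelled in plain JavaScript. This is only a sketch of the observable results, not SpiderMonkey code: both function names are hypothetical stand-ins, the explicit bounds check is an assumption about how a combined path might be structured, and the index is assumed to already be an integer.

```javascript
// Hypothetical model, assuming an integer index.
// In bounds: the result is a UTF-16 code unit (0..65535), always an Int32.
// Out of bounds: the result is NaN, a double, so the result type can no
// longer be Int32-only.
function charCodeAtMaybeOutOfBounds(str, index) {
  if (index >= 0 && index < str.length) {
    return str.charCodeAt(index);
  }
  return NaN;
}

// Hypothetical model, assuming an integer index.
// In bounds: char code -> one-unit string, the charCodeAt + fromCharCode pair.
// Out of bounds: the empty string, which String.fromCharCode with a single
// argument can never produce (it always returns a string of length 1).
function charAtMaybeOutOfBounds(str, index) {
  if (index >= 0 && index < str.length) {
    return String.fromCharCode(str.charCodeAt(index));
  }
  return "";
}
```

For integer indices both helpers agree with the built-in methods, for example charAtMaybeOutOfBounds("abc", 3) and "abc".charAt(3) both give the empty string.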
Depends on D162726
Comment 3•2 years ago (bugherder)