Add a cache to fix perf cliffs when atomizing the same string(s) repeatedly
Categories: Core :: JavaScript Engine, task, P3

| Tracking | Status |
|---|---|
| firefox81 | fixed |

People: Reporter: jandem, Assigned: jandem
Attachments: 1 file
Looking at the Reddit issue in bug 1654087, I noticed we were repeatedly atomizing some strings with length 1897. That ends up hashing the same characters over and over, which is pretty slow.
To fix this, I have a patch that adds a StringToAtom cache (for strings with length >= 30) that's purged on GC. The cache is also used to de-duplicate nursery strings with the corresponding atom when tenuring, saving some memory and future atomization costs. (The string-deduplication code we have for nursery GC has a length limit, but that limit doesn't apply here because we already paid for the expensive hashing anyway.)
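The patch itself is SpiderMonkey C++ code that isn't shown in this bug, but the core idea can be sketched roughly as below. This is an illustration only, not the actual implementation: the class name matches the bug's StringToAtomCache, while the std::unordered_map storage, the MinStringLength constant name, and the method names are assumptions. The key point is that the cache is keyed on the string's pointer identity, so a repeated atomization hashes a pointer (O(1)) instead of re-hashing all of the string's characters (O(length)).

#include <cstddef>
#include <unordered_map>

struct JSString;  // stand-ins for the real engine types
struct JSAtom;

class StringToAtomCache {
  // Keyed by pointer identity: looking up the same JSString object again
  // avoids hashing its characters a second time.
  std::unordered_map<JSString*, JSAtom*> map_;

 public:
  // Only long strings are cached; short ones are cheap to hash anyway.
  static constexpr size_t MinStringLength = 30;

  JSAtom* lookup(JSString* s) const {
    auto it = map_.find(s);
    return it != map_.end() ? it->second : nullptr;
  }

  void maybePut(JSString* s, size_t length, JSAtom* atom) {
    if (length >= MinStringLength) {
      map_[s] = atom;
    }
  }

  // Purged on GC so the cache never keeps strings or atoms alive.
  void purge() { map_.clear(); }
};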
I tested this on some popular websites and the cache has more hits than misses. For the micro-benchmark below, I get:
before: 2480 ms
after: 30 ms
// Micro-benchmark (run in the SpiderMonkey shell; dateNow() and print()
// are shell builtins, not standard JavaScript).
function f() {
  var keys = [];
  var obj = {};
  // Create 1000 string objects of length 1897 and use them as property
  // keys; defining obj[str] atomizes each one.
  for (var i = 0; i < 1000; i++) {
    var str = String.fromCharCode(i % 256).repeat(1897);
    keys.push(str);
    obj[str] = i;
  }
  var t = dateNow();
  var res = 0;
  // Each lookup atomizes the same long string objects again; without the
  // cache this re-hashes all 1897 characters every time.
  for (var i = 0; i < 1000; i++) {
    for (var j = 0; j < keys.length; j++) {
      res = obj[keys[j]];
    }
  }
  print(dateNow() - t);
  print(res);
}
f();
Comment 1 (Assignee) • 5 years ago
This improves performance a lot when a long string is atomized more than once. This is happening on Reddit (strings with length 1897).
The cache is also used to de-duplicate strings during nursery GC, before the cache is purged. This handles some cases the existing string deduplication doesn't, and when a string is in the StringToAtomCache the deduplication should also be faster.
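Building on the sketch above, the nursery-GC use could look roughly like the following. The names tenureString and copyToTenuredHeap are hypothetical, invented for illustration, and not the real SpiderMonkey API; the sketch only shows why no length limit is needed here, since a cache hit means the expensive hashing already happened.

// Assumed helper: the normal tenuring path that copies the nursery
// string's characters into the tenured heap.
JSString* copyToTenuredHeap(JSString* nurseryStr);

JSString* tenureString(StringToAtomCache& cache, JSString* nurseryStr) {
  // On a cache hit we already know the canonical atom for these
  // characters, so the string can be de-duplicated against it instead of
  // copied.
  if (JSAtom* atom = cache.lookup(nurseryStr)) {
    // Atoms are strings in SpiderMonkey; the cast stands in for whatever
    // forwarding the real engine performs.
    return reinterpret_cast<JSString*>(atom);
  }
  // Cache miss: fall back to the normal copy.
  return copyToTenuredHeap(nurseryStr);
}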