Closed Bug 1350760 Opened 3 years ago Closed 3 years ago

Atomizing strings needs to be faster

Categories: Core :: JavaScript Engine (enhancement)
Status: RESOLVED FIXED
Target Milestone: mozilla55
Tracking Status: firefox55 --- fixed
People: (Reporter: ehsan, Assigned: jandem)
Attachments: (6 files)

I've seen this come up in various profiles for different use cases; one recent example is session restore with a lot of tabs. See the profile in bug 1312373, and also https://perfht.ml/2nqpMAu.

Can we somehow avoid locking here, for example by using some kind of lock free hash table to store the atoms?
Blocks: 875125
If I'm reading that profile correctly, there are many atomize calls under XPC_WN_GetterSetter -> ... -> nsXPCWrappedJSClass::CallMethod -> JS_GetProperty.

It's unfortunate nsXPCWrappedJSClass::CallMethod has a |const char*| and doesn't receive/store a jsid or JSAtom*. Also, is this class something we could convert to new DOM bindings?
I'm also seeing this in profiles and have a stack of patches to optimize various things here.
Assignee: nobody → jdemooij
Status: NEW → ASSIGNED
Summary: js::Atomize() needs to be faster → Atomizing strings needs to be faster
The atom marking code takes a TenuredCell* argument and has to check whether it's an atom/string, whether it's non-null, etc.

It's faster (and cleaner IMO) to templatize this so we can pass JSAtom* or JS::Symbol*. Then we can assert the thing is in the atoms zone instead of checking it at runtime, we no longer have to check the trace kind, etc.

I also pushed the null checks into the callers because all hot callers pass a non-null pointer.
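A minimal sketch of the idea (all names here are invented stand-ins, not SpiderMonkey's actual types): once the entry point is templatized, the atom/symbol distinction and the null check resolve at compile time or in the caller, so the marking function itself does no runtime type dispatch.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <unordered_set>

// Hypothetical stand-ins for JSAtom and JS::Symbol.
struct FakeAtom { uint32_t id; };
struct FakeSymbol { uint32_t id; };

struct FakeAtomMarker {
    std::unordered_set<uint32_t> markedAtoms;
    std::unordered_set<uint32_t> markedSymbols;

    // The type is known at compile time, so there is no runtime
    // trace-kind check; callers are expected to pass a non-null pointer
    // (the null checks were pushed into the callers).
    template <typename T>
    void markAtom(T* thing) {
        static_assert(std::is_same<T, FakeAtom>::value ||
                      std::is_same<T, FakeSymbol>::value,
                      "only atoms and symbols live in the atoms zone");
        assert(thing);
        if (std::is_same<T, FakeAtom>::value)
            markedAtoms.insert(thing->id);
        else
            markedSymbols.insert(thing->id);
    }
};
```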
Attachment #8852457 - Flags: review?(jcoppeard)
A profile of my atomization micro-benchmark shows we spend at least 5% under SparseBitmap::setBit and SparseBitmap::getOrCreateBlock.

This patch inlines setBit and the fast path of getOrCreateBlock.
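As a toy illustration of the data structure involved (the real SparseBitmap lives in js/src; the class and constants below are made up): bits are grouped into fixed-size blocks allocated on demand, and the hot setBit path is small enough that inlining it avoids a call per marked atom.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

class ToySparseBitmap {
    static constexpr size_t WordBits = 64;
    static constexpr size_t WordsPerBlock = 8;
    static constexpr size_t BitsPerBlock = WordBits * WordsPerBlock;

    using Block = std::vector<uint64_t>;
    std::unordered_map<size_t, Block> blocks_;

    // Slow path: allocate a zeroed block on first touch.
    Block& createBlock(size_t blockIndex) {
        return blocks_.emplace(blockIndex, Block(WordsPerBlock, 0)).first->second;
    }

  public:
    // Fast path on a hit: one hash lookup plus a bit-or, cheap to inline.
    inline void setBit(size_t bit) {
        size_t blockIndex = bit / BitsPerBlock;
        auto it = blocks_.find(blockIndex);
        Block& block = (it != blocks_.end()) ? it->second : createBlock(blockIndex);
        size_t rem = bit % BitsPerBlock;
        block[rem / WordBits] |= uint64_t(1) << (rem % WordBits);
    }

    inline bool getBit(size_t bit) const {
        size_t blockIndex = bit / BitsPerBlock;
        auto it = blocks_.find(blockIndex);
        if (it == blocks_.end())
            return false;
        size_t rem = bit % BitsPerBlock;
        return it->second[rem / WordBits] & (uint64_t(1) << (rem % WordBits));
    }
};
```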
Attachment #8852461 - Flags: review?(sphink)
Comment on attachment 8852457 [details] [diff] [review]
Part 1 - Optimize AtomMarkingRuntime::markAtom

Review of attachment 8852457 [details] [diff] [review]:
-----------------------------------------------------------------

Nice!

::: js/src/gc/AtomMarking.cpp
@@ +270,4 @@
>          return true;
>  
> +    if (mozilla::IsSame<T, JSAtom>::value) {
> +        JSAtom* atom = reinterpret_cast<JSAtom*>(thing);

I guess we don't need this cast.
Attachment #8852457 - Flags: review?(jcoppeard) → review+
I don't think we should inline AtomMarkingRuntime::markAtom everywhere, but IMO the atomization code is hot enough that it's worth it there. This patch adds gc/AtomMarkingRuntime-inl.h with AtomMarkingRuntime::inlinedMarkAtom, and calls inlinedMarkAtom in jsatom.cpp.

It improves my atomization micro-benchmark (see below) from ~1250 ms to ~1220 ms (> 1300 ms without parts 1 and 2) so it seems worth it to eliminate most of the remaining markAtom overhead.

---

function f() {
    var o = Object.create(null);
    var res = 0;
    var t = new Date;
    for (var i=0; i<10000000; i++) {
        res = o["foo" + (i % 2048)];
    }
    print(new Date - t);
    return res;
}
f();
Attachment #8852490 - Flags: review?(jcoppeard)
The first thing we do in Atomize and AtomizeChars is call JSString::validateLength, but we only have to do this when we allocate a new atom. The common case is looking up an existing atom and that will only succeed if the length is sane.
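The reordering can be sketched like this (a toy model, not the real Atomize; MaxLength is a tiny stand-in for JSString's actual limit): a string already in the table necessarily has a valid length, so the check only needs to run on the allocation path.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

static const size_t MaxLength = 64;  // tiny stand-in for the real length limit

std::unordered_set<std::string> atomTable;

const std::string* atomize(const std::string& chars) {
    // Common case: the atom already exists; its length is known-valid,
    // so no length check is needed here.
    auto it = atomTable.find(chars);
    if (it != atomTable.end())
        return &*it;

    // Only on the allocation path do we validate the length.
    if (chars.size() > MaxLength)
        return nullptr;  // the real code reports an error

    return &*atomTable.insert(chars).first;
}
```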
Attachment #8852518 - Flags: review?(luke)
Clang was not inlining some (HashSet) methods. MOZ_ALWAYS_INLINE fixes this and improves the time from ~1220 ms to ~1186 ms.

These patches combined make atomization >10% faster on OS X. We now inline everything under AtomizeString and I don't think there's much low-hanging fruit left.
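For illustration, MOZ_ALWAYS_INLINE boils down to the compiler's forced-inlining attribute; this toy macro and hash function are invented, but show the mechanism used to keep small hot helpers out of the call graph.

```cpp
#include <cstddef>
#include <cstdint>

// Minimal stand-in for MOZ_ALWAYS_INLINE on Clang/GCC.
#if defined(__GNUC__) || defined(__clang__)
#  define ALWAYS_INLINE inline __attribute__((always_inline))
#else
#  define ALWAYS_INLINE inline
#endif

// A tiny per-character hash step, forced inline into the loop below.
ALWAYS_INLINE uint32_t hashStep(uint32_t h, char c) {
    return (h << 5) - h + static_cast<unsigned char>(c);
}

uint32_t hashChars(const char* s, size_t n) {
    uint32_t h = 0;
    for (size_t i = 0; i < n; i++)
        h = hashStep(h, s[i]);
    return h;
}
```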

It might make sense to add a per-Zone cache (purged on GC) so we can bypass markAtom, the multiple hash table lookups, and the atoms lock. This would need some careful measurements because it could easily backfire by adding more overhead when we're atomizing many different strings.
Attachment #8852527 - Flags: review?(luke)
Attachment #8852518 - Flags: review?(luke) → review+
Attachment #8852461 - Flags: review?(sphink) → review+
Attachment #8852490 - Flags: review?(jcoppeard) → review+
(In reply to Jan de Mooij [:jandem] from comment #8)
> It might make sense to add a per-Zone cache (purged on GC) so we can bypass
> markAtom, the multiple hash table lookups, and the atoms lock. This would
> need some careful measurements because it could easily backfire by adding
> more overhead when we're atomizing many different strings.

I prototyped this per-Zone cache (purged on GC) and it's a much bigger win than I expected: it improves my atomization micro-benchmark (comment 6) from 1190 ms to 860 ms, the Kraken JSON-parsing test becomes a few ms faster, and Octane-codeload seems to improve by a few hundred points (although it's noisy). We can likely improve more by inlining this part of the atomization code.

On Octane the per-zone cache has a hit rate of 96.4%. Tomorrow I'll test this on some popular websites to see what we get there. Also, if we do this we can probably drop part 3.
Blocks: 1334672
This adds the cache to Zone and purges it on GC. I tested it on some websites (Gmail, Google Docs/Sheets, Wikipedia, Twitter) and the hit rate is at least 80%. It's more effective on certain benchmarks/frameworks and I think Ehsan's profile (see comment 0 and comment 1) would also have benefited a lot from this because XPConnect keeps atomizing the same strings.
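The shape of the optimization, in toy form (all names invented; the real cache lives on JS::Zone): a zone-local table is consulted before taking the lock on the shared atoms table, so a cache hit skips the lock, the atom marking, and the second hash lookup, and the cache is purged on GC because it holds bare pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <unordered_map>
#include <unordered_set>

struct ToyRuntime {
    std::mutex atomsLock;
    std::unordered_set<std::string> atomsTable;  // shared across zones
};

struct ToyZone {
    ToyRuntime* rt;
    std::unordered_map<std::string, const std::string*> atomCache;
    size_t cacheHits = 0;

    const std::string* atomize(const std::string& chars) {
        // Fast path: no lock, no marking, a single zone-local hash lookup.
        auto it = atomCache.find(chars);
        if (it != atomCache.end()) {
            cacheHits++;
            return it->second;
        }
        // Slow path: lock the shared table, then populate the cache.
        std::lock_guard<std::mutex> guard(rt->atomsLock);
        const std::string* atom = &*rt->atomsTable.insert(chars).first;
        atomCache.emplace(chars, atom);
        return atom;
    }

    // The cached pointers may be invalidated by GC, so purge on GC.
    void purgeOnGC() { atomCache.clear(); }
};
```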

I also added a clearAndShrink() method to js::HashMap and js::HashSet. Often clear() is not great because it doesn't release the allocated storage, so you risk wasting memory, and finish() is also not great because you then have to do a fallible initialization again, and I'd like to avoid adding a branch to the atomization code to check initialized().
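The distinction can be sketched on top of a standard container (a toy model of the js::HashMap semantics, not the real API): clear() keeps the bucket storage, while clearAndShrink() frees it yet leaves the table initialized and immediately usable, unlike a finish()/init() pair.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

template <typename K, typename V>
struct ToyHashMap {
    std::unordered_map<K, V> impl;

    void put(const K& k, const V& v) { impl[k] = v; }

    // clear(): entries gone, allocated capacity retained.
    void clear() { impl.clear(); }

    // clearAndShrink(): entries gone AND storage released, but the map
    // stays initialized, so no fallible re-init is needed before reuse.
    void clearAndShrink() {
        std::unordered_map<K, V> empty;
        impl.swap(empty);  // swapping with an empty map frees the buckets
    }
};
```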

Looks green on Try so far.
Attachment #8852833 - Flags: review?(jcoppeard)
Attachment #8852833 - Flags: review?(jcoppeard) → review+
Attachment #8852527 - Flags: review?(luke) → review+
Pushed by jandemooij@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5c7b6e016f85
part 1 - Templatize and optimize AtomMarkingRuntime::markAtom. r=jonco
https://hg.mozilla.org/integration/mozilla-inbound/rev/4d0df04fefbe
part 2 - Ensure SparseBitmap::setBit gets inlined. r=sfink
https://hg.mozilla.org/integration/mozilla-inbound/rev/1c682e6c1eb0
part 3 - Add AtomMarkingRuntime::inlinedMarkAtom to eliminate markAtom call overhead when atomizing. r=jonco
https://hg.mozilla.org/integration/mozilla-inbound/rev/f3a6587bc94a
part 4 - Call JSString::validateLength only when we have to allocate a new atom. r=luke
https://hg.mozilla.org/integration/mozilla-inbound/rev/9ff8c5acf4ea
part 5 - Make sure various hashtable lookups get inlined when atomizing strings. r=luke
https://hg.mozilla.org/integration/mozilla-inbound/rev/17c436c27035
part 6 - Add a Zone cache for recently atomized strings. r=jonco
(In reply to Jon Coppeard (:jonco) from comment #5)
> I guess we don't need this cast.

Unfortunately we do because T is a template parameter that can be JSAtom or JS::Symbol. Even though we check mozilla::IsSame<T, JSAtom> the compiler still wants the cast.
(In reply to Jan de Mooij [:jandem] from comment #9)
> the Kraken JSON-parsing test becomes a few ms faster

AWFY is saying 15-17% and we're now faster than V8/JSC, so it's pretty sweet.