Closed Bug 1351303 Opened 3 years ago Closed 3 years ago

NS_Atomize is too slow

Categories

(Core :: XPCOM, enhancement)

50 Branch
enhancement
Not set

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox55 --- fixed

People

(Reporter: smaug, Assigned: smaug)

References

Details

Attachments

(2 files, 1 obsolete file)

I have a profile when atomizing takes 5% of innerHTML time, and of that 33% is the mutex.
I guess we need to add some fast path for main thread or something.
(In reply to Olli Pettay [:smaug] (work week, slower than usual reviews) from comment #0)
> I have a profile when atomizing takes 5% of innerHTML time, and of that 33%
> is the mutex.
> I guess we need to add some fast path for main thread or something.

Are you able to share the profile?
Ideally we'd have a non-main-thread-only solution here since HTML parsing happens off the main thread too.

One option would be to have a HTML-parser specific hash table which holds owning references to atoms for all HTML element/attribute names. That hash table can be fully lock-free since it's immutable after initial creation. We would have to fall back to doing a normal NS_Atomize if the parser-specific hash table doesn't find an atom since the page might be using a non-standard element/attribute. But that should be a much more rare case.
The issue here comes when parser creates the actual DOM attributes in the main thread, especially when
creating the atom array for class attribute. So I'm less worried about parser thread.
Initial testing shows that a local main thread only cache (size 31) can easily catch over 50% of atoms, in some microbenchmarks 99% or so.
Attached patch wip (obsolete) — Splinter Review
Not sure all those NS_AtomizeMainThread calls are needed.
nsAttrValue::ParseAtomArray is after all the really bad case where this shows up.
But the cache does seem to catch many cases in normal browsing.
Assignee: nobody → bugs
Attached patch wip2Splinter Review
Attachment #8852294 - Attachment is obsolete: true
Comment on attachment 8852295 [details] [diff] [review]
wip2

https://treeherder.mozilla.org/#/jobs?repo=try&revision=d8cd775b2a0543ab7fd0576443a9917fec4d2f76


31 is magical prime number, we happen to use the same for content list cache.
Attachment #8852295 - Flags: review?(nfroyd)
Comment on attachment 8852295 [details] [diff] [review]
wip2

Review of attachment 8852295 [details] [diff] [review]:
-----------------------------------------------------------------

r+'ing a patch that's just called "wip2" seems wrong, so have an f+.

::: xpcom/ds/nsAtomTable.cpp
@@ +365,5 @@
>  //----------------------------------------------------------------------
>  
> +#define RECENTLY_USED_MAIN_THREAD_ATOM_CACHE_SIZE 31
> +static nsIAtom*
> +  sRecentlyUsedMainThreadAtoms[RECENTLY_USED_MAIN_THREAD_ATOM_CACHE_SIZE] = {};

The motivating idea behind this is that twiddling things on the main thread will tend to be repeated:

* element names,
* attribute names,
* attribute values (e.g. @class)

and things are repeated frequently enough and/or closely enough together that we think we can get a decent hit rate out of the cache?  Might be worth expanding on this rationale in the comments.

I'm a little surprised by the measurements that say you can get a 50% hit rate...what pages did you test this on?  I would find it more believable if we had separate caches for various "kinds" of atoms, but still surprising.

@@ +373,5 @@
>  {
> +  if (NS_IsMainThread()) {
> +    MutexAutoLock lock(*gAtomTableLock);
> +    GCAtomTableLocked(lock, GCKind::RegularOperation);
> +  }

How much are we actually twiddling atoms off the main thread?  I guess we don't do it at all right now except for Stylo builds, though I think Henri had expressed interest in making the HTML parser use this?  Or maybe the parser already does?  I can't remember.

In any event, if we start twiddling things more and more, we risk significant growth in the unused atoms table, which seems not so great.
Attachment #8852295 - Flags: review?(nfroyd) → feedback+
(In reply to Nathan Froyd [:froydnj] from comment #8)
> I'm a little surprised by the measurements that say you can get a 50% hit
> rate...what pages did you test this on? 
any random pages I happen to use. And also browser chrome of course.


> 
> How much are we actually twiddling atoms off the main thread?
bholley wanted to keep the possibility to create atoms off the main thread.


> In any event, if we start twiddling things more and more, we risk
> significant growth in the unused atoms table, which seems not so great.
bholley said stylo shouldn't be using this much.
And as of now, atomization is way too slow, so I think better to fix it now for the use cases we have.
We can then later, if needed, add a runnable which runs GC in main thread
Comment on attachment 8852295 [details] [diff] [review]
wip2

-m "Bug 1351303, add main thread only cache for nsIAtoms to speed up atomization, r=froydnj"
Attachment #8852295 - Flags: review?(nfroyd)
Additional fix for parser. 
I would merge the patches when landing and use the same commit message :)
Attachment #8853152 - Flags: review?(nfroyd)
FWIW, with both patches loading browser UI gives
noncached 1497, cached 8011
Attachment #8853152 - Flags: review?(nfroyd) → review+
Comment on attachment 8852295 [details] [diff] [review]
wip2

Review of attachment 8852295 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with the comment discussing the rationale for this.
Attachment #8852295 - Flags: review?(nfroyd) → review+
Pushed by opettay@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/1825773142a0
add main thread only cache for nsIAtoms to speed up atomization, r=froydnj
https://hg.mozilla.org/integration/mozilla-inbound/rev/8c5b8d43d64f
make HTML parser to use faster atomization in main thread, r=froydnj
https://hg.mozilla.org/mozilla-central/rev/1825773142a0
https://hg.mozilla.org/mozilla-central/rev/8c5b8d43d64f
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Blocks: 1353811
You need to log in before you can comment on or make changes to this bug.