spend too much time refcounting atoms

RESOLVED FIXED in mozilla0.9.6

Status

()

Core
XPCOM
P3
normal
RESOLVED FIXED
17 years ago
15 years ago

People

(Reporter: dbaron, Assigned: dbaron)

Tracking

(Blocks: 1 bug, {perf})

Trunk
mozilla0.9.6
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(5 attachments)

(Assignee)

Description

17 years ago
AtomImpl, our implementation of nsIAtom, was originally not threadsafe, and
layout used atoms extensively under the assumption that refcounting atoms was
cheap.  Then on March 5, 2000, warren changed atoms to use threadsafe
refcounting.  This means we do lots of expensive locking in layout to refcount
atoms even though layout operates entirely on the main thread.

I tried a quick hack (I'll attach the patch below) to get some performance
numbers on the cost of refcounting atoms.  On btek (Linux, RedHat 6.2, dual
Pentium 500, 256MB RAM) I ran jrgm's tests (run of 5 repeats, cached, after 1
run to get the pages into the cache) and i-bench on a clean profile with the
sidebar hidden, with and without the patch.  I saw the numbers on jrgm's tests
go from 1709ms median / 1710ms average to 1648ms median / 1649ms average, and
the numbers on i-bench go from 305.2/38.87/38.05 (total / uncached / avg of 7
cached) to 292.14/35.59/36.65.  Those are improvements of 3.6% and
4.8%/8.4%/3.7% respectively.  (The uncached i-bench run may not be statistically
significant, but the ~3.6% improvement seems more reliable.  I generally see
very low variation with jrgm's tests, so the change certainly seems significant,
although I didn't repeat the run.)

I see a few possible solutions here:

 * have two separate atom interfaces - one threadsafe and one for the main
thread only

 * stop refcounting atoms, but still pretend we do in the interface, but just
keep them around until the end of XPCOM shutdown

 * the above, but stop pretending we do as well (although we probably use atoms
through JS already, so this would be a bad thing)

 * decide we're willing to pay this performance penalty for where we use atoms
and try to use atoms less

Thoughts?
(Assignee)

Updated

17 years ago
Status: NEW → ASSIGNED
Keywords: perf
Priority: -- → P3
Target Milestone: --- → mozilla1.0.1
(Assignee)

Comment 1

17 years ago
I (or someone else) should measure how many atoms we currently destroy before
XPCOM shutdown (or before we close the last window, anyway).
(Assignee)

Comment 2

16 years ago
Created attachment 54036 [details] [diff] [review]
a possible real solution

Comment 3

16 years ago
I like it! If it works (add a test to TestAtoms.cpp or something), let's do it!
r=waterson

Comment 4

16 years ago
Comment on attachment 54036 [details] [diff] [review]
a possible real solution

r=waterson
Attachment #54036 - Flags: review+
(Assignee)

Comment 5

16 years ago
Created attachment 54037 [details] [diff] [review]
with tests, and crucial typo fix
(Assignee)

Comment 6

16 years ago
The tests work, after I fixed NS_NewPermanentAtom to create permanent atoms in
the case where there wasn't already a non-permanent atom.  (The old patch did
show performance improvement because we probably call NS_NewAtom more than once
for most/all of the atoms that are performance hotspots.)
(Assignee)

Comment 7

16 years ago
I also made the same one-line NS_NewPermanentAtom change to:

layout/mathml/content/src/nsMathMLAtoms.cpp
layout/svg/content/src/nsSVGAtoms.cpp
widget/src/xpwidgets/nsWidgetAtoms.cpp

Comment 8

16 years ago
Comment on attachment 54037 [details] [diff] [review]
with tests, and crucial typo fix

r=waterson. nice!
Attachment #54037 - Flags: review+
Comment on attachment 54037 [details] [diff] [review]
with tests, and crucial typo fix

sr=shaver, but maybe file a bug on the scriptable-atom thing?
Attachment #54037 - Flags: superreview+

Updated

16 years ago
Blocks: 7251

Updated

16 years ago
Blocks: 105431
Target Milestone: mozilla1.0.1 → mozilla0.9.6
(Assignee)

Comment 10

16 years ago
The patch was also missing an |#include "nsAtomTable.h"| added to nsXPComInit.cpp.
(Assignee)

Comment 11

16 years ago
And it crashes on shutdown on Windows (not sure why it doesn't on Linux) because
I was removing entries from the hashtable during enumeration.  And Brendan has
convinced me to switch it to pldhash...
(Assignee)

Comment 12

16 years ago
Created attachment 54180 [details] [diff] [review]
proposed patch with pldhash
(Assignee)

Comment 13

16 years ago
Yet another typo:  I meant to actually use my AtomTableClearEntry, and it should
have been PR_STATIC_CALLBACK, which would have made me catch that I wasn't.
(Assignee)

Comment 14

16 years ago
Created attachment 54190 [details] [diff] [review]
cleaned up patch
(Assignee)

Comment 15

16 years ago
I needed to reorder AtomTableClearEntry to put the clearing before the deleting
because it seems that the hashtable is resizing as it is being destroyed -- I
see a crash on Windows otherwise, and that's the only explanation I can think
of.  I won't attach a new patch.  The function now looks like:

AtomTableClearEntry(PLDHashTable *table, PLDHashEntryHdr *entry)
{
  AtomTableEntry *he = NS_STATIC_CAST(AtomTableEntry*, entry);
  AtomImpl *atom = he->mAtom;
  he->mAtom = 0;
  he->keyHash = 0;
  if (atom->IsPermanent())
    delete atom;
}

Updated

16 years ago
Attachment #54190 - Flags: superreview+
Comment on attachment 54190 [details] [diff] [review]
cleaned up patch

Cool -- any idea of the perf win yet?

Nits/comments:

- Tune alpha min and max to optimize memory utilization?

- Not your error, but could you fix kipp's "it's" glitch here?

* C string is translated into it's unicode equivalent.

- Use a static PLDHashTable to avoid a needless (tho small and one-off) malloc?

+static PLDHashTable* gAtomTable;

You could use nullness of gAtomTable.ops to bootstrap.

- Space after comma below:

+  return nsCRT::HashCode(NS_STATIC_CAST(const PRUnichar*,key));

- Horrors, 0 == lame!

+    if (0 == gAtomTable->entryCount) { 

- Typo, duplicated a after A:

+ * A a threadsafely-refcounted implementation of nsIAtom.  Note that

Fix those, and sr=brendan@mozilla.org

/be
See PL_DHashTableFinish -- the table does not resize as it is being destroyed,
but could ~AtomImpl wrongly thing that its |this| parameter is impermanent, and
call PL_DHashTableOperate(gAtomTable, mString, PL_DHASH_REMOVE), in the case
where you crashed?

/be
(Assignee)

Comment 18

16 years ago
> - Tune alpha min and max to optimize memory utilization?

The numbers of atoms seem to hover in the low 2000's, so I think most reasonable
bounds would lead to a table of size 4096.  Perhaps I should initialize to that
rather than 2048, although the numbers may be lower without MathML and SVG.

> See PL_DHashTableFinish -- the table does not resize as it is being destroyed,
> but could ~AtomImpl wrongly thing that its |this| parameter is impermanent,
> and call PL_DHashTableOperate(gAtomTable, mString, PL_DHASH_REMOVE), in the
> case where you crashed?

~AtomImpl should only be called by AtomTableClearEntry for permanent atoms, and
the remove should only be executed from ~AtomImpl for non-permanent atoms.  Hmm.
(Assignee)

Comment 19

16 years ago
Created attachment 54268 [details] [diff] [review]
patch addressing Brendan's comments
(Assignee)

Comment 20

16 years ago
Windows jrgm test results (avg median/average/min/max):

baseline: 901/(978)/274/(3409) (cache not primed first run)
baseline: 895/889/280/2974
baseline: 899/895/278/3056

atoms: 876/875/291/2925
atoms: 882/879/295/2975

These look like an improvement in the range 2%-2.5%, which is reasonable
considering I saw 3.5% numbers on Linux in my original tests.  I should probably
rerun the numbers on Linux as well.

Comment 21

16 years ago
In AtomImpl::~AtomImpl,

+    if (gAtomTable.entryCount == 0) {
+      PL_DHashTableFinish(&gAtomTable);
+      gAtomTable.entryCount = 0;

Will PL_DHashTableFinish mutate the entryCount? If so, explain why in a comment.
Otherwise, the assign seems like it's unnecessary.

If I understand correctly, a ``normal'' atom will be destroyed by the
implementation of AtomImpl::Release(). Could you add a comment to
AtomTableClearEntry to that effect to explain the conditional destruction? I.e.,
before |if (atom->IsPermanent())|.

Fix those, and r= or sr=waterson.
(Assignee)

Comment 22

16 years ago
> Will PL_DHashTableFinish mutate the entryCount? If so, explain why in a
> comment.  Otherwise, the assign seems like it's unnecessary.

Assignment changed to an assertion that it is already 0, to document that the
code is assuming the implementation of PL_DHashTableFinish doesn't do anything
weird.

> If I understand correctly, a ``normal'' atom will be destroyed by the
> implementation of AtomImpl::Release(). Could you add a comment to
> AtomTableClearEntry to that effect to explain the conditional destruction?
> I.e., before |if (atom->IsPermanent())|.

Comment added.
(Assignee)

Comment 23

16 years ago
Fix checked in 2001-10-20 16:18/19 PDT.
Status: ASSIGNED → RESOLVED
Last Resolved: 16 years ago
Resolution: --- → FIXED
(Why would little ol' PL_DHashTableFinish do anything weird? :-)

/be
You need to log in before you can comment on or make changes to this bug.