Closed Bug 377787 Opened 18 years ago Closed 14 years ago

Cycle collector performance tracking

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: bzbarsky, Assigned: chofmann)

References

Details

(Keywords: meta, perf)

Attachments

(1 file, 1 obsolete file)

Zipped-up profile as of today 18 years ago Boris Zbarsky [:bzbarsky] 590.78 KB, application/zip		Details
Updated profile 18 years ago Boris Zbarsky [:bzbarsky] 70.69 KB, application/zip		Details

Boris Zbarsky [:bzbarsky]

Reporter

Description

•

18 years ago

This is a tracking bug for performance work on the cycle collector.

Peter Van der Beken [:peterv]

Updated

•

18 years ago

Depends on: 377606

Jesse Ruderman

Updated

•

18 years ago

Keywords: meta, perf

OS: Linux → All

Hardware: PC → All

Boris Zbarsky [:bzbarsky]

Reporter

Comment 1

•

18 years ago

Attached file Zipped-up profile as of today (obsolete) — Details

This is a profile of a build with the patches to bug 373693 applied. To get this, I loaded a large bonsai query in one window, and browsed in another one. Every so often I moused over the bonsai window to trigger traversal of all that stuff. I started and stopped jprof programmatically around the three calls to nsCycleCollector_collect in nsJSEnvironment; if we have other entry points into it (other than shutdown), I'd love to know. The profile largely shows hashtable stuff, actually. I should try the patch in bug 377606; didn't realize it was there.

Boris Zbarsky [:bzbarsky]

Reporter

Comment 2

•

18 years ago

Actually, looks like I can't apply that on top of the patch to bug 373693. I'll just wait for more stuff to land.

Boris Zbarsky [:bzbarsky]

Reporter

Updated

•

18 years ago

Attachment #261877 - Attachment is patch: false

Attachment #261877 - Attachment mime type: text/plain → application/zip

echoes

Comment 3

•

18 years ago

Changing browser.sessionhistory.max_total_viewers in about:config from -1 to 3 produced a noticeable performance increase. Perhaps an at least temporary "workaround" would be to submit a patch that sets it to (at least) 3 on install and, if possible, also changes it where Firefox is already installed. Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9a4pre) Gecko/20070417 Minefield/3.0a4pre - Build ID: 2007041704

Peter Van der Beken [:peterv]

Updated

•

18 years ago

Depends on: 377884

Peter Van der Beken [:peterv]

Updated

•

18 years ago

Depends on: 372110

Boris Zbarsky [:bzbarsky]

Reporter

Comment 4

•

18 years ago

Attached file Updated profile — Details

This was generated the same way as the profile in comment 1, but in a build with the patches for bug 372110, bug 373693, bug 374872, and bug 377606 applied (basically, all the patches checked in as of right now, as far as I can tell). Every so often I hovered the bonsai page to make sure it was purple (from the mousemove event dispatch).

There are 455673 hits under nsCycleCollector::Collect, broken down mostly as:

  191121 nsCycleCollector::MarkRoots()
  164912 nsCycleCollector::ScanRoots()
   43962 nsCycleCollector::CollectWhite()
   31329 nsXPConnect::BeginCycleCollection()
    9144 PL_DHashTableFinish
    7560 nsXPConnect::FinishCycleCollection()
    6565 memset

The time in nsXPConnect::BeginCycleCollection is spent under JS_GC, memset (allocating the hashtable, I guess), and PL_DHashTableFinish.

nsCycleCollector::CollectWhite just calls PL_DHashTableEnumerate, which pretty much just calls FindWhiteCallback. About 40% of the time is spent in FindWhiteCallback itself, another 30% under nsCycleCollector::Forget, 5% in QI on nsTextNodes, 5% in UnmarkPurple on generic elements and text nodes, and 10% in QI on various HTML elements.

nsCycleCollector::ScanRoots basically calls GraphWalker::Walk, which more or less calls nsCycleCollectionXPCOMRuntime::Traverse (there is a detour that gets there through nsXPConnect::Traverse calling GraphWalker::DescribeNode calling scanWalker::VisitNode calling GraphWalker::Walk). The time under there is basically split between:

  canonicalize() -- 24% of ScanRoots
  PL_DHashTableOperate -- 27% of ScanRoots, split up by caller more or less as:
    50% -- GraphWalker::NoteXPCOMChild
    25% -- GraphWalker::Walk
    10% -- nsDocument::GetReference
    10% -- nsBindingManager::GetBinding
     3% -- GraphWalker::NoteScriptChild
     2% -- nsContentUtils::TraverseListenerManager
  ToParticipant -- 10% of ScanRoots.
  nsDeque push and pop -- 8% of ScanRoots (from Walk and NoteXPCOMChild).
  ScanBlackWalker::ShouldVisitNode -- 2.5% of ScanRoots
  In (not under) GraphWalker::NoteXPCOMChild -- 2.5% of ScanRoots
  In (not under) nsGenericElement::cycleCollection::Traverse -- 4% of ScanRoots
  Various other non-hotspotty stuff.

nsCycleCollector::MarkRoots calls GraphWalker::Walk, which mostly calls nsCycleCollectionXPCOMRuntime::Traverse, etc. From the bottom up, we have:

  canonicalize() -- 20% of MarkRoots
  PL_DHashTableOperate -- 37% of MarkRoots, split up by caller more or less as:
    66% -- GraphWalker::NoteXPCOMChild
    16% -- GraphWalker::Walk
     6% -- nsDocument::GetReference
     4% -- nsBindingManager::GetBinding
     2% -- GraphWalker::NoteScriptChild
     2% -- nsContentUtils::TraverseListenerManager
  ToParticipant -- 9% of MarkRoots.
  nsDeque push and pop -- 6% of MarkRoots (from Walk and NoteXPCOMChild).
  In (not under) GraphWalker::NoteXPCOMChild -- 2.5% of MarkRoots
  In (not under) nsGenericElement::cycleCollection::Traverse -- 3% of ScanRoots
  Various other non-hotspotty stuff, I think.

Things that jump out at me:

canonicalize() and ToParticipant() together make up 24% of the total cycle collection time. I wonder whether there's some way to at least cache the results between ScanRoots and MarkRoots.

PL_DHashTableOperate makes up another 25% of the total. Some of these tables we could try to optimize a bit; for example, could we use bits on nodes to indicate whether they have a binding or a preserved wrapper, so that we don't have to hit the hashtable for all the ones that do not?

The basic problem really is that we're walking a _lot_ of stuff here. :( The pause I was profiling is about 3 seconds long on my machine, without jprof running.

Note: I got some of the data above using the -i and -e options to jprof; I'm not sure it's possible to get some of those numbers directly out of the profile that I'm attaching.
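
As a rough cross-check of the two "share of total" figures in this comment (24% and 25%), the per-phase percentages can be weighted by the MarkRoots and ScanRoots hit counts and divided by the hits under nsCycleCollector::Collect. The sketch below is only illustrative arithmetic: the name ShareOfTotal and the constants are assumptions made for this example, and any time spent in these functions outside MarkRoots and ScanRoots is ignored.

```cpp
#include <cassert>
#include <cmath>

// Hit counts quoted in this comment's profile.
constexpr double kTotalHits = 455673.0;      // under nsCycleCollector::Collect
constexpr double kMarkRootsHits = 191121.0;  // nsCycleCollector::MarkRoots()
constexpr double kScanRootsHits = 164912.0;  // nsCycleCollector::ScanRoots()

// Hypothetical helper: converts a function's share of MarkRoots and its
// share of ScanRoots (both as fractions in [0, 1]) into a share of the
// total cycle collection hits.
double ShareOfTotal(double fractionOfMarkRoots, double fractionOfScanRoots) {
  assert(fractionOfMarkRoots >= 0.0 && fractionOfMarkRoots <= 1.0);
  assert(fractionOfScanRoots >= 0.0 && fractionOfScanRoots <= 1.0);
  const double hits = fractionOfMarkRoots * kMarkRootsHits +
                      fractionOfScanRoots * kScanRootsHits;
  return hits / kTotalHits;
}
```

For canonicalize() plus ToParticipant() the inputs would be 0.20 + 0.09 = 0.29 of MarkRoots and 0.24 + 0.10 = 0.34 of ScanRoots; for PL_DHashTableOperate they would be 0.37 and 0.27.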

Attachment #261877 - Attachment is obsolete: true

Boris Zbarsky [:bzbarsky]

Reporter

Comment 5

•

18 years ago

Filed bug 378389 on the preserved wrapper table and bug 378390 on the binding manager table.
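
For reference, the "bits on nodes" idea from comment 4 could look roughly like the sketch below. This is not the actual Gecko code and not necessarily what the patches in those bugs do: Node, Binding, BindingTable and kHasBinding are hypothetical stand-ins, and std::unordered_map takes the place of the PLDHashTable used in the tree. The point is only that a per-node flag lets the common "no entry" case skip the hashtable entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for a DOM node; not the real node class.
struct Node {
  static constexpr std::uint32_t kHasBinding = 1u << 0;
  std::uint32_t flags = 0;
};

// Hypothetical payload stored in the side table.
struct Binding {
  int id;
};

class BindingTable {
 public:
  // Store a binding and set the flag bit on the node.
  void Set(Node& node, Binding binding) {
    table_[&node] = binding;
    node.flags |= Node::kHasBinding;
  }

  // Drop the binding and clear the flag bit.
  void Remove(Node& node) {
    table_.erase(&node);
    node.flags &= ~Node::kHasBinding;
  }

  // Fast path: nodes without the flag never touch the hashtable.
  const Binding* Get(const Node& node) {
    if (!(node.flags & Node::kHasBinding)) {
      return nullptr;
    }
    ++lookups_;
    auto it = table_.find(&node);
    assert(it != table_.end());  // invariant: flag set implies an entry
    return &it->second;
  }

  // Number of real hashtable lookups performed by Get().
  std::size_t lookups() const { return lookups_; }

 private:
  std::unordered_map<const Node*, Binding> table_;
  std::size_t lookups_ = 0;
};
```

The cost of this design is that every path that adds or removes a table entry has to keep the flag in sync, which is why the sketch funnels all mutations through Set() and Remove().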

Depends on: 378389, 378390

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Updated

•

18 years ago

Depends on: 378514

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Comment 6

•

14 years ago

Status: NEW → RESOLVED

Closed: 14 years ago

Resolution: --- → FIXED

David Zbarsky (:dzbarsky)

Updated

•

13 years ago

No longer blocks: 698919

David Zbarsky (:dzbarsky)

Updated

•

13 years ago

Blocks: 698919

Nobody; OK to take it and work on it

Updated

•

9 years ago

Product: Core → Core Graveyard
