Closed Bug 493399 Opened 15 years ago Closed 15 years ago

Activity Manager: UI fails over with really long things in the activity list

Categories

(Thunderbird :: Mail Window Front End, defect)

defect
Not set
normal

Tracking

(thunderbird3.0 .1-fixed)

VERIFIED FIXED
Thunderbird 3
Tracking Status
thunderbird3.0 --- .1-fixed

People

(Reporter: davida, Assigned: bwinton)

Details

(Whiteboard: [tb3ride-along][fixed RC1 build 2])

Attachments

(2 files, 2 obsolete files)

Attached image screenshot
Something's not optimized right w/ the activity manager UI if there are hundreds of activities showing (as happens w/ current gloda reporting of a new gmail profile w/ tens of thousands of emails autosyncing)


I get the "unresponsive script" dialog a few times before the ui shows up.

Script: chrome://messenger/content/activity.xml:264
Script: file:///Users/davida/Mozilla/builds/c-c/obj-i386-apple-darwin9.6.0-opt/mozilla/dist/Shredder.app/Contents/MacOS/components/nsActivityManager.js:171

and the CPU is pegged at 95%

Two obvious options:
1) don't populate the UI in onload, but do it async, w/ enough yields to let the UI catch up
2) don't show _everything_.

(note that there are bugs to reduce the # of things showing (e.g. bug 493397), which will mitigate the frequency of this bug happening, but we probably want to do one of these two anyway for very-long-running sessions)
Flags: blocking-thunderbird3+
Blake, isn't this fixed by some of your earlier work?
Kinda.  There was Bug 495113, but that was for clearing the activity manager.

We might be running into similar n-squared behaviour for other operations, though, if this is still occurring.  Fortunately, if it has the same cause, then we should just be able to set _ignoreNotifications to true in the problem areas.

I looked in the code from around that time (revisions 2161 and 2565), and activity.xml:264 is the second half of:
document.getAnonymousElementByAttribute(this, "anonid",
  anonid).setAttribute('hidden', !visible);

while nsActivityManager.js:171 is the return in:
  getActivity: function(aID) {
    if (!this._activities[aID])
      throw Cr.NS_ERROR_NOT_AVAILABLE;
    return this._activities[aID];
  },

Neither of those seem particularly related to clearing, but then again, neither of those jump out as the cause of the UI failure.
I don't think this blocks anymore, given your prior work.
Flags: blocking-thunderbird3+ → blocking-thunderbird3-
So, performance is actually cubic in the initial open case and I'm re-nominating this for a blocker.

When opening the activity manager after having thunderbird having been running for several days I get a unified backtrace like so:

 9 0x7f099033f3a8 in nsJSIID::HasInstance at mozilla/js/src/xpconnect/src/xpcjsid.cpp:600
JS 0x7f0964ff43a6 in get_children at chrome://global/content/bindings/richlistbox.xml:303
JS 0x7f0965aba2e0 in getItemAtIndex at chrome://global/content/bindings/richlistbox.xml:173
JS 0x7f0962c1c681 in <anon> at chrome://messenger/content/activity.js:102
JS 0x7f09646dddb9 in <anon> at chrome://messenger/content/activity.js:139
JS 0x7f09646de5b3 in <anon> at chrome://messenger/content/activity.js:240
JS 0x7f096defaa6e in onload at chrome://messenger/content/activity.xul:1
22 0x7f098a6e4eee in nsJSContext::CallEventHandler at mozilla/dom/src/base/nsJSEnvironment.cpp:2085

- The first O(n) is activity.js:240 = startup's O(n) calls (one for each activity item being added) to activity.js:139 = addActivityBinding which calls getActivityGroupBindingByContext.

- The second O(n) is activity.js:102 = getActivityGroupBindingByContext.  It calls this._activitiesView.getItemAtIndex(i) once for each existing item.

- The third O(n) is a bit of a surprise; the getter for children (which getItematIndex uses) looks like this:
  var childNodes = [];
  for (var child = this.firstChild; child; child = child.nextSibling) {
    if (child instanceof Components.interfaces.nsIDOMXULSelectControlItemElement)
      childNodes.push(child);
  }
  return childNodes;

All of these multiply together to get us O(n^3), and an expensive one at that considering that inner O(n)'s instanceof check does not look particularly cheap from the frames callstack I elided.

An immediate stop-gap measure would be to have getActivityGroupBindingByContext just get a copy of 'children' itself and iterate over that, then we are only O(n^2).  However, I would suggest that getActivityGroupBindingByContext should really be O(1).  And unless the activity manager back-end has an internal limit to the number of remembered activities, it really should be time-slicing the insertion.

At the time of fixing I would suggest that the other code paths also be evaluated.
Flags: blocking-thunderbird3- → blocking-thunderbird3?
Another approach (possibly easier?) would be to have the activity manager only keep track of 1000 (or 100, or ...) entries.  I doubt keeping all activities over days is actually useful.
I think a hard limit of 1000 with FIFO ejection seems reasonable, but that's still too many for even an O(n^2) algorithm with the actual constant factors we are talking about.

Setting a lower cap is reasonable, but unless we have prioritized eviction, it might hurt our ability to debug.  I was opening the dialog because I think I was the victim of multiple UID validity rolls sometime in the past few days.  I think with only 100 entries I am unlikely to see the primary indicators of a UID validity roll.
I don't think I'd block on this, though I certainly agree the activity manager has some bad scaling issues.  I think prioritized eviction is the right thing to do in general, but I suspect we have a ton of equally low priority events and very few high priority events, like server errors, and the primary indicators of unexpected events might still get lost in the noise.
Not going to block on this but we'll probably take a patch if its well tested etc.
Flags: blocking-thunderbird3? → blocking-thunderbird3-
Whiteboard: [tb3ride-along]
Assignee: nobody → bwinton
So, I'm not sure if this is O(1), but it should still be O(something small).
Was this the sort of thing you were thinking of?

(While I wait, I'll be adding some code to limit the number of items to a value.)
Attachment #412740 - Flags: review?(bugmail)
clearActivityList and removeActivityBinding still use the O(n^2) style of operation.  I think _activitiesView.children can be saved off and directly iterated over, reducing both to O(n)...
(I'm running into some problems with the limiting, and it's late here, so I'm going to go to bed, and work on it more tomorrow.)
Status: NEW → ASSIGNED
Let's give this one a go.

It's got a limit, and _activitiesView.children, and a _groupCache, so it should be fairly decent.

Thanks,
Blake.
Attachment #412740 - Attachment is obsolete: true
Attachment #412851 - Flags: review?(bugmail)
Attachment #412740 - Flags: review?(bugmail)
r=asuth via irc, with the added changes.
Attachment #412851 - Attachment is obsolete: true
Attachment #412860 - Flags: review+
Attachment #412860 - Flags: approval-thunderbird3?
Attachment #412851 - Flags: review?(bugmail)
Attachment #412860 - Flags: approval-thunderbird3? → approval-thunderbird3+
Checked in:
comm-central: http://hg.mozilla.org/comm-central/rev/bf0582fdf9ac
comm-1.9.1 (default): http://hg.mozilla.org/releases/comm-1.9.1/rev/62848bcc7051
comm-1.9.1 (relbranch): http://hg.mozilla.org/releases/comm-1.9.1/rev/31dbe7e912dc
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Whiteboard: [tb3ride-along] → [tb3ride-along][fixed RC1 build 2]
Target Milestone: --- → Thunderbird 3
Status: RESOLVED → VERIFIED
Whiteboard: [tb3ride-along][fixed RC1 build 2] → [tb3ride-along][fixed RC1 build 2][fixedtb301]
Whiteboard: [tb3ride-along][fixed RC1 build 2][fixedtb301] → [tb3ride-along][fixed RC1 build 2]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: