Some preliminary data from telemetry suggests that there may have been an increase in cycle collector times from about 50ms to 100ms, starting around 10/12 or 10/13. These are the things that landed in that range: http://hg.mozilla.org/mozilla-central/pushloghtml?startdate=2011-10-11&enddate=2011-10-14 The thing we want to look for is anything that could have caused a slight leak of gray JS or disconnected DOMs. This is within a couple of weeks of when bug 696761 (about the IRCcloud problems) was filed, so that may or may not be related. One suspicious bug in the range is Bug 692884, which landed on the 12th.
From our ongoing Firefox 10 CC investigations, we have a really crude reproduction process that basically involves opening a bunch of Gmail or Zimbra windows with one person's profile and waiting for a day. They've started working on that to try to narrow down the range a bit.
Speculatively marking it with both MemShrink and Snappy.
Err... by "landed on the 12th", I meant, "landed on the 11th at 2am, but probably didn't make it until the 12th Nightly". Bug 632064 ("remove JS_GetScopeChain") landed at the same time and is big and scary looking and involves XPConnect changes. Bug 690961 also touches XPConnect, but it also was backported to 8 and 9 so I guess that means it can't be the cause. Bug 677411 landed on the 11th at 4pm. It touches the GC, but it just affects logging. I guess a silly mistake in there could cause a problem.
10/11 Nightly: http://hg.mozilla.org/mozilla-central/rev/ccea01542d0b
10/12 Nightly: http://hg.mozilla.org/mozilla-central/rev/e0ae39a3298e
10/13 Nightly: http://hg.mozilla.org/mozilla-central/rev/46a6d0fd13d5
10/14 Nightly: http://hg.mozilla.org/mozilla-central/rev/9545b88eed82
Another XPConnecty thing from the 10/12 nightly by Luke: Bug 690825. All these bugs I have listed are in the 10/12 Nightly. I'll start looking at 10/13 and see if there's anything suspicious. Luke, you had a very productive Oct 11. ;) Do any of your bugs from there that I linked here (bug 690825 and bug 632064) seem like something that maybe could leak? We haven't 100% narrowed it down to this day, but it seems suspicious. Frankly, of these bugs, Bug 692884 looks the most troublesome.
The only thing from the 10/13 Nightly that looks possibly troublesome is Bug 693815 - Disable jstracer. It seems unlikely that would cause a leak, but it is a large change.
Looking at the graph, the spike occurs specifically on the 13th, so maybe the things in the build on the 12th are off the hook.
This spike also was mostly on Linux, and a little on Mac. I think there wasn't an analysis of Windows.
This is the set of things that landed in the 10/13 Nightly: http://hg.mozilla.org/mozilla-central/pushloghtml?startdate=2011-10-12&enddate=2011-10-13
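For anyone who wants to pull up the same kind of range for another Nightly, here is a minimal Python sketch that builds a pushlog URL in the same shape as the links in this bug. The function names are made up for illustration, and the assumption that a Nightly's changes fall between the previous day and the Nightly's own date just mirrors the 10/13 link above; it is not an exact rule for when a push makes it into a build.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

# Base URL copied from the pushlog links in this bug.
PUSHLOG_BASE = "http://hg.mozilla.org/mozilla-central/pushloghtml"


def pushlog_url(start: date, end: date) -> str:
    """Build a pushlog URL covering pushes between two dates."""
    if end < start:
        raise ValueError("end date must not be before start date")
    query = urlencode({
        "startdate": start.isoformat(),
        "enddate": end.isoformat(),
    })
    return f"{PUSHLOG_BASE}?{query}"


def nightly_pushlog_url(nightly: date) -> str:
    """Hypothetical helper: range from the previous day to the Nightly date.

    Assumption only: this matches how the 10/13 link in this bug was built.
    """
    return pushlog_url(nightly - timedelta(days=1), nightly)


if __name__ == "__main__":
    print(nightly_pushlog_url(date(2011, 10, 13)))
```

Pushes that land late in the day may not show up until the following Nightly, so treat the resulting range as a starting point rather than a precise boundary.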
Cww was able to reproduce his extreme CC problems on the 11th, so it isn't clear what this increase on the 13th was. We have no real way to investigate it.
Any chance of getting a CC log from that 10/11 build?
I have a couple of other slow CC logs from Cww and Blassey I keep meaning to send you. I'll do that this week. The only problem is they are so huge they are hard to analyze. Blassey's involved 700k JS objects, for instance.
I'll reopen this to track our ongoing regression testing. Cww has reproduced the issue on 10/11, but was unable to reproduce it on 10/4. blassey was also unable to reproduce it on 9/30. dolske had some minor CC cruddiness on 10/4, but not the super badness we've seen before.
10/5: http://hg.mozilla.org/mozilla-central/rev/70e4de45a0d0
10/6: http://hg.mozilla.org/mozilla-central/rev/8c82de08425d
10/7: http://hg.mozilla.org/mozilla-central/rev/c3a50afc2243
10/8: http://hg.mozilla.org/mozilla-central/rev/6c780dcb4b99
10/9: http://hg.mozilla.org/mozilla-central/rev/b4da2d439cbc
10/10: http://hg.mozilla.org/mozilla-central/rev/e9c620a5c85f
Nothing really leaps out at me as being suspicious on any of 10/5 to 10/10. On 10/11, the new DOM bindings went in. That seems like it could be a potential source of problems.
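Since each reproduction attempt takes about a day, it may help to pick the next Nightly to test by bisection. Below is a rough Python sketch under the assumption that 10/4 is the last known-good build and 10/11 the first known-bad one, as reported above. The revisions are the ones listed in this bug; the function names are hypothetical and not part of any existing tool.

```python
from datetime import date

# Nightly date -> mozilla-central revision, copied from the comments in this bug.
NIGHTLIES = {
    date(2011, 10, 5): "70e4de45a0d0",
    date(2011, 10, 6): "8c82de08425d",
    date(2011, 10, 7): "c3a50afc2243",
    date(2011, 10, 8): "6c780dcb4b99",
    date(2011, 10, 9): "b4da2d439cbc",
    date(2011, 10, 10): "e9c620a5c85f",
    date(2011, 10, 11): "ccea01542d0b",
}


def candidate_nightlies(last_good, first_bad, nightlies=NIGHTLIES):
    """Nightlies in which the regression could have first appeared."""
    return [(d, rev) for d, rev in sorted(nightlies.items())
            if last_good < d <= first_bad]


def next_to_test(last_good, first_bad, nightlies=NIGHTLIES):
    """Pick a middle untested Nightly strictly between good and bad, or None."""
    untested = [(d, rev) for d, rev in sorted(nightlies.items())
                if last_good < d < first_bad]
    if not untested:
        return None
    return untested[len(untested) // 2]


if __name__ == "__main__":
    good, bad = date(2011, 10, 4), date(2011, 10, 11)
    for d, rev in candidate_nightlies(good, bad):
        print(d.isoformat(), rev)
    print("test next:", next_to_test(good, bad))
```

If the chosen build reproduces the problem, it becomes the new first-bad date; otherwise it becomes the new last-good date. That would cut the six untested builds down in roughly three more day-long runs, assuming the reproduction is reliable, which so far it is not.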
P2 because it's not clear whether this affects the current release.
Closing this. 11 actually looks a little worse than 10, from telemetry, but 12 is much better than both, and we're not going to be able to do anything for 10 or 11 at this point.