624549 - Overly aggressive calls to nsJSContext::MaybeCC() cause pauses in Flight of the Navigator

Reporter

Description

•

15 years ago

There are bad GC pauses in the Flight of the Navigator demo that seem to be caused by cycle collection happening too often: 0.0% 51.3% XUL nsUITimerCallback::Notify(nsITimer*) 0.0% 51.3% XUL nsObserverService::NotifyObservers(nsISupports*, char const*, unsigned short const*) 0.0% 51.3% XUL nsObserverList::NotifyObservers(nsISupports*, char const*, unsigned short const*) 0.0% 51.3% XUL nsUserActivityObserver::Observe(nsISupports*, char const*, unsigned short const*) 0.0% 51.3% XUL nsJSContext::MaybeCC(int) 0.0% 51.3% XUL nsJSContext::CC(nsICycleCollectorListener*) 0.0% 51.2% XUL nsXPConnect::GarbageCollect() 0.0% 51.2% XUL nsXPConnect::Collect() 0.0% 51.2% XUL js_GC(JSContext*, JSCompartment*, JSGCInvocationKind)

Patrick Walton (:pcwalton)

Reporter

Comment 1

•

15 years ago

Maybe nsUserActivityObserver::Observe() should avoid calling MaybeCC() if animations are going on? We could check the paint count and if it's continually rising we can avoid calling MaybeCC().

Patrick Walton (:pcwalton)

Reporter

Updated

•

15 years ago

Summary: Overly aggressive calls to nsJSContext::MaybeCC() causes pauses in Flight of the Navigator → Overly aggressive calls to nsJSContext::MaybeCC() cause pauses in Flight of the Navigator

Patrick Walton (:pcwalton)

Reporter

Updated

•

15 years ago

Assignee: nobody → pwalton

Brendan Eich [:brendan]

Comment 2

•

15 years ago

This bug (and pcwalton's frame rate tool + Shark) are hawt :-|. /be

Patrick Walton (:pcwalton)

Reporter

Comment 3

•

15 years ago

Attached patch Proposed patch. (obsolete) — Details — Splinter Review

Proposed patch attached. This simplifies our CC logic to just two timers: * The "opportunistic" timer, which fires once every 5 seconds after any "user interaction event", which is defined as a UI event or a repaint. * The "guaranteed" timer, which fires every 5 minutes. Each time any of these timers fires, a cycle collection is performed. This tries to ensure that cycle collections happen soon after any UI event, but don't block the events. At the same time, it makes sure that cycle collections *will* happen eventually, no matter what, to guard against a persistent animation starving the CC.

Attachment #502709 - Flags: feedback?(gal)

Patrick Walton (:pcwalton)

Reporter

Comment 4

•

15 years ago

Feedback from smaug would be appreciated too: I'm sure I missed some edge cases in there :)

Andreas Gal :gal

Comment 5

•

15 years ago

Comment on attachment 502709 [details] [diff] [review] Proposed patch. This looks awesome. Change GC to CC in the constants. Sometimes you indent by 4. Also + return mGuaranteedTimer->InitWithCallback(static_cast<nsITimerCallback*(this), NS_GUARANTEED_GC_DELAY, + nsITimer::TYPE_ONE_SHOT); looks a lot nicer than ( overhand.

Attachment #502709 - Flags: feedback?(gal) → feedback+

Boris Zbarsky [:bzbarsky]

Updated

•

15 years ago

Attachment #502709 - Flags: feedback?(Olli.Pettay)

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 6

•

15 years ago

Have you posted the patch to tryserver? It is, or at least was, *very* easy to regress at least tp4 when changing CC scheduling. And Fennec may not want a "guaranteed timer", since IIRC the aim has been to remove all unnecessary timers which run when the browser is otherwise idle.

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 7

•

15 years ago

And the patch makes us call CC even in cases there are no new suspected objects. If user moves mouse all the time, CC is called after 5 mins? That is a long time. *Lots* of garbage can be created during that time. I mean hundreds of megabytes. And the 5 mins is also the time how often CC is called when user isn't active? The user-interaction-active notification in nsDisplayListBuilder::EnterPresShell is strange. Painting has nothing to do with user interaction. And I'd guess that notifying during painting could show up in the profiles. If something like that is needed, perhaps nsDisplayListBuilder could have some timestamp when the last paint happened and CC scheduling could use that? How does https://bugzilla.mozilla.org/show_bug.cgi?id=517665#c16 behave with the patch? (Remember to not interact with the browser at all while running that test.)

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 8

•

15 years ago

Btw, what is the URL for the testcase for this bug?

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 9

•

15 years ago

(In reply to comment #7) > How does https://bugzilla.mozilla.org/show_bug.cgi?id=517665#c16 > behave with the patch? Actually, the patch shouldn't affect to this after all.

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 10

•

15 years ago

Attached file a testcase for CC — Details

This creates lots of cycle collectable objects.

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 11

•

15 years ago

(In reply to comment #10) > Created attachment 502773 [details] With the patch it is easy to OOM.

David Humphrey (:humph)

Comment 12

•

15 years ago

(In reply to comment #8) > Btw, what is the URL for the testcase for this bug? http://videos.mozilla.org/serv/mozhacks/flight-of-the-navigator/

URL: http://videos.mozilla.org/serv/mozhac...

Mike Shaver (:shaver emeritus)

Comment 13

•

15 years ago

(In reply to comment #10) > Created attachment 502773 [details] > a testcase for CC > > This creates lots of cycle collectable objects. Without the patch, running that test for a few seconds and then closing the tab causes my I'm-pretty-sure-CC-related periodic pauses to Get Big. Do we not do a CC on tab close or bfcache expiry? Even after like 5 minutes, they haven't gone away and I'm still at 1.6GB of memory.

Patrick Walton (:pcwalton)

Reporter

Comment 14

•

15 years ago

(In reply to comment #7) > And the patch makes us call CC even in cases there are no new suspected > objects. > > If user moves mouse all the time, CC is called after 5 mins? > That is a long time. *Lots* of garbage can be created during that time. > I mean hundreds of megabytes. > And the 5 mins is also the time how often CC is called when user isn't > active? I'm dropping it down to 1 minute after a conversation with Andreas. Do you think that's better? > And Fennec may not want a "guaranteed timer", since IIRC the aim has been > to remove all unnecessary timers which run when the browser is otherwise idle. How about keying it to user interaction events (i.e. every so often, a cycle collection gets automatically run)?

Patrick Walton (:pcwalton)

Reporter

Comment 15

•

15 years ago

That is, after N user interaction events, a cycle collection is always run. Also, IMHO regressing tp4 is something we should consider taking. It seems to me that this is a big win for users: essentially, the status quo right now is that all benefits of compartmental GC are lost due to aggressive cycle collection. I haven't found a single page that benefits in a significant way from compartmental GC without this patch, because the CCs constantly trigger full GC.

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 16

•

15 years ago

(In reply to comment #15) > That is, after N user interaction events, a cycle collection is always run. That is the current behavior, although currently there are other parameters too, like the number of new suspected objects, and never run CC more often than every 10 seconds. > Also, IMHO regressing tp4 is something we should consider taking. I agree. One reason for the complexity of CC scheduling parameters is the do-not-regress-any-perf-test. > It seems to > me that this is a big win for users: essentially, the status quo right now is > that all benefits of compartmental GC are lost due to aggressive cycle > collection. Is the time really spent in cycle collection or in garbage collection? > I haven't found a single page that benefits in a significant way > from compartmental GC without this patch, because the CCs constantly trigger > full GC. The problem with the patch is that it really doesn't handle cases when lots of cycle collectable garbage (not JS objects) is created. Have you debugged what causes CC to run? I mean are there lots of new suspected objects or what? Could we just increase NS_MIN_SUSPECT_CHANGES and/or NS_MAX_SUSPECT_CHANGES significantly?

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 17

•

15 years ago

(In reply to comment #13) > Without the patch, running that test for a few seconds and then closing the tab > causes my I'm-pretty-sure-CC-related periodic pauses to Get Big. Do we not do > a CC on tab close or bfcache expiry? No. Currently we try to do CC when user is inactive and/or there are some new suspected objects. But if the pauses stay really long, that sounds like something is keeping the document alive. The testcase creates a *huge* document.

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 18

•

15 years ago

(In reply to comment #12) > http://videos.mozilla.org/serv/mozhacks/flight-of-the-navigator/ Unfortunately this doesn't work on my Linux machine (because of WebGL). I'll test this all on Mac.

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 19

•

15 years ago

Patrick, or anyone, have you tried flight-of-the-navigator with the patch for Bug 565217?

Brendan Eich [:brendan]

Comment 20

•

15 years ago

We saw MaybeCC invoked full-JS-GC more often than every 10 seconds during FotN with Patrick's frame-rate monitor, which automatically sharks the culprit(s). So something is already buggy there (on file?). Doing CC on a fixed timer seems wrong in any scenario. It'll chew battery. It will jank your demo at the worst possible moment, Murphy sez. /be

Patrick Walton (:pcwalton)

Reporter

Comment 21

•

15 years ago

(In reply to comment #16) > Have you debugged what causes CC to run? I mean are there lots of > new suspected objects or what? Lots of user-interaction-inactive I believe; I'll add some debugging in and report back. The reason I replaced the logic with a timer is that user-interaction-active was triggering calls to MaybeCC(), which struck me as odd - it meant that moving the mouse over a page rapidly would cause cycle collection, unless I'm misunderstanding the code. > Could we just increase NS_MIN_SUSPECT_CHANGES and/or NS_MAX_SUSPECT_CHANGES > significantly? I do think that we want to add some kind of animation heuristic. Interrupting an animation to do CC hurts our perception of performance, even when no user interaction is occurring. This is the main problem in FOTN.

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 22

•

15 years ago

(In reply to comment #20) > Doing CC on a fixed timer seems wrong in any scenario. It'll chew battery. It > will jank your demo at the worst possible moment, Murphy sez. Currently CC doesn't run on a fixed timer. When user is active or just became inactive, there is a timer which fires the active/inactive notifications. Otherwise CC is triggered by thread observer. Well, page loads can also trigger CC.

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 23

•

15 years ago

(In reply to comment #21) > Lots of user-interaction-inactive I believe; I'll add some debugging in and > report back. Ah, that is something we should tweak. Maybe the time from user-interaction-inactive to cycle collection should be quite long. > The reason I replaced the logic with a timer is that user-interaction-active > was triggering calls to MaybeCC(), which struck me as odd - it meant that > moving the mouse over a page rapidly would cause cycle collection, unless I'm > misunderstanding the code. CC shouldn't run more often than every 10 seconds, and it should run only if there are some new suspected objects. It is very possible that the value of those constants aren't what we want. > I do think that we want to add some kind of animation heuristic. Interrupting > an animation to do CC hurts our perception of performance, even when no user > interaction is occurring. This is the main problem in FOTN. But we must CC every now and then if lots of cycle collectable garbage is created.

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 24

•

15 years ago

See also https://bugzilla.mozilla.org/show_bug.cgi?id=565217#c7 "I was hoping to run the CC a lot more often with this patch, since we can stop it at any time."

Andreas Gal :gal

Comment 25

•

15 years ago

Keep in mind that I have a patch up for review that makes the cycle collector interruptible, so scheduling it becomes much less of a problem.

Ben Turner (not reading bugmail, use the needinfo flag!)

Comment 26

•

15 years ago

(In reply to comment #25) > Keep in mind that I have a patch up for review that makes the cycle collector > interruptible, so scheduling it becomes much less of a problem. Well... in an ideal world, yes. Right now we don't normally stay idle long enough to CC fully.

Brendan Eich [:brendan]

Comment 27

•

15 years ago

Probably not only on Mac OS X. We need to do something low-risk but effective for this bug in Firefox 4. I think we can, based on progress so far. /be

blocking2.0: --- → ?

OS: Mac OS X → All

Hardware: x86 → All

Brendan Eich [:brendan]

Comment 28

•

15 years ago

Interruptible CC is not the answer. It has enough unknowns to be "not Firefox 4" and as bent says, it didn't even help other hard cases because the CC was starved and had to run fully. It's a fine thing to experiment with for a later release. We have mistuning of existing tunables, and as Patrick observes complete lack of animation tuning as well. This can be more safely and locally fixed, I bet. /be

Patrick Walton (:pcwalton)

Reporter

Comment 29

•

15 years ago

So, without this patch, moving the mouse repeatedly over FOTN seems to make the CCs go away. Could we just make paints count as user interaction as a quick fix?

Alon Zakai (:azakai)

Comment 30

•

15 years ago

> I do think that we want to add some kind of animation heuristic. Perhaps a more general heuristic could be used? Animations aren't the only thing that is noticeable by users if GC pauses occur. Sound, realtime JS calculations (not necessarily rendered as animations), are other potential examples. It should be feasible to check how much the main thread was idle, for example (that is, how large the interval is between events running there). An animation would interrupt the main thread frequently, as would other timing-sensitive things. In other words, a GC pause will introduce a delay in the main thread, we can check based on recent activity how disruptive that delay would be.

Ben Turner (not reading bugmail, use the needinfo flag!)

Comment 31

•

15 years ago

(In reply to comment #30) > It should be feasible to check how much the main thread was idle, for example > (that is, how large the interval is between events running there). An animation > would interrupt the main thread frequently, as would other timing-sensitive > things. Our main thread is almost never idle right now, so this approach is not really going to help at the moment. In the future this might work though.

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 32

•

15 years ago

I was debugging FOTN a bit, and the reason it triggers CC quite often is that it actually creates plenty of suspected objects, and some objects are really garbage which needs to be collected occasionally. I think we could change nsJSContext::CC so that it doesn't always call GC. (It is GC which is slow in FOTN, not CC.) Some cycles are collectable even if the GC didn't run just before CC - in FOTN case this approach seems to work well. I also increased NS_MIN_SUSPECT_CHANGES and NS_MAX_SUSPECT_CHANGES. I need to now figure out how to build compartmental GC, and then run some more tests.

Alon Zakai (:azakai)

Comment 33

•

15 years ago

(In reply to comment #31) > (In reply to comment #30) > > It should be feasible to check how much the main thread was idle, for example > > (that is, how large the interval is between events running there). An animation > > would interrupt the main thread frequently, as would other timing-sensitive > > things. > > Our main thread is almost never idle right now, so this approach is not really > going to help at the moment. In the future this might work though. Is that true if the user is not generating interaction events? All the threads (including the main one) get almost entirely silent in that case, when I tested (on Fennec). There remain a few rare events, but it was less than 1 in a few seconds, whereas if an animation is running we get dozens of events each second - so the two cases are very easy to distinguish.

Brendan Eich [:brendan]

Comment 34

•

15 years ago

Compartment-wise GC is on. The good work Andreas et al. did to separate CC and GC will pay off here if it is indeed the case that there's little to cycle-collect. We need to avoid the global mark phase. We could even do a full GC before starting an intensive JS app or animiation to seed the gray bits. /be

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 35

•

15 years ago

Attached patch alternative approach — Details — Splinter Review

Patrick, want to try this? This makes us call cc usually without gc, but sometimes, if gc hasn't run, with gc. I'll investigate if we need some special handling for animation. Btw, on my machine CC takes about 7ms, GC several hundreds ms in FOTN.

Attachment #502903 - Flags: feedback?(pwalton)

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Updated

•

15 years ago

Attachment #502709 - Flags: feedback?(Olli.Pettay) → feedback-

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 36

•

15 years ago

(In reply to comment #35) > Created attachment 502903 [details] [diff] [review] Uploaded to tryserver.

Patrick Walton (:pcwalton)

Reporter

Updated

•

15 years ago

Attachment #502903 - Flags: feedback?(pwalton) → feedback+

Patrick Walton (:pcwalton)

Reporter

Comment 37

•

15 years ago

With this patch, all GCs due to cycle collection are gone in Flight of the Navigator. Awesome!

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 38

•

15 years ago

Oh, really. I still get some GCs, but apparently they are coming from jseng.

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 39

•

15 years ago

Er, right, you said "due to cycle collection". Yes, that is what I see. I'm now trying to figure out if there are some problematic cases for the approach.

Damon Sicore (:damons)

Comment 40

•

15 years ago

(In reply to comment #37) > With this patch, all GCs due to cycle collection are gone in Flight of the > Navigator. Awesome! Very, very, nice. Good job, guys. :)

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 41

•

15 years ago

There probably should be a forced GC at some point; maybe when user is inactive and there isn't any "animation".

Mike Shaver (:shaver emeritus)

Comment 42

•

15 years ago

We should force GC when we see task manager start up, or the user visits about:memory. (half-kidding)

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 43

•

15 years ago

I'll run Massif tomorrow to see how the patch behaves with the testcase from http://blog.mozilla.com/jseward/2011/01/07/space-profiling-the-browser/. It is possible that memory usage goes up way too much, but then it is time to tweak CC/GC scheduling parameters some more.

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 44

•

15 years ago

Attached patch + additional GC in DOMWindowUtils — Details — Splinter Review

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Updated

•

15 years ago

Attachment #503128 - Attachment description: + addition GC in DOMWindowUtils → + additional GC in DOMWindowUtils

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 45

•

15 years ago

There weren't any significant changes to perf test results in tryserver. I'm using the patch atm, and so far things seem to be ok. I have lots of tab open (over 200 or so), and when closing them, memory is released the way it should be and I don't see bad CC hiccups.

blocking2.0: ? → ---

OS: All → Mac OS X

Hardware: All → x86

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 46

•

15 years ago

Comment on attachment 503128 [details] [diff] [review] + additional GC in DOMWindowUtils Andreas, what do you think about this?

Attachment #503128 - Flags: review?(gal)

Andreas Gal :gal

Comment 47

•

15 years ago

Comment on attachment 503128 [details] [diff] [review] + additional GC in DOMWindowUtils Looks good. I would like to tune this further after 4. Where does the 5000 come from? Gut feeling or did we measure that? Patrick, can you try this patch and see what the lag looks like?

Attachment #503128 - Flags: review?(gal) → review+

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 48

•

15 years ago

(In reply to comment #47) > Where does the 5000 come > from? Gut feeling or did we measure that? Sort of both. I did test several common web apps at some point and tried to come up with constants which worked reasonably well then. But it was long ago, and both browser UI and web apps have changed. attachment 503128 [details] [diff] [review] is otherwise the same as previous patch, but I wanted to keep DOMWindowUtils::GarbageCollection behavior the same as before.

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 49

•

15 years ago

And this all is something which really needs some real world testing. There are just so many different kinds of web apps, and different ways to use a browser. It is possible that some tweaking is needed.

Patrick Walton (:pcwalton)

Reporter

Updated

•

15 years ago

Attachment #502709 - Attachment is obsolete: true

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Updated

•

15 years ago

Attachment #503128 - Flags: approval2.0?

Patrick Walton (:pcwalton)

Reporter

Comment 50

•

15 years ago

Just ran FOTN before and after. Before this patch, we have 38 GC pauses. With this patch applied, we have 18 GC pauses. That's a reduction in GC pause time of 54%. Moreover, with this patch, the pauses are concentrated in the part when the ship flies into the city, where more garbage is being created (I believe), which is what we would expect.

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 51

•

15 years ago

I landed the patch, but it seemed to trigger an unrelated null pointer crash in nsPresContext. Backed out. I'll file a bug for the prescontext bug.

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 52

•

15 years ago

(In reply to comment #51) > it seemed to trigger an unrelated null pointer crash > in nsPresContext. Hmm, not quite sure about the crash yet.

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Updated

•

15 years ago

Depends on: 625311

:Ms2ger (he/him; ⌚ UTC+1/+2)

Comment 53

•

15 years ago

FTR: http://hg.mozilla.org/mozilla-central/rev/54184cfa6f0e http://hg.mozilla.org/mozilla-central/rev/42a1bfa0dfd1

Version: unspecified → Trunk

Phil Ringnalda (:philor)

Comment 54

•

15 years ago

I'm also puzzled by bug 561007 - it failed a couple of times last spring, then nothing until October 12th when it failed three times (two of which you starred), then nothing until while this was in and it failed another five or so times.

Phil Ringnalda (:philor)

Comment 55

•

15 years ago

And, incomprehensible though it is, js1_5/extensions/regress-452168.js timed out only on OS X 10.5 debug jsreftests, both times.

David Humphrey (:humph)

Updated

•

15 years ago

Blocks: 625871

David Humphrey (:humph)

Comment 56

•

15 years ago

We've been using this patch a lot to build or ff4 demo, and it's a work of art perf wise. Not sure where what people think about getting it in for 4. Any update on issues with backout?

Patrick Walton (:pcwalton)

Reporter

Comment 57

•

15 years ago

(In reply to comment #56) > We've been using this patch a lot to build or ff4 demo, and it's a work of art > perf wise. Not sure where what people think about getting it in for 4. Any > update on issues with backout? I believe the consensus was to just try landing it again?

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 58

•

15 years ago

I've tried to run it again today with some other relevant patches. There is still one test failure though.

Patrick Walton (:pcwalton)

Reporter

Comment 59

•

15 years ago

What's the failure?

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 60

•

15 years ago

TEST-UNEXPECTED-FAIL | chrome://mochitests/content/chrome/toolkit/components/places/tests/chrome/test_381357.xul | Test timed out. Haven't managed to reproduce that locally, but it happens quite reliably on tryserver.

Robert Sayre

Comment 61

•

15 years ago

should this be approved given the test failure?

Andreas Gal :gal

Comment 62

•

15 years ago

Approval or not, it can't land until the regression is fixed. smaug? patrick?

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 63

•

15 years ago

Working on it.

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 64

•

14 years ago

Just posted a bit less regression risky patch to tryserver (with Bug 614347)

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 65

•

14 years ago

Attached patch v4 — Details — Splinter Review

With this the failure doesn't happen so often. Locally I still haven't managed to reproduce the problem even with the original patch. Note, the problem has happened earlier, even without the patch, but something triggers it to happen a lot more often when CC/GC handling is changed. This patch is anyway less regression risky, since it doesn't change page load handling. Need to still investigate this some more... Have to figure out what in the test is problematic. (though, I'm not 100% sure which test causes the problem)

Brendan Eich [:brendan]

Comment 66

•

14 years ago

The test is timing out, and it uses places (livemarks). And the bug pre-exists, just is easier to repro with the patch that we want to land. I suspect the native code involved is to blame, and unless the patch breaks some non-test user-facing scenario, we benefit more by taking this bug's patch and doing something to get the test out of our way. We should find and fix the bug, where ever it may be, but we need to ship Firefox 4 and we would like to reduce these CC pauses. Cc'ing some big guns to see whether they agree. /be

Robert Sayre

Comment 67

•

14 years ago

disabling the test seems ok, if we are sure which test is acting up.

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 68

•

14 years ago

Sayrer, it is your test ;) and yes, disabling that one particular test does seem to make tests green. But I'm still trying to understand why it fails occasionally.

Laura Mesa [:lmesa] [:lvmesa]

Comment 69

•

14 years ago

Just an fyi, but this bug is blocking one of the demos slated for the fx 4 release.

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 70

•

14 years ago

So far I haven't figured out what causes the problem. I can't reproduce the problem locally. Something somewhere down deep under reloadLivemarkFolder() fails. I guess we could disable that test for now, take the patch, and I could continue to debug why test_381357.xul fails occasionally (rarer without the patch). Debugging using just tryserver and code inspection is slow. Changing CC/GC scheduling is rather scary and certainly needs time for real world testing before the release, and we should be ready to back the patch out it if causes problems.

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Updated

•

14 years ago

Attachment #503128 - Flags: approval2.0?

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 71

•

14 years ago

Comment on attachment 506534 [details] [diff] [review] v4 Andreas, what you think of this. This one doesn't change the CC handling during page loads, so it should be less-risky. I would need to disable that one test for now. (like in http://hg.mozilla.org/try/rev/5620932f1e10)

Attachment #506534 - Flags: review?(gal)

Brendan Eich [:brendan]

Comment 72

•

14 years ago

sayrer already said ok to disable test. Followup bug filed and cited please. I can double-authorize if you need it. /be

Andreas Gal :gal

Comment 73

•

14 years ago

Comment on attachment 506534 [details] [diff] [review] v4 >diff --git a/dom/base/nsDOMWindowUtils.cpp b/dom/base/nsDOMWindowUtils.cpp >--- a/dom/base/nsDOMWindowUtils.cpp >+++ b/dom/base/nsDOMWindowUtils.cpp >@@ -650,16 +650,19 @@ nsDOMWindowUtils::GarbageCollect(nsICycl > { > // Always permit this in debug builds. > #ifndef DEBUG > if (!IsUniversalXPConnectCapable()) { > return NS_ERROR_DOM_SECURITY_ERR; > } > #endif > >+ if (nsContentUtils::XPConnect()) { >+ nsContentUtils::XPConnect()->GarbageCollect(); >+ } Why are we adding a mandatory GC back here? That doubles the latency of this call? > // nsCycleCollector_collect() no longer forces a JS garbage collection, > // so we have to do it ourselves here. >- if (nsContentUtils::XPConnect()) { >+ if (nsContentUtils::XPConnect() && >+ (aForceGC || >+ (!GetGCRunsSinceLastCC() && Why do we care whether the GC has run since the last CC?

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 74

•

14 years ago

(In reply to comment #73) > Comment on attachment 506534 [details] [diff] [review] > v4 > > >diff --git a/dom/base/nsDOMWindowUtils.cpp b/dom/base/nsDOMWindowUtils.cpp > >--- a/dom/base/nsDOMWindowUtils.cpp > >+++ b/dom/base/nsDOMWindowUtils.cpp > >@@ -650,16 +650,19 @@ nsDOMWindowUtils::GarbageCollect(nsICycl > > { > > // Always permit this in debug builds. > > #ifndef DEBUG > > if (!IsUniversalXPConnectCapable()) { > > return NS_ERROR_DOM_SECURITY_ERR; > > } > > #endif > > > >+ if (nsContentUtils::XPConnect()) { > >+ nsContentUtils::XPConnect()->GarbageCollect(); > >+ } > > Why are we adding a mandatory GC back here? That doubles the latency of this > call? So that all our tests using nsDOMWindowUtils::GarbageCollect work the way they used to. Actually, I can now just call ::CC with force gc parameter. > Why do we care whether the GC has run since the last CC? We used to call GC, while it was still part of CC.

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 75

•

14 years ago

Attached patch v6 — Details — Splinter Review

Attachment #507845 - Flags: review?(gal)

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Updated

•

14 years ago

Attachment #506534 - Flags: review?(gal)

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Updated

•

14 years ago

Depends on: 561007

Andreas Gal :gal

Comment 76

•

14 years ago

Comment on attachment 507845 [details] [diff] [review] v6 Thanks for spending so much time on this smaug. Awesome that we got a fix for this for FF4.

Attachment #507845 - Flags: review?(gal) → review+

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Updated

•

14 years ago

Attachment #507845 - Flags: approval2.0?

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Updated

•

14 years ago

Assignee: pwalton → Olli.Pettay

Johnny Stenback (:jst)

Updated

•

14 years ago

Attachment #507845 - Flags: approval2.0? → approval2.0+

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 77

•

14 years ago

http://hg.mozilla.org/mozilla-central/rev/3ea2b5a7c9c8

Status: NEW → RESOLVED

Closed: 14 years ago

Resolution: --- → FIXED

Boris Zbarsky [:bzbarsky]

Comment 78

•

14 years ago

For what it's worth, this caused 10% regression in the Dromaeo (CSS) test on Mac 10.6. Not sure what that translates to in terms of the "real" dromaeo test.

Andreas Gal :gal

Comment 79

•

14 years ago

Thats bad :( We should find out why.

Andreas Gal :gal

Updated

•

14 years ago

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 80

•

14 years ago

Yeah, changes to perf results are very likely when tweaking CC or GC scheduling :(

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 81

•

14 years ago

Should the patch be backed out?

Mike Shaver (:shaver emeritus)

Comment 82

•

14 years ago

This bug is fixed, until it *is* backed out. A separate bug for the regression seems better than using comments 80+ for it here; if the conclusion is to back it out, then we can reopen, but IMO it's hard to imagine wanting 10% on one dromaeo subtest more than the responsiveness wins here.

Status: REOPENED → RESOLVED

Closed: 14 years ago → 14 years ago

Resolution: --- → FIXED

Brendan Eich [:brendan]

Comment 83

•

14 years ago

Agree with shaver -- dromaeo has enough issues and it's the tail that ought not wag the dog here. /be

Brendan Eich [:brendan]

Comment 84

•

14 years ago

Do file and investigate as time allows, for sure. Tails can tell dogs to jump when doors are slamming, so we want to know if it's a heavy front door or a light screen door (or slight motion of air) before too long. /be

Andreas Gal :gal

Comment 85

•

14 years ago

Yeah lets leave the patch in I concur but we need a bug to investigate this. I can work on that bug with patrick. smaug want to file it?

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Updated

•

14 years ago

Depends on: 630156

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Updated

•

14 years ago

Depends on: 614347

Johnny Stenback (:jst)

Updated

•

14 years ago

No longer depends on: 614347

Johnny Stenback (:jst)

Updated

•

14 years ago

Depends on: 614347

Boris Zbarsky [:bzbarsky]

Updated

•

14 years ago

Depends on: 630877

Andreas Gal :gal

Comment 86

•

14 years ago

Attached patch API for per-compartment GCs — Details — Splinter Review

Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1)

Assignee

Comment 87

•

14 years ago

That patch was for Bug 624867, I assume.

Peter Van der Beken [:peterv]

Updated

•

14 years ago

Depends on: 630947

Boris Zbarsky [:bzbarsky]

Updated

•

14 years ago

Depends on: 631494

Nobody; OK to take it and work on it

Updated

•

6 years ago

Component: DOM → DOM: Core & HTML

Proposed patch. 15 years ago Patrick Walton (:pcwalton) 22.94 KB, patch	gal : feedback+ smaug : feedback-	Details \| Diff \| Splinter Review
a testcase for CC 15 years ago Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1) 1.36 KB, text/html		Details
alternative approach 15 years ago Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1) 4.63 KB, patch	pcwalton : feedback+	Details \| Diff \| Splinter Review
+ additional GC in DOMWindowUtils 15 years ago Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1) 5.18 KB, patch	gal : review+	Details \| Diff \| Splinter Review
v4 14 years ago Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1) 9.23 KB, patch		Details \| Diff \| Splinter Review
v6 14 years ago Olli Pettay [:smaug][bugs@pettay.fi] (vacation-ish -> Aug 1) 9.12 KB, patch	gal : review+ jst : approval2.0+	Details \| Diff \| Splinter Review
API for per-compartment GCs 14 years ago Andreas Gal :gal 1.42 KB, patch		Details \| Diff \| Splinter Review