Last Comment Bug 822849 - Timer based CC occurring on shutdown
: Timer based CC occurring on shutdown
Status: RESOLVED FIXED
[Snappy:P1][sps]
:
Product: Core
Classification: Components
Component: XPConnect (show other bugs)
: Trunk
: x86 Mac OS X
: -- normal (vote)
: mozilla21
Assigned To: Olli Pettay [:smaug]
:
: Andrew Overholt [:overholt]
Mentors:
Depends on:
Blocks: shutdown-faster
  Show dependency treegraph
 
Reported: 2012-12-18 14:47 PST by Benoit Girard (:BenWa)
Modified: 2013-02-20 04:48 PST (History)
12 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
19+


Attachments
Part 1: No Timer CC on shutdown (1.00 KB, patch)
2013-01-21 12:51 PST, Benoit Girard (:BenWa)
bugs: review-
Details | Diff | Splinter Review
Part 2: No timer GC on shutdown (1.00 KB, patch)
2013-01-21 13:36 PST, Benoit Girard (:BenWa)
no flags Details | Diff | Splinter Review
Part 2: No timer GC on shutdown (2.35 KB, patch)
2013-01-21 13:37 PST, Benoit Girard (:BenWa)
bugs: review-
Details | Diff | Splinter Review
patch (7.09 KB, patch)
2013-01-21 14:06 PST, Olli Pettay [:smaug]
b56girard: feedback+
Details | Diff | Splinter Review
quit-application (7.08 KB, patch)
2013-01-21 16:02 PST, Olli Pettay [:smaug]
continuation: review+
Details | Diff | Splinter Review
comments addressed (7.93 KB, patch)
2013-01-22 11:16 PST, Olli Pettay [:smaug]
continuation: review+
Details | Diff | Splinter Review

Description Benoit Girard (:BenWa) 2012-12-18 14:47:28 PST
Profile:
http://people.mozilla.com/~bgirard/cleopatra/#report=fde52ea009cdee2fbc96ca655d830cc267e137d7&search=CC%253A%253A

This profile was collected on Linux so we don't have full unwind info. I'll keep an eye for a report of this issue on win/mac where we can get more info.
Comment 1 Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2012-12-18 14:50:41 PST
This is scary. Even from a leak-debugability  point of view.
Comment 2 Benoit Girard (:BenWa) 2012-12-18 14:57:41 PST
(In reply to Rafael Ávila de Espíndola (:espindola) from comment #1)
> This is scary. Even from a leak-debugability  point of view.

Why is it 'scary' other then the obvious 'it causes us to do more work'?
Comment 3 Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2012-12-18 14:58:41 PST
(In reply to Benoit Girard (:BenWa) from comment #2)
> (In reply to Rafael Ávila de Espíndola (:espindola) from comment #1)
> > This is scary. Even from a leak-debugability  point of view.
> 
> Why is it 'scary' other then the obvious 'it causes us to do more work'?

Imagine a leak that goes away with one more cc pass. We get a leak or not depending on the timer firing.
Comment 4 Andrew McCreight [:mccr8] 2012-12-18 15:10:21 PST
If you care about catching leaks, you are going to enable the shutdown CCs anyways, so it shouldn't be a problem. The shutdown CC happens after the write poisoning, and we never do any CC after that.
Comment 5 Benoit Girard (:BenWa) 2012-12-18 16:02:35 PST
Here's an example on windows with backtraces. Even if time-wise it's negligible:
http://people.mozilla.com/~bgirard/cleopatra/#report=ae59673842182a9aec95e4c3af04fdaaedab386a&search=CC::
Comment 6 Benoit Girard (:BenWa) 2013-01-18 15:31:59 PST
We also have timers for InterSliceGCTimerFired(nsITimer*, void*) which is call on shutdown. As agreed in Comment 3 this type of timer based behavior is very bad by giving us slower and non deterministic shutdown behavior.
Comment 7 Olli Pettay [:smaug] 2013-01-18 15:34:27 PST
I'll look at this
Comment 8 Benoit Girard (:BenWa) 2013-01-21 12:51:08 PST
Created attachment 704661 [details] [diff] [review]
Part 1: No Timer CC on shutdown
Comment 9 Benoit Girard (:BenWa) 2013-01-21 13:36:33 PST
Created attachment 704671 [details] [diff] [review]
Part 2: No timer GC on shutdown
Comment 10 Benoit Girard (:BenWa) 2013-01-21 13:37:41 PST
Created attachment 704672 [details] [diff] [review]
Part 2: No timer GC on shutdown
Comment 11 Olli Pettay [:smaug] 2013-01-21 13:38:40 PST
Whaat, I'm just about to push my patch to try.
Comment 12 Olli Pettay [:smaug] 2013-01-21 13:41:11 PST
Comment on attachment 704672 [details] [diff] [review]
Part 2: No timer GC on shutdown

This is ugly. We should just observe the right notification.
Comment 13 Olli Pettay [:smaug] 2013-01-21 14:06:08 PST
Created attachment 704682 [details] [diff] [review]
patch

https://tbpl.mozilla.org/?tree=Try&rev=328972e9d89d

We can change the notification to observe easily.
Comment 14 Andrew McCreight [:mccr8] 2013-01-21 14:21:16 PST
Yeah, Olli's probably a better reviewer. I try to do as little as possible with CC timing. ;)
Comment 15 Benoit Girard (:BenWa) 2013-01-21 14:49:18 PST
Comment on attachment 704682 [details] [diff] [review]
patch

Review of attachment 704682 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/base/nsJSEnvironment.cpp
@@ +226,2 @@
>  {
> +  if (sGCOnMemoryPressure && !nsCRT::strcmp(aTopic, "memory-pressure")) {

NIT: can you replace !strcmp() with strcmp() == 0. To me this reads as "Does not strcmp to memory-pressure.

@@ +229,5 @@
>                                     nsJSContext::NonIncrementalGC,
>                                     nsJSContext::NonCompartmentGC,
>                                     nsJSContext::ShrinkingGC);
>      nsJSContext::CycleCollectNow();
> +  } else if (!nsCRT::strcmp(aTopic, "quit-application-granted")) {

Likewise.
Comment 16 Olli Pettay [:smaug] 2013-01-21 14:52:32 PST
> NIT: can you replace !strcmp() with strcmp() == 0.
!nsCRT::strcmp is used pretty much everywhere in modern DOM code when dealing with notification topics.
Comment 17 Olli Pettay [:smaug] 2013-01-21 14:55:43 PST
This would need some mozilla::PoisonWrite(); testing, or does that run on debug-OSX builds when
running tests on try?
Comment 18 Benoit Girard (:BenWa) 2013-01-21 15:30:12 PST
(In reply to Olli Pettay [:smaug] from comment #16)
> > NIT: can you replace !strcmp() with strcmp() == 0.
> !nsCRT::strcmp is used pretty much everywhere in modern DOM code when
> dealing with notification topics.

Alright that's sad but ok.

(In reply to Olli Pettay [:smaug] from comment #17)
> This would need some mozilla::PoisonWrite(); testing, or does that run on
> debug-OSX builds when
> running tests on try?

I don't think so. Any bad writes will crash on debug-OSX/win.
Comment 19 Olli Pettay [:smaug] 2013-01-21 15:33:35 PST
(In reply to Benoit Girard (:BenWa) from comment #18)
> I don't think so. Any bad writes will crash on debug-OSX/win.
Even on try? Or does mozilla::PoisonWrite() require some special build?
Comment 20 Olli Pettay [:smaug] 2013-01-21 15:38:20 PST
Though, if someone relies one these GCs or CCs, it would be clearly a bug in such code.
Comment 21 Benoit Girard (:BenWa) 2013-01-21 15:39:29 PST
The behavior for platforms with PoisonWrite supported (win/osx):
Non-debug: If telemetry is enabled they will be reported
Debug: Crash on a shutdown write after a certain point.
Comment 22 Olli Pettay [:smaug] 2013-01-21 16:02:30 PST
Created attachment 704709 [details] [diff] [review]
quit-application

Need to use different notification, since in theory new windows can be opened
after *-granted.
https://tbpl.mozilla.org/?tree=Try&rev=7cd20635e0f5
Comment 23 Andrew McCreight [:mccr8] 2013-01-22 10:39:46 PST
Comment on attachment 704709 [details] [diff] [review]
quit-application

Review of attachment 704709 [details] [diff] [review]:
-----------------------------------------------------------------

I don't know enough about the shutdown sequence to know if "quit-application" is a good place to kill the GCs and CCs, but assuming that it is, this looks good to me.

::: dom/base/nsJSEnvironment.cpp
@@ +230,5 @@
>                                     nsJSContext::NonCompartmentGC,
>                                     nsJSContext::ShrinkingGC);
>      nsJSContext::CycleCollectNow();
> +  } else if (!nsCRT::strcmp(aTopic, "quit-application")) {
> +    sShuttingDown = true;

Do we want to set sShuttingDown to true in nsJSRuntime::Shutdown, too?

@@ +231,5 @@
>                                     nsJSContext::ShrinkingGC);
>      nsJSContext::CycleCollectNow();
> +  } else if (!nsCRT::strcmp(aTopic, "quit-application")) {
> +    sShuttingDown = true;
> +    nsJSContext::KillGCTimer();

Please factor these Kill*Timer functions into a KillTimer() function and call it here and in nsJSRuntime. It can just be a static function if you don't want to touch the header for nsJSContext.

@@ +3263,5 @@
>  
>    // The GC has more work to do, so schedule another GC slice.
>    if (aProgress == js::GC_SLICE_END) {
>      nsJSContext::KillInterSliceGCTimer();
> +    if (!sShuttingDown) {

It seems a little weird to me that we will just stop in the middle of an incremental GC, but I guess it shouldn't matter.
Comment 24 Olli Pettay [:smaug] 2013-01-22 11:16:56 PST
Created attachment 705014 [details] [diff] [review]
comments addressed
Comment 25 Benoit Girard (:BenWa) 2013-01-22 12:10:21 PST
I'm labeling this as a Snappy:P1 because the profiler' shutdown performance reporter profile seems to indicate that this fix will give us a big win!
Comment 26 Olli Pettay [:smaug] 2013-01-22 12:35:57 PST
https://hg.mozilla.org/mozilla-central/rev/2876e73c9b6f
Comment 27 Olli Pettay [:smaug] 2013-01-22 12:44:57 PST
After this bug 833143 may start to show up more badly in the profiles, since there is probably
more to collect.
Comment 28 Alex Keybl [:akeybl] 2013-02-15 13:34:50 PST
(In reply to Benoit Girard (:BenWa) from comment #25)
> I'm labeling this as a Snappy:P1 because the profiler' shutdown performance
> reporter profile seems to indicate that this fix will give us a big win!

Do we have any hard numbers we can include as part of the notes ("up to X% improvement in shutdown time")?
Comment 29 Olli Pettay [:smaug] 2013-02-15 13:39:44 PST
This isn't too important change comparing to bug 818739.
Comment 30 Benoit Girard (:BenWa) 2013-02-15 13:47:46 PST
(In reply to Alex Keybl [:akeybl] from comment #28)
> (In reply to Benoit Girard (:BenWa) from comment #25)
> > I'm labeling this as a Snappy:P1 because the profiler' shutdown performance
> > reporter profile seems to indicate that this fix will give us a big win!
> 
> Do we have any hard numbers we can include as part of the notes ("up to X%
> improvement in shutdown time")?

Not from this particular patch but overall in Firefox21 Telemetry shows us to shutdown about 50% faster. bug 818739 is responsible for most of the win.
Comment 31 Olli Pettay [:smaug] 2013-02-15 13:56:20 PST
bug 818739 is already in Aurora, so FF20 ;)
Comment 32 Henrik Skupin (:whimboo) 2013-02-20 04:48:03 PST
I had this problem today while investigating an issue with Olli. Sadly I run Aurora 20.0a2. I will keep an eye on it once I upgraded to 21.0a2.

Here the profiler graph which shows that shutdown was blocked by about additional 20s.

Note You need to log in before you can comment on or make changes to this bug.