Closed Bug 822849 Opened 7 years ago Closed 7 years ago

Timer based CC occurring on shutdown

Categories

(Core :: XPConnect, defect)

x86
macOS
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla21
Tracking Status
relnote-firefox --- 19+

People

(Reporter: BenWa, Assigned: smaug)

References

(Blocks 1 open bug)

Details

(Whiteboard: [Snappy:P1][sps])

Attachments

(1 file, 5 obsolete files)

Profile:
http://people.mozilla.com/~bgirard/cleopatra/#report=fde52ea009cdee2fbc96ca655d830cc267e137d7&search=CC%253A%253A

This profile was collected on Linux so we don't have full unwind info. I'll keep an eye for a report of this issue on win/mac where we can get more info.
This is scary. Even from a leak-debugability  point of view.
(In reply to Rafael Ávila de Espíndola (:espindola) from comment #1)
> This is scary. Even from a leak-debugability  point of view.

Why is it 'scary' other then the obvious 'it causes us to do more work'?
(In reply to Benoit Girard (:BenWa) from comment #2)
> (In reply to Rafael Ávila de Espíndola (:espindola) from comment #1)
> > This is scary. Even from a leak-debugability  point of view.
> 
> Why is it 'scary' other then the obvious 'it causes us to do more work'?

Imagine a leak that goes away with one more cc pass. We get a leak or not depending on the timer firing.
If you care about catching leaks, you are going to enable the shutdown CCs anyways, so it shouldn't be a problem. The shutdown CC happens after the write poisoning, and we never do any CC after that.
Here's an example on windows with backtraces. Even if time-wise it's negligible:
http://people.mozilla.com/~bgirard/cleopatra/#report=ae59673842182a9aec95e4c3af04fdaaedab386a&search=CC::
We also have timers for InterSliceGCTimerFired(nsITimer*, void*) which is call on shutdown. As agreed in Comment 3 this type of timer based behavior is very bad by giving us slower and non deterministic shutdown behavior.
I'll look at this
Assignee: nobody → bugs
Attached patch Part 1: No Timer CC on shutdown (obsolete) — Splinter Review
Attachment #704661 - Flags: review?(continuation)
Attached patch Part 2: No timer GC on shutdown (obsolete) — Splinter Review
Attachment #704671 - Flags: review?(continuation)
Attached patch Part 2: No timer GC on shutdown (obsolete) — Splinter Review
Attachment #704671 - Attachment is obsolete: true
Attachment #704671 - Flags: review?(continuation)
Attachment #704672 - Flags: review?(continuation)
Attachment #704661 - Attachment description: Part 1: No CC on shutdown → Part 1: No Timer CC on shutdown
Whaat, I'm just about to push my patch to try.
Comment on attachment 704672 [details] [diff] [review]
Part 2: No timer GC on shutdown

This is ugly. We should just observe the right notification.
Attachment #704672 - Flags: review?(continuation) → review-
Attachment #704661 - Flags: review?(continuation) → review-
Attached patch patch (obsolete) — Splinter Review
https://tbpl.mozilla.org/?tree=Try&rev=328972e9d89d

We can change the notification to observe easily.
Attachment #704682 - Flags: feedback?(bgirard)
Yeah, Olli's probably a better reviewer. I try to do as little as possible with CC timing. ;)
Comment on attachment 704682 [details] [diff] [review]
patch

Review of attachment 704682 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/base/nsJSEnvironment.cpp
@@ +226,2 @@
>  {
> +  if (sGCOnMemoryPressure && !nsCRT::strcmp(aTopic, "memory-pressure")) {

NIT: can you replace !strcmp() with strcmp() == 0. To me this reads as "Does not strcmp to memory-pressure.

@@ +229,5 @@
>                                     nsJSContext::NonIncrementalGC,
>                                     nsJSContext::NonCompartmentGC,
>                                     nsJSContext::ShrinkingGC);
>      nsJSContext::CycleCollectNow();
> +  } else if (!nsCRT::strcmp(aTopic, "quit-application-granted")) {

Likewise.
Attachment #704682 - Flags: feedback?(bgirard) → feedback+
Attachment #704661 - Attachment is obsolete: true
Attachment #704672 - Attachment is obsolete: true
> NIT: can you replace !strcmp() with strcmp() == 0.
!nsCRT::strcmp is used pretty much everywhere in modern DOM code when dealing with notification topics.
This would need some mozilla::PoisonWrite(); testing, or does that run on debug-OSX builds when
running tests on try?
(In reply to Olli Pettay [:smaug] from comment #16)
> > NIT: can you replace !strcmp() with strcmp() == 0.
> !nsCRT::strcmp is used pretty much everywhere in modern DOM code when
> dealing with notification topics.

Alright that's sad but ok.

(In reply to Olli Pettay [:smaug] from comment #17)
> This would need some mozilla::PoisonWrite(); testing, or does that run on
> debug-OSX builds when
> running tests on try?

I don't think so. Any bad writes will crash on debug-OSX/win.
(In reply to Benoit Girard (:BenWa) from comment #18)
> I don't think so. Any bad writes will crash on debug-OSX/win.
Even on try? Or does mozilla::PoisonWrite() require some special build?
Though, if someone relies one these GCs or CCs, it would be clearly a bug in such code.
The behavior for platforms with PoisonWrite supported (win/osx):
Non-debug: If telemetry is enabled they will be reported
Debug: Crash on a shutdown write after a certain point.
Attached patch quit-application (obsolete) — Splinter Review
Need to use different notification, since in theory new windows can be opened
after *-granted.
https://tbpl.mozilla.org/?tree=Try&rev=7cd20635e0f5
Attachment #704682 - Attachment is obsolete: true
Attachment #704709 - Flags: review?(continuation)
Comment on attachment 704709 [details] [diff] [review]
quit-application

Review of attachment 704709 [details] [diff] [review]:
-----------------------------------------------------------------

I don't know enough about the shutdown sequence to know if "quit-application" is a good place to kill the GCs and CCs, but assuming that it is, this looks good to me.

::: dom/base/nsJSEnvironment.cpp
@@ +230,5 @@
>                                     nsJSContext::NonCompartmentGC,
>                                     nsJSContext::ShrinkingGC);
>      nsJSContext::CycleCollectNow();
> +  } else if (!nsCRT::strcmp(aTopic, "quit-application")) {
> +    sShuttingDown = true;

Do we want to set sShuttingDown to true in nsJSRuntime::Shutdown, too?

@@ +231,5 @@
>                                     nsJSContext::ShrinkingGC);
>      nsJSContext::CycleCollectNow();
> +  } else if (!nsCRT::strcmp(aTopic, "quit-application")) {
> +    sShuttingDown = true;
> +    nsJSContext::KillGCTimer();

Please factor these Kill*Timer functions into a KillTimer() function and call it here and in nsJSRuntime. It can just be a static function if you don't want to touch the header for nsJSContext.

@@ +3263,5 @@
>  
>    // The GC has more work to do, so schedule another GC slice.
>    if (aProgress == js::GC_SLICE_END) {
>      nsJSContext::KillInterSliceGCTimer();
> +    if (!sShuttingDown) {

It seems a little weird to me that we will just stop in the middle of an incremental GC, but I guess it shouldn't matter.
Attachment #704709 - Flags: review?(continuation) → review+
Attachment #704709 - Attachment is obsolete: true
I'm labeling this as a Snappy:P1 because the profiler' shutdown performance reporter profile seems to indicate that this fix will give us a big win!
Whiteboard: [Snappy:P1][sps]
https://hg.mozilla.org/mozilla-central/rev/2876e73c9b6f
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
After this bug 833143 may start to show up more badly in the profiles, since there is probably
more to collect.
Target Milestone: --- → mozilla21
(In reply to Benoit Girard (:BenWa) from comment #25)
> I'm labeling this as a Snappy:P1 because the profiler' shutdown performance
> reporter profile seems to indicate that this fix will give us a big win!

Do we have any hard numbers we can include as part of the notes ("up to X% improvement in shutdown time")?
Flags: needinfo?(bjacob)
This isn't too important change comparing to bug 818739.
(In reply to Alex Keybl [:akeybl] from comment #28)
> (In reply to Benoit Girard (:BenWa) from comment #25)
> > I'm labeling this as a Snappy:P1 because the profiler' shutdown performance
> > reporter profile seems to indicate that this fix will give us a big win!
> 
> Do we have any hard numbers we can include as part of the notes ("up to X%
> improvement in shutdown time")?

Not from this particular patch but overall in Firefox21 Telemetry shows us to shutdown about 50% faster. bug 818739 is responsible for most of the win.
Flags: needinfo?(bjacob)
bug 818739 is already in Aurora, so FF20 ;)
I had this problem today while investigating an issue with Olli. Sadly I run Aurora 20.0a2. I will keep an eye on it once I upgraded to 21.0a2.

Here the profiler graph which shows that shutdown was blocked by about additional 20s.
You need to log in before you can comment on or make changes to this bug.