Timer based CC occurring on shutdown

RESOLVED FIXED in mozilla21

Status

()

Core
XPConnect
RESOLVED FIXED
5 years ago
4 years ago

People

(Reporter: BenWa, Assigned: smaug)

Tracking

(Blocks: 1 bug)

Trunk
mozilla21
x86
Mac OS X
Points:
---

Firefox Tracking Flags

(relnote-firefox 19+)

Details

(Whiteboard: [Snappy:P1][sps])

Attachments

(1 attachment, 5 obsolete attachments)

(Reporter)

Description

5 years ago
Profile:
http://people.mozilla.com/~bgirard/cleopatra/#report=fde52ea009cdee2fbc96ca655d830cc267e137d7&search=CC%253A%253A

This profile was collected on Linux so we don't have full unwind info. I'll keep an eye for a report of this issue on win/mac where we can get more info.
This is scary. Even from a leak-debugability  point of view.
(Reporter)

Comment 2

5 years ago
(In reply to Rafael Ávila de Espíndola (:espindola) from comment #1)
> This is scary. Even from a leak-debugability  point of view.

Why is it 'scary' other then the obvious 'it causes us to do more work'?
(In reply to Benoit Girard (:BenWa) from comment #2)
> (In reply to Rafael Ávila de Espíndola (:espindola) from comment #1)
> > This is scary. Even from a leak-debugability  point of view.
> 
> Why is it 'scary' other then the obvious 'it causes us to do more work'?

Imagine a leak that goes away with one more cc pass. We get a leak or not depending on the timer firing.
If you care about catching leaks, you are going to enable the shutdown CCs anyways, so it shouldn't be a problem. The shutdown CC happens after the write poisoning, and we never do any CC after that.
(Reporter)

Comment 5

5 years ago
Here's an example on windows with backtraces. Even if time-wise it's negligible:
http://people.mozilla.com/~bgirard/cleopatra/#report=ae59673842182a9aec95e4c3af04fdaaedab386a&search=CC::
(Reporter)

Comment 6

5 years ago
We also have timers for InterSliceGCTimerFired(nsITimer*, void*) which is call on shutdown. As agreed in Comment 3 this type of timer based behavior is very bad by giving us slower and non deterministic shutdown behavior.
(Assignee)

Comment 7

5 years ago
I'll look at this
Assignee: nobody → bugs
(Reporter)

Comment 8

5 years ago
Created attachment 704661 [details] [diff] [review]
Part 1: No Timer CC on shutdown
(Reporter)

Updated

5 years ago
Attachment #704661 - Flags: review?(continuation)
(Reporter)

Comment 9

5 years ago
Created attachment 704671 [details] [diff] [review]
Part 2: No timer GC on shutdown
Attachment #704671 - Flags: review?(continuation)
(Reporter)

Comment 10

5 years ago
Created attachment 704672 [details] [diff] [review]
Part 2: No timer GC on shutdown
Attachment #704671 - Attachment is obsolete: true
Attachment #704671 - Flags: review?(continuation)
Attachment #704672 - Flags: review?(continuation)
(Reporter)

Updated

5 years ago
Attachment #704661 - Attachment description: Part 1: No CC on shutdown → Part 1: No Timer CC on shutdown
(Assignee)

Comment 11

5 years ago
Whaat, I'm just about to push my patch to try.
(Assignee)

Comment 12

5 years ago
Comment on attachment 704672 [details] [diff] [review]
Part 2: No timer GC on shutdown

This is ugly. We should just observe the right notification.
Attachment #704672 - Flags: review?(continuation) → review-
(Assignee)

Updated

5 years ago
Attachment #704661 - Flags: review?(continuation) → review-
(Assignee)

Comment 13

5 years ago
Created attachment 704682 [details] [diff] [review]
patch

https://tbpl.mozilla.org/?tree=Try&rev=328972e9d89d

We can change the notification to observe easily.
Attachment #704682 - Flags: feedback?(bgirard)
Yeah, Olli's probably a better reviewer. I try to do as little as possible with CC timing. ;)
(Reporter)

Comment 15

5 years ago
Comment on attachment 704682 [details] [diff] [review]
patch

Review of attachment 704682 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/base/nsJSEnvironment.cpp
@@ +226,2 @@
>  {
> +  if (sGCOnMemoryPressure && !nsCRT::strcmp(aTopic, "memory-pressure")) {

NIT: can you replace !strcmp() with strcmp() == 0. To me this reads as "Does not strcmp to memory-pressure.

@@ +229,5 @@
>                                     nsJSContext::NonIncrementalGC,
>                                     nsJSContext::NonCompartmentGC,
>                                     nsJSContext::ShrinkingGC);
>      nsJSContext::CycleCollectNow();
> +  } else if (!nsCRT::strcmp(aTopic, "quit-application-granted")) {

Likewise.
Attachment #704682 - Flags: feedback?(bgirard) → feedback+
(Reporter)

Updated

5 years ago
Attachment #704661 - Attachment is obsolete: true
(Reporter)

Updated

5 years ago
Attachment #704672 - Attachment is obsolete: true
(Assignee)

Comment 16

5 years ago
> NIT: can you replace !strcmp() with strcmp() == 0.
!nsCRT::strcmp is used pretty much everywhere in modern DOM code when dealing with notification topics.
(Assignee)

Comment 17

5 years ago
This would need some mozilla::PoisonWrite(); testing, or does that run on debug-OSX builds when
running tests on try?
(Reporter)

Comment 18

5 years ago
(In reply to Olli Pettay [:smaug] from comment #16)
> > NIT: can you replace !strcmp() with strcmp() == 0.
> !nsCRT::strcmp is used pretty much everywhere in modern DOM code when
> dealing with notification topics.

Alright that's sad but ok.

(In reply to Olli Pettay [:smaug] from comment #17)
> This would need some mozilla::PoisonWrite(); testing, or does that run on
> debug-OSX builds when
> running tests on try?

I don't think so. Any bad writes will crash on debug-OSX/win.
(Assignee)

Comment 19

5 years ago
(In reply to Benoit Girard (:BenWa) from comment #18)
> I don't think so. Any bad writes will crash on debug-OSX/win.
Even on try? Or does mozilla::PoisonWrite() require some special build?
(Assignee)

Comment 20

5 years ago
Though, if someone relies one these GCs or CCs, it would be clearly a bug in such code.
(Reporter)

Comment 21

5 years ago
The behavior for platforms with PoisonWrite supported (win/osx):
Non-debug: If telemetry is enabled they will be reported
Debug: Crash on a shutdown write after a certain point.
(Assignee)

Comment 22

5 years ago
Created attachment 704709 [details] [diff] [review]
quit-application

Need to use different notification, since in theory new windows can be opened
after *-granted.
https://tbpl.mozilla.org/?tree=Try&rev=7cd20635e0f5
Attachment #704682 - Attachment is obsolete: true
Attachment #704709 - Flags: review?(continuation)
Comment on attachment 704709 [details] [diff] [review]
quit-application

Review of attachment 704709 [details] [diff] [review]:
-----------------------------------------------------------------

I don't know enough about the shutdown sequence to know if "quit-application" is a good place to kill the GCs and CCs, but assuming that it is, this looks good to me.

::: dom/base/nsJSEnvironment.cpp
@@ +230,5 @@
>                                     nsJSContext::NonCompartmentGC,
>                                     nsJSContext::ShrinkingGC);
>      nsJSContext::CycleCollectNow();
> +  } else if (!nsCRT::strcmp(aTopic, "quit-application")) {
> +    sShuttingDown = true;

Do we want to set sShuttingDown to true in nsJSRuntime::Shutdown, too?

@@ +231,5 @@
>                                     nsJSContext::ShrinkingGC);
>      nsJSContext::CycleCollectNow();
> +  } else if (!nsCRT::strcmp(aTopic, "quit-application")) {
> +    sShuttingDown = true;
> +    nsJSContext::KillGCTimer();

Please factor these Kill*Timer functions into a KillTimer() function and call it here and in nsJSRuntime. It can just be a static function if you don't want to touch the header for nsJSContext.

@@ +3263,5 @@
>  
>    // The GC has more work to do, so schedule another GC slice.
>    if (aProgress == js::GC_SLICE_END) {
>      nsJSContext::KillInterSliceGCTimer();
> +    if (!sShuttingDown) {

It seems a little weird to me that we will just stop in the middle of an incremental GC, but I guess it shouldn't matter.
Attachment #704709 - Flags: review?(continuation) → review+
(Assignee)

Comment 24

5 years ago
Created attachment 705014 [details] [diff] [review]
comments addressed
Attachment #704709 - Attachment is obsolete: true
(Reporter)

Comment 25

5 years ago
I'm labeling this as a Snappy:P1 because the profiler' shutdown performance reporter profile seems to indicate that this fix will give us a big win!
Whiteboard: [Snappy:P1][sps]
Attachment #705014 - Flags: review+
(Assignee)

Comment 26

5 years ago
https://hg.mozilla.org/mozilla-central/rev/2876e73c9b6f
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
(Assignee)

Comment 27

5 years ago
After this bug 833143 may start to show up more badly in the profiles, since there is probably
more to collect.
Target Milestone: --- → mozilla21
relnote-firefox: --- → ?
(In reply to Benoit Girard (:BenWa) from comment #25)
> I'm labeling this as a Snappy:P1 because the profiler' shutdown performance
> reporter profile seems to indicate that this fix will give us a big win!

Do we have any hard numbers we can include as part of the notes ("up to X% improvement in shutdown time")?
relnote-firefox: ? → 19+
Flags: needinfo?(bjacob)
(Assignee)

Comment 29

4 years ago
This isn't too important change comparing to bug 818739.
(Reporter)

Comment 30

4 years ago
(In reply to Alex Keybl [:akeybl] from comment #28)
> (In reply to Benoit Girard (:BenWa) from comment #25)
> > I'm labeling this as a Snappy:P1 because the profiler' shutdown performance
> > reporter profile seems to indicate that this fix will give us a big win!
> 
> Do we have any hard numbers we can include as part of the notes ("up to X%
> improvement in shutdown time")?

Not from this particular patch but overall in Firefox21 Telemetry shows us to shutdown about 50% faster. bug 818739 is responsible for most of the win.
Flags: needinfo?(bjacob)
(Assignee)

Comment 31

4 years ago
bug 818739 is already in Aurora, so FF20 ;)
I had this problem today while investigating an issue with Olli. Sadly I run Aurora 20.0a2. I will keep an eye on it once I upgraded to 21.0a2.

Here the profiler graph which shows that shutdown was blocked by about additional 20s.
You need to log in before you can comment on or make changes to this bug.