Last Comment Bug 785248 - Ghost window on dirtymarkup.com
: Ghost window on dirtymarkup.com
Status: RESOLVED FIXED
[MemShrink:P2]
:
Product: Core
Classification: Components
Component: DOM (show other bugs)
: Trunk
: x86_64 Linux
: -- normal (vote)
: mozilla18
Assigned To: Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary)
:
Mentors:
http://www.dirtymarkup.com/
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2012-08-23 15:21 PDT by Timvde
Modified: 2012-10-22 08:04 PDT (History)
12 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
about:memory report for dirtymarkup.com (13.24 KB, text/plain)
2012-08-23 15:21 PDT, Timvde
no flags Details
patch (1.91 KB, patch)
2012-09-11 04:58 PDT, Olli Pettay [:smaug] (vacation Aug 25-28)
no flags Details | Diff | Splinter Review
patch (1.95 KB, patch)
2012-09-11 05:42 PDT, Olli Pettay [:smaug] (vacation Aug 25-28)
no flags Details | Diff | Splinter Review
minimal testcase (68 bytes, text/html)
2012-09-13 03:39 PDT, Olli Pettay [:smaug] (vacation Aug 25-28)
no flags Details
patch (1.76 KB, patch)
2012-09-13 05:43 PDT, Olli Pettay [:smaug] (vacation Aug 25-28)
bent.mozilla: review-
amarchesini: feedback+
Details | Diff | Splinter Review
Patch (2.02 KB, patch)
2012-10-02 15:52 PDT, Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary)
bent.mozilla: review+
akeybl: approval‑mozilla‑aurora-
Details | Diff | Splinter Review

Description Timvde 2012-08-23 15:21:25 PDT
Created attachment 654814 [details]
about:memory report for dirtymarkup.com

about:memory reports two ghost windows on http://www.dirtymarkup.com/. This happens on an (almost¹) clean profile. Just opening and closing the page doesn't seem to create the ghost window. Clicking a few buttons/dropdowns does (even without running any of the scripts).

Relevant about:memory?verbose parts are attached. These are taken *after* closing the tab and clicking the "Minimize memory usage".

about:compartments also still lists http://dirtymarkup.com/ as a user compartment.

¹ The profile is "almost" clean in the way that I always use it to test things. It has no add-ons, but does have a few plug-ins (flash and Java) and some history.
Comment 1 Matthias Versen [:Matti] 2012-08-23 15:47:54 PDT
Your report is for which Firefox version ?
Comment 2 Timvde 2012-08-23 16:21:58 PDT
Oh, right, when choosing "Advanced bug form", that isn't attached :)
I did select the appropriate version/platform; it's on 2012-08-23 Nightly (version 17), Linux x64.
Comment 3 Olli Pettay [:smaug] (vacation Aug 25-28) 2012-08-23 16:22:16 PDT
about:cc shows a document in the CC graph after closing the tab.
Comment 4 Olli Pettay [:smaug] (vacation Aug 25-28) 2012-08-23 17:27:35 PDT
So I see a runtime leak which gets released during shutdown.
The most suspicious release is the following, but perhaps the problem is elsewhere.
(I know nothing about how workers get destroyed.)
(I *think* #4 is reporting wrong class name)

0  nsDocument::Release (this=0xaf2e8400) at /home/smaug/mozilla/hg/mozilla/content/base/src/nsDocument.cpp:1711
#1  0xb4e555a3 in nsHTMLDocument::Release (this=0xaf2e8400) at /home/smaug/mozilla/hg/mozilla/content/html/document/src/nsHTMLDocument.cpp:239
#2  0xb459040d in nsCOMPtr_base::~nsCOMPtr_base (this=0x9fc01f00, __in_chrg=<optimized out>) at ../../dist/include/nsCOMPtr.h:408
#3  0xb45904c8 in nsCOMPtr<nsISupports>::~nsCOMPtr (this=0x9fc01f00, __in_chrg=<optimized out>) at ../../dist/include/nsCOMPtr.h:833
#4  0xb48b9813 in nsPresState::~nsPresState (this=0x9fc01f00, __in_chrg=<optimized out>) at /home/smaug/mozilla/hg/mozilla/layout/base/nsPresState.h:19
#5  0xb4fa76b9 in nsTArray<nsCOMPtr<nsISupports>, nsTArrayDefaultAllocator>::DestructRange (this=0x9fce9250, start=0, count=7) at ../../dist/include/nsTArray.h:1213
#6  0xb4fa6dce in nsTArray<nsCOMPtr<nsISupports>, nsTArrayDefaultAllocator>::RemoveElementsAt (this=0x9fce9250, start=0, count=7) at ../../dist/include/nsTArray.h:933
#7  0xb4fa5c1e in nsTArray<nsCOMPtr<nsISupports>, nsTArrayDefaultAllocator>::Clear (this=0x9fce9250) at ../../dist/include/nsTArray.h:944
#8  0xb50ae518 in (anonymous namespace)::MainThreadReleaseRunnable::Run (this=0x9fce9240) at /home/smaug/mozilla/hg/mozilla/dom/workers/WorkerPrivate.cpp:543
#9  0xb5cbff89 in nsThread::ProcessNextEvent (this=0xb7d35510, mayWait=true, result=0xbfffc8cf) at /home/smaug/mozilla/hg/mozilla/xpcom/threads/nsThread.cpp:624
#10 0xb5c5981d in NS_ProcessNextEvent_P (thread=0xb7d35510, mayWait=true) at /home/smaug/mozilla/hg/mozilla/ff_build/xpcom/build/nsThreadUtils.cpp:220
#11 0xb5cbf9a3 in nsThread::Shutdown (this=0xb2d7eb30) at /home/smaug/mozilla/hg/mozilla/xpcom/threads/nsThread.cpp:466
#12 0xb50a4dd7 in mozilla::dom::workers::RuntimeService::NoteIdleThread (this=0xaed61b60, aThread=0xb2d7eb30)
    at /home/smaug/mozilla/hg/mozilla/dom/workers/RuntimeService.cpp:1192
#13 0xb50ae948 in (anonymous namespace)::TopLevelWorkerFinishedRunnable::Run (this=0x9fce9f40) at /home/smaug/mozilla/hg/mozilla/dom/workers/WorkerPrivate.cpp:644
#14 0xb5cbff89 in nsThread::ProcessNextEvent (this=0xb7d35510, mayWait=true, result=0xbfffcadf) at /home/smaug/mozilla/hg/mozilla/xpcom/threads/nsThread.cpp:624
#15 0xb5c5981d in NS_ProcessNextEvent_P (thread=0xb7d35510, mayWait=true) at /home/smaug/mozilla/hg/mozilla/ff_build/xpcom/build/nsThreadUtils.cpp:220
#16 0xb50a449c in mozilla::dom::workers::RuntimeService::Cleanup (this=0xaed61b60) at /home/smaug/mozilla/hg/mozilla/dom/workers/RuntimeService.cpp:1023
#17 0xb50a5767 in mozilla::dom::workers::RuntimeService::Observe (this=0xaed61b60, aSubject=0x0, aTopic=0xb6ea224b "xpcom-shutdown-threads", aData=0x0)
    at /home/smaug/mozilla/hg/mozilla/dom/workers/RuntimeService.cpp:1244
Comment 5 Olli Pettay [:smaug] (vacation Aug 25-28) 2012-08-24 12:47:09 PDT
nsGlobalWindow::FreeInnerObjects() is called on the nsGlobalWindow which holds pointer to
the leaked mDoc. Yet worker seems to release stuff only during shutdown.
Comment 6 Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2012-08-27 14:42:53 PDT
I can't reproduce this.
Comment 7 Olli Pettay [:smaug] (vacation Aug 25-28) 2012-09-03 11:15:45 PDT
Somehow http://mxr.mozilla.org/mozilla-central/source/dom/workers/WorkerPrivate.cpp?mark=1982-1984#1972 is causing this, I believe.
(Commenting it out fixes the leak.)
Still trying to understand all this code.
Comment 8 Olli Pettay [:smaug] (vacation Aug 25-28) 2012-09-10 05:59:14 PDT
We end up dispatching canceling status in NotifyPrivate, but somehow don't exit from the worker loop, 
which then means we don't call ScheduleDeletion.
Comment 9 Olli Pettay [:smaug] (vacation Aug 25-28) 2012-09-10 10:18:49 PDT
I don't quite understand how the return value of ::Run should be handled.
We clearly return error code in some cases, but don't always check it, but do occasionally.
Comment 10 Olli Pettay [:smaug] (vacation Aug 25-28) 2012-09-11 04:58:34 PDT
Created attachment 660037 [details] [diff] [review]
patch

Fixes the leak but there are other cases when the return value of Run() isn't
checked,.
Comment 11 Olli Pettay [:smaug] (vacation Aug 25-28) 2012-09-11 05:42:22 PDT
Created attachment 660046 [details] [diff] [review]
patch

Let closing to succeed normally.

Ben, as you know, I'm not familiar with this code, so if you have better ideas
how to handle the situation, feel free to write a patch ;)

WorkerPrivate::NotifyInternal and WorkerPrivate::DoRunLoop just don't play
well with each others.
Comment 12 Olli Pettay [:smaug] (vacation Aug 25-28) 2012-09-11 05:48:51 PDT
https://tbpl.mozilla.org/?tree=Try&rev=fe163a9022cf
Comment 13 Olli Pettay [:smaug] (vacation Aug 25-28) 2012-09-11 06:09:46 PDT
And to be clear, the patch is especially for 
http://mxr.mozilla.org/mozilla-central/source/dom/workers/WorkerPrivate.cpp?rev=70a0eda09c3c&mark=3490-3491#3475
Comment 14 Olli Pettay [:smaug] (vacation Aug 25-28) 2012-09-13 03:39:50 PDT
Created attachment 660780 [details]
minimal testcase
Comment 15 Olli Pettay [:smaug] (vacation Aug 25-28) 2012-09-13 05:43:18 PDT
Created attachment 660814 [details] [diff] [review]
patch

This one then. Trying to keep the existing behavior, but still
ensure that we exit the event loop when mCloseHandlerFinished becomes true.
Comment 16 Andrea Marchesini [:baku] 2012-09-23 01:36:32 PDT
Comment on attachment 660814 [details] [diff] [review]
patch

Review of attachment 660814 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/workers/WorkerPrivate.cpp
@@ +718,2 @@
>      JSObject* target = JS_GetGlobalObject(aCx);
> +    if (!target) {

I would like to read a comment here about why if !target, return true.
Comment 17 Ben Turner (not reading bugmail, use the needinfo flag!) 2012-10-02 14:58:29 PDT
Comment on attachment 660814 [details] [diff] [review]
patch

khuey caught this in a recording and we looked it over today. This patch is close but we have a better one worked out.
Comment 18 Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2012-10-02 15:52:16 PDT
Created attachment 667221 [details] [diff] [review]
Patch
Comment 19 Ben Turner (not reading bugmail, use the needinfo flag!) 2012-10-02 16:02:25 PDT
Comment on attachment 667221 [details] [diff] [review]
Patch

Review of attachment 667221 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/workers/WorkerPrivate.cpp
@@ +3494,5 @@
>      return true;
>    }
>  
> +  // If the worker script never ran, we don't need to do anything else, except
> +  // pretend that we ran the close handler.

Nit: This comment isn't totally right - needs to say something like "If the worker script never ran or failed to compile"

@@ +3507,1 @@
>    if (previousStatus == Running) {

Nit: We can combine these if blocks now right? E.g.:

  if (previousStatus == Running && aStatus != Killing) {
Comment 20 Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2012-10-02 19:07:50 PDT
(In reply to ben turner [:bent] from comment #19)
> Comment on attachment 667221 [details] [diff] [review]
> Patch
> 
> Review of attachment 667221 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: dom/workers/WorkerPrivate.cpp
> @@ +3494,5 @@
> >      return true;
> >    }
> >  
> > +  // If the worker script never ran, we don't need to do anything else, except
> > +  // pretend that we ran the close handler.
> 
> Nit: This comment isn't totally right - needs to say something like "If the
> worker script never ran or failed to compile"

If it fails to compile it never runs ...

> @@ +3507,1 @@
> >    if (previousStatus == Running) {
> 
> Nit: We can combine these if blocks now right? E.g.:
> 
>   if (previousStatus == Running && aStatus != Killing) {

Done.

https://hg.mozilla.org/integration/mozilla-inbound/rev/6e04928c99aa
Comment 21 Ed Morley [:emorley] 2012-10-03 06:02:20 PDT
https://hg.mozilla.org/mozilla-central/rev/6e04928c99aa
Comment 22 Ben Turner (not reading bugmail, use the needinfo flag!) 2012-10-03 10:17:35 PDT
Comment on attachment 667221 [details] [diff] [review]
Patch

[Approval Request Comment]
Bug caused by (feature/regressing bug #): Worker rewrite, long ago.
User impact if declined: Intermittent CC slowdowns that last until app shutdown.
Testing completed (on m-c, etc.): m-c
Risk to taking this patch (and alternatives if risky): It's possible we've introduced a new race here but it is quite unlikely.
String or UUID changes made by this patch: None
Comment 23 Alex Keybl [:akeybl] 2012-10-03 15:10:59 PDT
(In reply to ben turner [:bent] from comment #22)
> Comment on attachment 667221 [details] [diff] [review]
> Patch
> 
> [Approval Request Comment]
> Bug caused by (feature/regressing bug #): Worker rewrite, long ago.
> User impact if declined: Intermittent CC slowdowns that last until app
> shutdown.

This isn't a new regression, and we don't have a lot of evidence pointing to the issue being prevalent. Do you have any data to the contrary?
Comment 24 Ben Turner (not reading bugmail, use the needinfo flag!) 2012-10-03 15:20:08 PDT
(In reply to Alex Keybl [:akeybl] from comment #23)
> This isn't a new regression, and we don't have a lot of evidence pointing to
> the issue being prevalent. Do you have any data to the contrary?

I don't think we have ever had any reliable way of collecting that data... Maybe telemetry on slow CC times? Olli may have a better answer.
Comment 25 Olli Pettay [:smaug] (vacation Aug 25-28) 2012-10-03 16:27:34 PDT
Even if telemetry may show some slower than expected CC times, it is not possible to
map those CC times to this bug.
But using workers is getting more common all the time...
Comment 26 Alex Keybl [:akeybl] 2012-10-05 15:27:17 PDT
Comment on attachment 667221 [details] [diff] [review]
Patch

Let's let this ride the trains given we think this carries non-zero risk and is a longstanding regression. There's only about 6 weeks till FF17's release, so it's time to start limiting risk.
Comment 27 Tim Guan-tin Chien [:timdream] (please needinfo) 2012-10-22 02:03:25 PDT
I found that the minimal testcase comes with the ability to break web workers. When the web workers broke, the only way to use it again is to restart the browser :-/

http://people.mozilla.com/~tchien/worker-loop/

The number in this test case stops at 18.

Note You need to log in before you can comment on or make changes to this bug.