Closed Bug 643083 Opened 13 years ago Closed 6 years ago

Crash: [@ JSAutoRequest::~JSAutoRequest]

Categories

(Core :: DOM: Content Processes, defect, P5)

defect

Tracking

()

RESOLVED WONTFIX

People

(Reporter: stechz, Unassigned)

Details

(Keywords: crash)

Crash Data

Attachments

(2 files, 1 obsolete file)

In TabParent.cpp:607, it is possible for a JS context to be destroyed during the ReceiveMessage callback (for instance, if a window is closed). This makes ctx a dangling pointer and causes a crash in ~JSAutoRequest.

There could be other invariants being violated in this code. I don't know it well enough to say.
This hack fixes the crash for me, but I worry that we may want something more
thorough. Need a testcase too.
Attachment #520388 - Flags: feedback?(Olli.Pettay)
We probably shouldn't be holding the request across the call to ReceiveMessage...  That is, scope the request to right around the JS_NewArrayObject call.
The hardest part of this patch was the test! It's not easy to get the window
to close on the same event where you receive a message. bz's suggestion fixes
the crash, and without the patch the crash occurs with this browser chrome
test.
Attachment #520409 - Flags: review?(Olli.Pettay)
Attachment #520388 - Attachment is obsolete: true
Attachment #520388 - Flags: feedback?(Olli.Pettay)
Comment on attachment 520409 [details] [diff] [review]
Crash: JSAutoRequest::~JSAutoRequest

># HG changeset patch
># User Benjamin Stover <bstover@mozilla.com>
># Date 1300518693 25200
># Node ID 1556234b9baf8302670765866dfefb4928faedd1
># Parent  00a73b6f85e488a78587ffa6d48331cb3959746b
>Bug 643083 Crash: JSAutoRequest::~JSAutoRequest
>
>diff --git a/dom/ipc/TabParent.cpp b/dom/ipc/TabParent.cpp
>--- a/dom/ipc/TabParent.cpp
>+++ b/dom/ipc/TabParent.cpp
>@@ -584,24 +584,30 @@ bool
> TabParent::ReceiveMessage(const nsString& aMessage,
>                           PRBool aSync,
>                           const nsString& aJSON,
>                           InfallibleTArray<nsString>* aJSONRetVal)
> {
>   nsRefPtr<nsFrameLoader> frameLoader = GetFrameLoader();
>   if (frameLoader && frameLoader->GetFrameMessageManager()) {
>     nsFrameMessageManager* manager = frameLoader->GetFrameMessageManager();
>-    JSContext* ctx = manager->GetJSContext();
>-    JSAutoRequest ar(ctx);
>-    PRUint32 len = 0; //TODO: obtain a real value in bug 572685
>-    // Because we want JS messages to have always the same properties,
>-    // create array even if len == 0.
>-    JSObject* objectsArray = JS_NewArrayObject(ctx, len, NULL);
>-    if (!objectsArray) {
>-      return false;
>+
>+    // Context may be gone after calling ReceiveMessage, so scope the
>+    // context pointer to prevent dangling.
>+    JSObject* objectsArray;
>+    {
>+      JSContext* ctx = manager->GetJSContext();
>+      JSAutoRequest ar(ctx);
>+      PRUint32 len = 0; //TODO: obtain a real value in bug 572685
>+      // Because we want JS messages to have always the same properties,
>+      // create array even if len == 0.
>+      objectsArray = JS_NewArrayObject(ctx, len, NULL);
>+      if (!objectsArray) {
>+        return false;
>+      }
>     }
I assume objectsArray is kept alive because of conservative GC.
Could you ask someone from js team?
Attachment #520409 - Flags: review?(Olli.Pettay) → review+
Comment on attachment 520409 [details] [diff] [review]
Crash: JSAutoRequest::~JSAutoRequest

cdleary, do you know? Or who would be a good person to ask?
Attachment #520409 - Flags: feedback?(cdleary)
Comment on attachment 520409 [details] [diff] [review]
Crash: JSAutoRequest::~JSAutoRequest

All JSObject pointers that are present on the machine stack or in registers will be recognized as reachable objects by conservative stack scanning.

Hazards usually come into play when you extract some stuff from an object's guts and pass that around without temporarily rooting the containing object (like extracting jschar * from a JSString). Without looking into the details of ReceiveMessage, this looks like it will be okay.

+      if (!objectsArray) {
+        return false;
+      }

Should probably be NULL.
Attachment #520409 - Flags: feedback?(cdleary) → feedback+
Pushed http://hg.mozilla.org/mozilla-central/rev/868be4c08700
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Backed out due to test crash / leak failures. This probably has something to do with having a remote browser.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
The leak can be found by running the dom browser-tests by themselves:

python runtests.py --browser-chrome --test-path=dom/tests/browser

I've modified my makefile so just the one test runs. The leaks I'm seeing in the child process:
         |<----------------Class--------------->|<-----Bytes------>|<----------------Objects---------------->|<--------------References-------------->|
                                                  Per-Inst   Leaked    Total      Rem      Mean       StdDev     Total      Rem      Mean       StdDev
       0 TOTAL                                          22      776    31048       11 (  669.64 +/-   855.10)    55933        5 ( 1477.21 +/-  2482.61)
      53 PuppetWidget                                  216      432        2        2 (    1.50 +/-     0.71)       29        1 (    4.93 +/-     2.22)
      90 gfxASurface                                    24       24       12        1 (    4.52 +/-     1.97)        0        0 (    0.00 +/-     0.00)
     115 nsBaseWidget                                  124      124        1        1 (    1.00 +/-     0.00)       29        1 (    4.93 +/-     2.22)
     237 nsGTKToolkit                                   32       32        1        1 (    1.00 +/-     0.00)        1        1 (    1.00 +/-     0.00)
     326 nsRect                                         16       48      104        3 (   24.00 +/-    15.17)        0        0 (    0.00 +/-     0.00)
     342 nsScreenManagerGtk                             28       28        1        1 (    1.00 +/-     0.00)        8        1 (    2.40 +/-     0.99)
     399 nsThebesDeviceContext                          84       84        4        1 (    1.43 +/-     0.53)       37        1 (    4.12 +/-     2.21)
     416 nsVoidArray                                     4        4      483        1 (   98.05 +/-    48.00)        0        0 (    0.00 +/-     0.00) 

TabChild appears to be cleaning up after itself fine (DestroyWindow is called). nsThebesDeviceContext is getting leaked, but not by an nsCOMPtr according to find-comptr-leakers.
Severity: normal → critical
Keywords: crash
Summary: Crash: JSAutoRequest::~JSAutoRequest → Crash: [@ JSAutoRequest::~JSAutoRequest]
Attachment #522491 - Flags: review?(jones.chris.g)
Attachment #522491 - Flags: superreview?(roc)
Chris and Roc,

Quick catch-up. This followup patch fixes a leak for PuppetWidget that was occurring on mochitests for this bug. Olli suggested this change. I suppose that the test in this bug must be the first test that checks for leaks.
Attachment #522491 - Flags: superreview?(roc) → superreview+
Comment on attachment 522491 [details] [diff] [review]
Test fix: clean up puppet widget properly

Seems like maybe BaseWidget::Destroy perhaps ought to be doing this, but if roc is happy I'm happy.
Attachment #522491 - Flags: review?(jones.chris.g) → review+
Backed out again. Keeping the tabchild cleanup which was pushed here:
http://hg.mozilla.org/mozilla-central/rev/4d7a0a6dd613

There's a permaish-orange for mochitest-other on OS X and Windows. :(
In retrospect and now that the builds have fully completed, looking at the tryserver suggests that maybe I just saw a few random oranges happen? I need a second opinion.
I can look, but which changeset is yours on try?  How long ago was this?
bz: the patch got several runs with mochitests before I backed it out, so it was a series of runs last night. I believe it added a new random orange like this one:

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1301446512.1301451126.7332.gz&fulltext=1#err0

smaug and I have talked it over on IRC. This looks to be a problem with a frame script getting loaded late into the shutdown process. One problem was that the test was using |true| with loadFrameScript which is not what we ever want to do for a test run! Another problem is that we shouldn't be receiving messages from the child process so late. Filing a bug for the second one and trying |false| for this patch so we can get this landed!
Crash Signature: [@ JSAutoRequest::~JSAutoRequest]
Component: IPC → DOM: Content Processes
https://bugzilla.mozilla.org/show_bug.cgi?id=1472046

Move all DOM bugs that haven’t been updated in more than 3 years and has no one currently assigned to P5.

If you have questions, please contact :mdaly.
Priority: -- → P5
Closing because no crashes reported for 12 weeks.
Status: REOPENED → RESOLVED
Closed: 13 years ago6 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: