Closed Bug 1126245 Opened 5 years ago Closed 5 years ago

[e10s] Crash on Disqus Password Change


(Firefox :: Untriaged, defect, critical)

38 Branch
Windows 8.1
Not set



Firefox 38
Tracking Status
e10s m5+ ---
firefox38 --- fixed


(Reporter: vicenzi.alexandre, Assigned: billm)



(Keywords: crash, regression, reproducible)

Crash Data


(4 files)

User Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:38.0) Gecko/20100101 Firefox/38.0
Build ID: 20150126030202

Steps to reproduce:

Access this URL:

Click to "Login" with Disqus.
It will prompt a new window.
Click "Forgot password".

Actual results:

All tabs crashed.


Expected results:

Not crash.
Severity: normal → critical
Crash Signature: [@ WaitForSingleObjectEx | WaitForSingleObject | PR_WaitCondVar | mozilla::CondVar::Wait(unsigned int)]
tracking-e10s: --- → ?
Ever confirmed: true
Keywords: crash, reproducible
Attached file testcase
1. Open attached
2. Click a button to open popup window
3. Click a link in the popup
Noting that this has the same crash signature as bug 1116884.
Thanks Alice. This looks like a great test case.
Assignee: nobody → wmccloskey
Flags: needinfo?(wmccloskey)
Agree, this is reproducible with testcase.

But I could not reproduce in my one attempt with on vista laptop (where I am perhaps seeing bug 1116884)
Attached patch window-testSplinter Review
This tests various ways of opening windows. Mostly we just want to avoid crashing, so it doesn't check much.
Attachment #8557406 - Flags: review?(bent.mozilla)
Attached file crash.html
Here's a testcase similar to Alice's but with a couple variations. It's similar to the automated test, but it's easier to run it manually.
Attached patch window-fixSplinter Review
This turned out to be a real rat's nest. The basic problem is that window closing is very racy in e10s. Bug 567058 changed the timing a bit so that these bugs are easier to hit, but I don't think it changed anything fundamental.

The test case runs as follows:
1. User clicks on a link whose onclick handler closes the current window (but doesn't do preventDefault()).
2. The window.close() call sends a message to the parent asking it to close the tab.
3. Since the tab is the last one in the chrome window, the parent closes the chrome window.
4. Closing the chrome window is not immediate. Instead we spins the event loop twice and then actually close it:
(notice the indirect parameter).
5. Meanwhile in the child process, the onclick handler returns and we try to open the link in a new tab. Doing so takes a few messages: we need to make a PBrowser, send CreateWindow, and then create a PRenderFrame. This stuff needs to finish before the chrome window is closed in the parent or else the parent's message handler will return false and we'll intentionally kill the child.

There are a few things we can depend on. Because messages are dispatched in the order sent, the parent will call window.close() on chrome window before it dispatches the CreateWindow message. Also, although the chrome window won't be closed right away (because the two spins of the event loop), at least GetClosed() will return true starting when window.close() is called.

Therefore, CreateWindow can at least detect when we're in this bad situation (by calling GetClosed). However, it then needs to decide what to do. Non-e10s Firefox's behavior here is a little strange in my opinion. In the crash.html test I attached, there are three variations of the test. In the first two variations, non-e10s Firefox opens the link in a new tab and that tab stays open. In the last variation, the link is never opened [1]. Chrome's behavior seems more reasonable to me. It always opens the link in a new tab and leaves the tab open. If there isn't an obvious window to open it in, it picks the most recent window.

I decided to copy Chrome's behavior in this patch. If CreateWindow detects that we're trying to open a new tab in a window that has been closed, then it finds the most recent open window and opens the tab there instead. This fixes the crash.

Olli, I asked you for feedback to see if this behavior change is okay.

Note that there's still a race here. If the user closes a window at just the time that something in the child calls, then the parent might still terminate the child. For example, if the window gets closed after CreateWindow returns but before the PRenderFrame constructor is processed, then AllocPRenderFrameParent will return null since it can't find a frameloader. Then the child will be killed. I think this should probably be addressed in a follow-up bug.

[1] It's not opened because we return early here:
Attachment #8557414 - Flags: review?(bent.mozilla)
Attachment #8557414 - Flags: feedback?(bugs)
Comment on attachment 8557414 [details] [diff] [review]

I guess this should be fine.
Attachment #8557414 - Flags: feedback?(bugs) → feedback+
Attachment #8557406 - Flags: review?(bent.mozilla) → review+
Comment on attachment 8557414 [details] [diff] [review]

Review of attachment 8557414 [details] [diff] [review]:

::: dom/ipc/TabParent.cpp
@@ +505,5 @@
> +
> +  nsCOMPtr<nsIDOMWindow> latest;
> +
> +  bool hasMore = false;
> +  windowEnumerator->HasMoreElements(&hasMore);


@@ +508,5 @@
> +  bool hasMore = false;
> +  windowEnumerator->HasMoreElements(&hasMore);
> +  while (hasMore) {
> +    nsCOMPtr<nsISupports> item;
> +    windowEnumerator->GetNext(getter_AddRefs(item));

and here
Attachment #8557414 - Flags: review?(bent.mozilla) → review+
Closed: 5 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → Firefox 38
Mistakenly filed against Firefox 38 and should be instead 38 Branch. Sorry for the spam. dkl
Version: Firefox 38 → 38 Branch
You need to log in before you can comment on or make changes to this bug.