Closed Bug 1280105 Opened 3 years ago Closed 3 years ago

Crash in mozilla::dom::TabChild::RecvSwappedWithOtherRemoteLoader: MOZ_CRASH("Update to TabContext after swap was denied.");

Categories

(Core :: IPC, defect, P1, critical)

Unspecified
Windows 10
defect

Tracking

()

VERIFIED FIXED
mozilla50
Tracking Status
firefox48 --- unaffected
firefox49 --- disabled
firefox50 --- verified
firefox51 --- fixed

People

(Reporter: njn, Assigned: baku)

References

(Blocks 1 open bug)

Details

(Keywords: crash, Whiteboard: [userContextId][OA])

Crash Data

Attachments

(2 files, 2 obsolete files)

This bug was filed from the Socorro interface and is 
report bp-eab116fe-b52c-4106-b8dc-e81512160604.
=============================================================

This crash first appeared in Nightly 20160603030242. Since then it has occurred 37 times.

jryans, any ideas?
Flags: needinfo?(jryans)
My first suspect would be some variant of the containers workflow.  :jhao / :kjozwiak, have you seen this crash with 20160603030242 and later?  Bug 1275485 and bug 1271792 are related to this, however they've both been resolved.  Perhaps there is another issue still unsolved?
Flags: needinfo?(kjozwiak)
Flags: needinfo?(jryans)
Flags: needinfo?(jhao)
(In reply to J. Ryan Stinnett [:jryans] (use ni?) (at work week until June 21) from comment #1)
> My first suspect would be some variant of the containers workflow.  :jhao /
> :kjozwiak, have you seen this crash with 20160603030242 and later?  Bug
> 1275485 and bug 1271792 are related to this, however they've both been
> resolved.  Perhaps there is another issue still unsolved?

Perhaps there is an unsolved issue, but I really can't tell without further information.
Flags: needinfo?(jhao)
See Also: → 1282589
Whiteboard: [userContextId][OA]
Flags: needinfo?(ptheriault)
Priority: -- → P1
(In reply to David Baron :dbaron: ⌚️UTC+8 (review requests must explain patch) from comment #4)
> Could the leak fix in
> https://hg.mozilla.org/mozilla-central/rev/84a2a85ac27e have triggered this?
Flags: needinfo?(bugs)
Hmm, that fix was actually for non-e10s only. On child process 
window->GetChromeEventHandler() returns the same object as window->GetParentTarget();

My guess is that aContext.mOriginAttributes != mOriginAttributes is true so 
TabContext::UpdateTabContextAfterSwap return false.
Flags: needinfo?(bugs) → needinfo?(amarchesini)
I have a simple STR that works 100% of the times. Smaug is right, the problem is in OriginAttributes.
And in particular the problem is: mPrivateBrowsingId.
I'm workin on it.
Assignee: nobody → amarchesini
Flags: needinfo?(amarchesini)
Attached patch crash.patch (obsolete) — Splinter Review
I think this is the correct fix because TabContext is received by the CTOR of TabChild and it should not changed. What that line does it to update the privateBrowsing flag into originAttributes checking for a chrome window flag.
That is fine for the parent window, but the window in the child process doesn't have that flag. But more important, we should change the private Browsing flag at all. We should use what we receive from the parent process.

The TabContext received by the parent process has privateBrowsing attribute set to 1. After that call, it's set to 0 and the comparison with the other TabContext fails.
Attachment #8772580 - Flags: review?(bugs)
I forgot to share the STR:

1. private browsing
2. in about:preferences we must set the flag: "Open new windows in a new tab instead".
3. a simple page that does: window.open(something, 'foobar', '', true);
4. when the new tab is opened, drag it out. it should create a new window.
5. we crash comparing TabContext of the new window (with privateBrowsing set to 1) and TabChild::mOriginAttributes (that has privateBrowsing set to 0).
Flags: needinfo?(ptheriault)
Flags: needinfo?(kjozwiak)
Duplicate of this bug: 1282589
Thanks baku, Kamil and codacoder for figuring this out!  Per baku the regression is caused by https://bugzilla.mozilla.org/show_bug.cgi?id=1269361 which landed on 6-3-16.  So this will need to get uplifted to aurora.
Status: NEW → ASSIGNED
Tanvi, it looks like the issue does start to appear in build 20160603030242. However, rather than crashing the tabs when detaching them into their own separate windows, the pages will appear completely blank. Let me know if it would be useful to find a regression range and pin point the exact build that started to crash the tabs.

Applying the patch from comment #8 fixes the issue. I tried going through the STR below in a m-c debug build and I couldn't reproduce crashes.

Builds Used:
============

* fx49.0a1, buildId: 20160602030220, changeset: 34a8be4346a9 - Couldn't reproduce the crash
** https://archive.mozilla.org/pub/firefox/nightly/2016/06/2016-06-02-03-02-20-mozilla-central/

* fx49.0a1, buildId: 20160603030242, changeset: e27fe24a746f - Doesn't reproduce the crash, but the tabs start appearing as blanks
** https://archive.mozilla.org/pub/firefox/nightly/2016/06/2016-06-03-03-02-42-mozilla-central/

STR #1:
=======

* download the latest version of m-c
* open a private browsing window via the hamburger menu
* open the poc.html in the private browsing window
* click on "Open Window"
* once duckduckgo opened, detach the tab into a seperate window

STR #2:
=======

* download the latest version of m-c
* go into about:preferences#privacy and select "Never Remember History"
* restart the browser via the prompt
* open the poc.html in the private browsing window
* click on "Open Window"
* once duckduckgo opened, detach the tab into a seperate window
Flags: needinfo?(tanvi)
Attached file poc.html
Thanks Kamil!  That sounds good enough for me!
Flags: needinfo?(tanvi)
FYI

(In reply to Kamil Jozwiak [:kjozwiak] from comment #13)
> Created attachment 8772969 [details]
> poc.html

Using the above in the latest nightly and my amended STR from https://bugzilla.mozilla.org/show_bug.cgi?id=1282589#c23

https://crash-stats.mozilla.com/report/index/a7d68e41-603a-472f-8eea-7da482160720
(In reply to Andrea Marchesini (:baku) from comment #8)
> Created attachment 8772580 [details] [diff] [review]
> crash.patch
> 
> I think this is the correct fix because TabContext is received by the CTOR
> of TabChild and it should not changed. What that line does it to update the
> privateBrowsing flag into originAttributes checking for a chrome window flag.
> That is fine for the parent window, but the window in the child process
> doesn't have that flag. But more important, we should change the private
> Browsing flag at all. We should use what we receive from the parent process.
> 
> The TabContext received by the parent process has privateBrowsing attribute
> set to 1. After that call, it's set to 0 and the comparison with the other
> TabContext fails.

But why doesn't mChromeFlags & nsIWebBrowserChrome::CHROME_PRIVATE_WINDOW have the same value as the pb flag in OA? That sounds like a bug to fix.
Perhaps we can remove
SetPrivateBrowsingAttributes(mChromeFlags & nsIWebBrowserChrome::CHROME_PRIVATE_WINDOW); but that should be just
no-op. We should be able to MOZ_ASSERT that if mChromeFlags & nsIWebBrowserChrome::CHROME_PRIVATE_WINDOW, then
also OA has pb set.
Comment on attachment 8772580 [details] [diff] [review]
crash.patch

We expect mChromeFlags & nsIWebBrowserChrome::CHROME_PRIVATE_WINDOW to be right elsewhere in the ::Init() too, so this patch just hides the issue in this one case.
Attachment #8772580 - Flags: review?(bugs) → review-
Attached patch crash.patch (obsolete) — Splinter Review
Attachment #8772580 - Attachment is obsolete: true
Attachment #8773237 - Flags: review?(bugs)
Blocks: 1288370
Comment on attachment 8773237 [details] [diff] [review]
crash.patch

>-  loadContext->SetPrivateBrowsing(
>-      mChromeFlags & nsIWebBrowserChrome::CHROME_PRIVATE_WINDOW);
>+  loadContext->SetPrivateBrowsing(OriginAttributesRef().mPrivateBrowsingId);
Perhaps 
loadContext->SetPrivateBrowsing(OriginAttributesRef().mPrivateBrowsingId > 0); for clarity
Attachment #8773237 - Flags: review?(bugs) → review+
Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/47d21cda1b6f
TabChild should not use CHROME_PRIVATE_WINDOW flag to set privateBrowsing, r=smaug
Attached patch crash.patchSplinter Review
Approval Request Comment
[Feature/regressing bug #]: 1269361
[User impact if declined]: creating a new window from a tab could make FF to crash.
[Describe test coverage new/current, TreeHerder]: manual tests.
[Risks and why]: none
[String/UUID change made/needed]: none
Attachment #8773237 - Attachment is obsolete: true
Attachment #8773240 - Flags: approval-mozilla-aurora?
https://hg.mozilla.org/mozilla-central/rev/47d21cda1b6f
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla50
Comment on attachment 8773240 [details] [diff] [review]
crash.patch

Fix a crash, taking it.
Attachment #8773240 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Reproduced the original issue using the following build:
* Crash: bp-5e8537c8-378c-4f62-b423-17dcf2160804
* Build: https://archive.mozilla.org/pub/firefox/nightly/2016/07/2016-07-20-03-02-08-mozilla-central/
** fx50.0a1, buildId: 20160720030208, changeset: ed8e23b5e0c7

Went through verification using the following builds:
* https://archive.mozilla.org/pub/firefox/nightly/2016/08/2016-08-04-03-04-41-mozilla-central/
* https://archive.mozilla.org/pub/firefox/nightly/2016/08/2016-08-04-00-40-04-mozilla-aurora/
* https://archive.mozilla.org/pub/firefox/candidates/49.0b1-candidates/build3/

Went through the test cases outlined in comment#12 without any issues on the following platforms:

* macOS 10.11.6 - PASSED
* Win 10 x64 VM - PASSED
* Ubuntu 14.04.5 LTS VM - PASSED

Codacoder, are you still seeing the original crash that you reported via bug#1282589? Or has his bug fixed the issue?
Flags: needinfo?(codacodercodacoder)
(In reply to Kamil Jozwiak [:kjozwiak] from comment #25)
> * macOS 10.11.6 - PASSED
> * Win 10 x64 VM - PASSED
> * Ubuntu 14.04.5 LTS VM - PASSED

 * Win7x64 Fx Nightly 51.0a1 (2016-08-04), History: never remember, e10s enabled

> 
> Codacoder, are you still seeing the original crash that you reported via
> bug#1282589? Or has his bug fixed the issue?

Bingo! Fixed.
Flags: needinfo?(codacodercodacoder)
(In reply to Codacoder from comment #26)
> Bingo! Fixed.

Awesome! Thanks for all your help with this :) It's definitely appreciated.
Status: RESOLVED → VERIFIED
So Im not sure if its the same bug (or maybe the same bug as  1282589 which was a dupe of this, but we are seeing crashes when tearing off container tabs again). See 1294237.
Blocks: 1269361
This was backed out from Firefox 49 in bug 1297687.
(In reply to :Ehsan Akhgari from comment #29)
> This was backed out from Firefox 49 in bug 1297687.

Does that mean this crash will still occur in Firefox 49?  I know you backed out other bugs as well, so maybe the issue doesn't exist in 49 after all the backouts?
Flags: needinfo?(ehsan)
(In reply to Tanvi Vyas [:tanvi] from comment #30)
> (In reply to :Ehsan Akhgari from comment #29)
> > This was backed out from Firefox 49 in bug 1297687.
> 
> Does that mean this crash will still occur in Firefox 49?  I know you backed
> out other bugs as well, so maybe the issue doesn't exist in 49 after all the
> backouts?

I'm not sure.  The only thing here that talks about the nature of the crash is comment 8 and 9, and they seem to suggest that the crash was caused by bug 1269361, contrary to what the dependencies of the bug suggested (before I fixed it a couple of days ago.)
Flags: needinfo?(ehsan)
That is to say, given the code talked about in comment 8 or 9 doesn't exist in 49 any more, I would expect the crash to also not exist.
(In reply to Andrea Marchesini [:baku] from comment #7)
> I have a simple STR that works 100% of the times. Smaug is right, the
> problem is in OriginAttributes.
> And in particular the problem is: mPrivateBrowsingId.
> I'm workin on it.

baku, what were your steps to reproduce?  Perhaps you or Kamil can go through those steps in Firefox 49 after the backout, just to confirm that we don't have a Firefox 49 crash.
Flags: needinfo?(kjozwiak)
Flags: needinfo?(amarchesini)
Nevermind, I see the STR are in comment 9 and 12.  Kamil, can you check the STR on a build of 49 that has this backed out?  You may have to wait a couple days, as I'm not sure the build has been generated yet.
Flags: needinfo?(amarchesini)
(In reply to Tanvi Vyas [:tanvi] from comment #34)
> Nevermind, I see the STR are in comment 9 and 12.  Kamil, can you check the
> STR on a build of 49 that has this backed out?  You may have to wait a
> couple days, as I'm not sure the build has been generated yet.

Sounds good! The release drivers mentioned that fx49 RC3 will GTB either tomorrow or Monday/Tuesday. I'll add an update once the build becomes available.. Leaving the needinfo for myself.
I went through the STR from comment#9 and comment#12 using FX49 RC3 [1] and I couldn't reproduce the original crash. I used the following platforms:

* macOS 10.11.6 x64 -> couldn't reproduce crash
* Ubuntu 16.04.1 LTS x64 - couldn't reproduce the crash
* Win 10 x64 - couldn't reproduce the crash

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1280105#c9
Flags: needinfo?(kjozwiak)
You need to log in before you can comment on or make changes to this bug.