Closed Bug 1362462 Opened 3 years ago Closed 3 years ago

Dragging a tab into a new window sometimes fails, destroying the browser in the window and any data.

Categories

(Firefox :: Tabbed Browser, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
Firefox 55
Tracking Status
firefox55 + fixed

People

(Reporter: Nika, Assigned: kmag)

References

Details

Attachments

(3 files)

Occasionally when I drag a tab into a new window on nightly the entirety of the tab's content is destroyed, and I lose any form data I was working on.

I've seen this happen on both OS X and Linux.

I have only seen this occur in the drag into empty space -> create window case, although it may also occur in the adoptTab case. The tab is removed from the original window, and then a new tab with the correct URL in the URL bar, is created in the new window, with no session history, and what appears to be an empty content frame.
No errors in the browser console? What URLs is this happening with? Could it be specific to file: ones or something?
Component: General → Tabbed Browser
Flags: needinfo?(michael)
(In reply to :Gijs from comment #1)
> No errors in the browser console? What URLs is this happening with? Could it
> be specific to file: ones or something?

This occurred to me with a Github issue tab. Not specific to the file content process I think.

Hasn't occurred to me in a while (although I have been much more cautious with dragging tabs into new windows now than usual). I'll look at the browser chrome console next time it happens to me.
Flags: needinfo?(michael)
This just happened to me again, when dragging a google docs document which was in the process of loading into a new tab.

I didn't notice it happened right away (I opened a new tab in the window & started browsing before I noticed), so there may be some extra clutter in the console logs.

I opened the Browser Toolbox for the new tab, I've attached a text file with what was in it.

I also used the inspector to get the source for the xul:browser element in the tab, and got this:

<xul:browser xmlns:xul="http://www.mozilla.org/keymaster/gatekeeper/there.is.only.xul" anonid="initialBrowser" type="content" message="true" messagemanagergroup="browsers" xbl:inherits="tooltip=contenttooltip,contextmenu=contentcontextmenu,autocompletepopup,selectmenulist,datetimepicker" xmlns:xbl="http://www.mozilla.org/xbl" tooltip="aHTMLTooltip" autocompletepopup="PopupAutoComplete" datetimepicker="DateTimePickerPanel" selectmenulist="ContentSelectDropdown" contextmenu="contentAreaContextMenu" clickthrough="never" autoscrollpopup="autoscroller" remote="true" remoteType="web" primary="true"/>

which seemed pretty normal to me.

The tab looked like:

<tab xmlns="http://www.mozilla.org/keymaster/gatekeeper/there.is.only.xul" class="tabbrowser-tab" fadein="true" context="tabContextMenu" linkedpanel="panel-287-1" first-tab="true" label="(omitted)" onerror="this.removeAttribute('image');" image="(omitted)" iconLoadingPrincipal="(omitted)" busy="true" selected="true" first-visible-tab="true" visuallyselected="true"/>
Flags: needinfo?(gijskruitbosch+bugs)
Hm. I don't know - the console errors all refer to one or more add-ons you have. It's hard to tell if they're related, and if they are, how. You could try disabling the add-ons in question (if you click the 'contentscript.js' reference you should be able to find out which add-on that's from). Maybe Mike has a better idea of what might be happening here?
Flags: needinfo?(gijskruitbosch+bugs) → needinfo?(mconley)
(In reply to :Gijs from comment #4)
> Hm. I don't know - the console errors all refer to one or more add-ons you
> have. It's hard to tell if they're related, and if they are, how. You could
> try disabling the add-ons in question (if you click the 'contentscript.js'
> reference you should be able to find out which add-on that's from). Maybe
> Mike has a better idea of what might be happening here?

I am currently running the following addons: 
Close Other Windows (webextension) - https://addons.mozilla.org/en-US/firefox/addon/close-other-windows/
Gecko Profiler (webextension) - https://perf-html.io
Min Vid (test pilot) - https://testpilot.firefox.com/experiments/min-vid
Test Pilot (test pilot) - https://testpilot.firefox.com/
uBlock Origin (legacy) - https://addons.mozilla.org/en-US/firefox/addon/ublock-origin/
Wayback Machine (webextension) - https://addons.mozilla.org/en-US/firefox/addon/wayback-machine_new/

I looked around and discovered that there is a contentscript.js in uBlock Origin here: https://github.com/gorhill/uBlock/blob/139d97179f82e73025d9d498ab1dd710b8325bae/src/js/contentscript.js#L1632. The lines are a bit different, I found this line by looking at the version in dxr (https://dxr.mozilla.org/addons/source/addons/607454/js/contentscript.js#1640).

I doubt that that is the cause of my issues, but it is possible.
I asked Jeff who was also running into this problem, and he was not running uBlock Origin on the profile which he ran into the problem on. I don't think it's the cause of this issue.
I have one extension (no addons): the gecko profiler. I'm seeing this on Windows in Nightly, too. Probably for ~1 week.
[Tracking Requested - why for this release]:

This bug that causes users to lose data in a very visible and easy-to-blame-on-firefox way. It seems to be happening more frequently to me, and occurring mostly when I have a textarea focused or a form which I am trying to fill. It's occurred to multiple people in the Toronto office on multiple platforms. It doesn't seem to be addon related either.
tracking for 55 based on comment 8.
Attached file blanknewdocument.html
I've found reliable STR and attached the document - currently debugging to figure out what the problem is.

It appears as though the problem is caused because this call into platform returns NS_ERROR_NOT_IMPLEMENTED http://searchfox.org/mozilla-central/rev/24c443a440104dabd9447608bd41b8766e8fc2f5/toolkit/content/widgets/browser.xml#1393 

This error is then gobbled up by the try {} catch {} and ignored - meaning that we report no error to the user, but silently trash their data. We should almost certainly capture that error, and probably respond to it by refusing to perform the swap, as if we don't then we destroy the user's data.

This internally is being returned by nsSubDocumentFrame::BeginSwapDocShells which is being called from http://searchfox.org/mozilla-central/rev/24c443a440104dabd9447608bd41b8766e8fc2f5/dom/base/nsFrameLoader.cpp#1431

Namely, the problem is that mDidCreateDoc is not true on the `this` object, which I believe is the origin frameloader's nsSubDocumentFrame (though I may have gotten the objects mixed up).

It seems as though documents created within the content process (for example through link clicks) don't correctly set this property on their nsSubDocumentFrame? It should be possible to get a regression range from the STR which I've attached.

In the parent process I get the warning when in a debug build:
[Parent 26809] WARNING: [nsFrameLoader] ReallyStartLoadingInternal tried but couldn't show remote browser. 

which is emitted from http://searchfox.org/mozilla-central/rev/24c443a440104dabd9447608bd41b8766e8fc2f5/dom/base/nsFrameLoader.cpp#785

This seems to be caused by http://searchfox.org/mozilla-central/rev/24c443a440104dabd9447608bd41b8766e8fc2f5/dom/base/nsFrameLoader.cpp#1274 which returns null because mFrameLoader on the RenderFrameParent is null, which was changed by bug 1353060, so I'm inclined to mark that bug as the cause.
Blocks: 1353060
I've done some mozregression work, and the STR which I attached in comment 10 are definitely not working in 2017-05-10 and are working in 2017-05-12 - which inclines me to think that bug 1353060 is the culprit.

NOTE: That bug was landed only 8 days ago, while this bug was filed 14 days ago. I think there is another issue behind this bug, but it got much worse due to bug 1353060 in the last 8 days.

See also bug 1366103 which mentions the large number of occurrences of this warning.
Flags: needinfo?(kmaglione+bmo)
I believe this is the same issue as bug 1362378.

Although it's interesting to know that this issue predates bug 1353060.
Flags: needinfo?(kmaglione+bmo)
(In reply to Kris Maglione [:kmag] (busy; behind on reviews) from comment #12)
> I believe this is the same issue as bug 1362378.
> 
> Although it's interesting to know that this issue predates bug 1353060.

I believe that there are multiple causes, and bug 1353060 is just one of them, that also happens to currently be the easiest to reproduce.
Hm. So, it looks like there are cases where we get to nsFrameLoader:ShowRemoteFrame when the RenderFrameParent doesn't have its mFrameLoader property set yet. In those cases, it's created from AllocPRenderFrameParent with a null frameloader, and presumably one gets attached later.

With the previous check in ShowRemoteFrame, we checked the layer manager for the frameloader's current document, rather than that the RenderFrameParent could find an acceptable layer manager, so the check still passed. And presumably in many/most cases, things wound up working out anyway.

But this seems wrong. We should find a way to make sure the RFP is connected to the right frameloader when we try to show it.
(In reply to :Gijs from comment #4)
> Mike has a better idea of what might be happening here?

Sounds like mystor has found the big pieces here. Clearing ni?.
Flags: needinfo?(mconley)
This patch also fixes bug 1362378.

I still think something is flaky here, though. We generally shouldn't have a TabParent or FrameLoader for a frame element that hasn't been inserted into the document yet, so I think there's a timing issue somewhere that should be addressed. But this patch at least gives us the behavior we had prior to bug 1353060, so I think it's a reasonable short term solution.
Comment on attachment 8872452 [details]
Bug 1362462: Fallback to main document widget for frames which haven't been inserted yet.

https://reviewboard.mozilla.org/r/143952/#review148424

These fallbacks seem okay - though I agree, it seems pretty weird to have TabParent/FrameLoader when we're not in the DOM...

(In reply to Kris Maglione [:kmag] (busy; behind on reviews) from comment #17)
> I still think something is flaky here, though. We generally shouldn't have a
> TabParent or FrameLoader for a frame element that hasn't been inserted into
> the document yet, so I think there's a timing issue somewhere that should be
> addressed.

Sounds like there's a follow-up bug to be filed.
Attachment #8872452 - Flags: review?(mconley) → review+
(In reply to Mike Conley (:mconley) from comment #18)
> (In reply to Kris Maglione [:kmag] (busy; behind on reviews) from comment #17)
> > I still think something is flaky here, though. We generally shouldn't have a
> > TabParent or FrameLoader for a frame element that hasn't been inserted into
> > the document yet, so I think there's a timing issue somewhere that should be
> > addressed.
> 
> Sounds like there's a follow-up bug to be filed.

Yeah, I need to investigate some more. I'll file a follow-up when I have more details about what exactly is happening.
Assignee: nobody → kmaglione+bmo
Pushed by maglione.k@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/63cc19a2300f
Fallback to main document widget for frames which haven't been inserted yet. r=mconley
https://hg.mozilla.org/mozilla-central/rev/63cc19a2300f
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 55
Is this something we may want to uplift to 55? It looks like bug 1362378 may be a duplicate (and it's marked as affected for 55)
Flags: needinfo?(kmaglione+bmo)
Nevermind. Looks like this was fixed in 55/56 already!
Flags: needinfo?(kmaglione+bmo)
You need to log in before you can comment on or make changes to this bug.