Using history.pushState in the main page breaks history tracking for inner frames

RESOLVED FIXED in mozilla2.0b12

Status

()

defect
RESOLVED FIXED
9 years ago
8 years ago

People

(Reporter: mossop, Assigned: mossop)

Tracking

unspecified
mozilla2.0b12
x86
macOS
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite +
in-litmus -

Firefox Tracking Flags

(blocking2.0 betaN+)

Details

(Whiteboard: [softblocker])

Attachments

(1 attachment, 4 obsolete attachments)

about:addons in Firefox is loaded in a <browser type=content> in the main window. Inside it there is a nested <browser type=content> where we load the Get Add-ons content from AMO. In bug 601143 I'm using a progress listener to ensure that inner browser never ends up loading from some other site. After a short period of use (possibly tied to GC?) my listener stops receiving onLocationChange events. It continues to receive onStateChange events though.

Currently I've failed to generate a simple testcase and for now I can workaround the problem but I'd like to at least reference a bug in my workaround and hopefully dig into it more when I have spare moments.
Dave, how do I go about reproducing this?
As I say I've failed to make a simple testcase however with the patch in bug 601143 applied and setting extensions.webservice.discoverURL to a url that has links to other sites.

I can reproduce by just browsing around a bit mostly but it seems to happen more frequently if you click a link to a different domain (which causes that code to cancel the load in onLocationChange, clicking to a different add-ons category and then back to get add-ons again will reload the initial page and then almost always onLocationChange never gets fired again.
Some kind of correlation with the problems in bug 591905 here.

I can reproduce this now. If you open the add-ons manager, select the extension pane (or any pane that isn't get add-ons) then close and re-open the add-ons manager and then select the get add-ons pane no onLocationChange event will ever be sent.

If on the other hand you open the add-ons manager, make sure get add-ons is selected and then close and open the add-ons manager then onLocationChange will be sent for every page navigation.

In those two cases navigating around in the browser causes different things to happen to the session history for the whole tab.
Blocks: 591905
Oh, this error is popping up for the broken case:

Error: [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIWebNavigation.sessionHistory]"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: chrome://global/content/bindings/browser.xml ::  :: line 657"  data: no]
Source File: chrome://global/content/bindings/browser.xml
Line: 664

Looks like it isn't wiring up sessionHistory when the browser is hidden or something?
Err ignore that last, was testing without disablehistory="true" for a moment. It always throws that error in that case.
In the case where onLocationChange isn't sent it seems to be because we're bailing out here: http://hg.mozilla.org/mozilla-central/annotate/3d49b553bc4b/docshell/base/nsDocShell.cpp#l1776
(In reply to comment #6)
> In the case where onLocationChange isn't sent it seems to be because we're
> bailing out here:
> http://hg.mozilla.org/mozilla-central/annotate/3d49b553bc4b/docshell/base/nsDocShell.cpp#l1776

In both cases isRoot is false but in the case where onLocationChange is fired isSubFrame is true
So why does isSubFrame end up false?
It looks like the history entry for the inner-browser isn't properly getting added to the parent history entry for the tab. I'm guessing something has decided that in the non-working case that the browser element is the top level (perhaps because at that point we cross the content/chrome type boundary?)

I've added some logging to dump out the history trees and when loading about:addons with Get Add-ons visible and navigating once in it I see something that seems sensible to me (each line is SHEntry address, url, isSubFrame):

**** SH Transaction #0
1b9245f0 about:addons (0)
	5bc7d60 about:blank (0)
**** SH Transaction #1
2104c1f0 about:addons (1)
	29351960 http://www.oxymoronical.com/ (1)
**** SH Transaction #2 (current)
22e83410 about:addons (1)
	21004a10 http://www.oxymoronical.com/blog/2011/01/Playing-with-windows-in-restartless-bootstrapped-extensions (1)
***********************

If I do it for the case where about:addons loads without Get Add-ons visible and I switch to it and then navigate to a page this is all the history contains:

**** SH Transaction #0
29103bb0 about:addons (0)
**** SH Transaction #1
2912c3a0 about:addons (0)
**** SH Transaction #2
29168870 about:addons (1)
**** SH Transaction #3 (current)
210e12d0 about:addons (1)
***********************

Additionally the SHEntry that it is testing for whether to dispatch onLocationChange or not is 21000180, not in the session history for the tab at all.
> (perhaps because at that point we cross the content/chrome type boundary?)

But we don't cross a docshell type boundary here, right?
(In reply to comment #10)
> > (perhaps because at that point we cross the content/chrome type boundary?)
> 
> But we don't cross a docshell type boundary here, right?

No I guess not. I guess is probably to do with the decks (there are two nested here). When the browser element is hidden by the decks history seems to go funky, this happens in the broken case above but also from the working case above changing to the extensions category (and so hiding the browser) changes the history to:

**** SH Transaction #0
256d49d0 about:addons (1)
**** SH Transaction #1 (current)
23fce070 about:addons (0)
***********************

This history seems broken, not only is the inner-browser missing again but the first entry as a subframe entry seems wrong to my eyes. The old history entries got removed by RemoveDynEntries and from then on onLocationChange stops firing again.
Sounds like Olli-territory!
Actually I think it is the act of calling pushState which throws away the child history states for the inner browser
Ok, I think pushState is creating a new top level SHEntry but it isn't adding a child SHEntry to it for the inner-browser, then when AddEntry is called on nsSHistory in RemoveDynEntries the current index has just the global entry so it goes back over all the previous history entries and removes their child entrys for the inner-browser.

I think this leaves the inner-browser using a SHEntry that is unlinked from the hierarchy and so navigating in it starts to behave strangely.
I guess this actually matches what happens for normal history. If you navigate a bit in a frameset page and then go somewhere new the history for all the subframes are thrown away because the frames are gone.
Posted patch broken testcase (obsolete) — Splinter Review
Adding cloning of child entries did seem to make the behaviour in about:addons pretty much what I would expect but then trying to make a simpler testcase for this reveals a different behaviour again so there is obviously more going on than I see yet.

This attachment has a mochitest and some logging of the history entries. It currently times out at the first attempt to go back but here is the simplified code run and logged history:

Test start:

**** SH Transaction #0 (current)
1f678e50 test_historyframes.html (0)

	22019180 data:text/html,<p%20id='text'>Start</p> (0)

***********************
**** Not firing onLocationChange for non-subframe entry 22019180

window.history.pushState("START", window.location);

**** SH Transaction #0
1f678e50 test_historyframes.html (0)

	22019180 data:text/html,<p%20id='text'>Start</p> (0)

**** SH Transaction #1 (current)
2201d760 test_historyframes.html (0)
"START"
***********************

load inner frame data:text/html,<p%20id='text'>Test1</p>
loads it into the first history entry, I would have expected a 3rd entry to be created

**** SH Transaction #0
1f678e50 test_historyframes.html (0)

	22019180 data:text/html,<p%20id='text'>Test1</p> (0)

**** SH Transaction #1 (current)
2201d760 test_historyframes.html (0)
"START"
***********************
**** Not firing onLocationChange for non-subframe entry 22019180

load inner frame data:text/html,<p%20id='text'>Test2</p>
creates a new history entry with no information about the frame's url

**** SH Transaction #0
1f678e50 test_historyframes.html (0)

	22019180 data:text/html,<p%20id='text'>Test1</p> (0)

**** SH Transaction #1
2201d760 test_historyframes.html (0)
"START"
**** SH Transaction #2 (current)
22031b40 test_historyframes.html (1)

***********************
**** Not firing onLocationChange for non-subframe entry 22031a20

window.history.back();
Jumps back to the very first history entry skipping over one

**** SH Transaction #0 (current)
1f678e50 test_historyframes.html (0)

	22019180 data:text/html,<p%20id='text'>Test1</p> (0)

**** SH Transaction #1
2201d760 test_historyframes.html (0)
"START"
**** SH Transaction #2
22031b40 test_historyframes.html (1)

***********************
Posted patch experiment patch (obsolete) — Splinter Review
This is the experiment at cloning child entries, seems to have a positive benefit in my tests but still think there is something not right.
Note to self: should make pushState do things as close as possible to anchor navigation.
I tried to look into what pushState was doing differently to anchor navigation. Looks like they do the same thing and indeed anchor navigation combined with inner frames seems busted. Simple STR:

1. Load the page http://download.oracle.com/javase/7/docs/api/
2. In the address bar append an anchor like http://download.oracle.com/javase/7/docs/api/#bar
3. Click one of the classes in the bottom left frame like "AbstractAction"

At this point the history looks like this:

**** SH Transaction #0
23a32290 http://download.oracle.com/javase/7/docs/api/ (0)
	1eda71f0 http://download.oracle.com/javase/7/docs/api/overview-frame.html (0)
	3131db0 http://download.oracle.com/javase/7/docs/api/allclasses-frame.html (0)
	23a64af0 http://download.oracle.com/javase/7/docs/api/overview-summary.html (0)
**** SH Transaction #1
256c16e0 http://download.oracle.com/javase/7/docs/api/#bar (0)
**** SH Transaction #2 (current)
2418d010 http://download.oracle.com/javase/7/docs/api/#bar (1)
***********************

I would expect back to undo the inner-frame navigation instead all that happens is the #bar gets dropped from the URL, the content stays the same. It jumps back to SH transaction 0. Forward/Back then flip you between SH transaction 0 and 1, the url changes but the content stays the same. At transaction 1 the forward button is enabled but clicking it does nothing.

If I leave out adding the anchor at step 2 then the history looks as I'd expect and back/forward do what I'd expect.

**** SH Transaction #0
20694a50 http://download.oracle.com/javase/7/docs/api/ (0)
	256d9240 http://download.oracle.com/javase/7/docs/api/overview-frame.html (0)
	1ed7ec50 http://download.oracle.com/javase/7/docs/api/allclasses-frame.html (0)
	26960680 http://download.oracle.com/javase/7/docs/api/overview-summary.html (1)
**** SH Transaction #1 (current)
1aa22d80 http://download.oracle.com/javase/7/docs/api/ (1)
	26adf270 http://download.oracle.com/javase/7/docs/api/overview-frame.html (1)
	241976d0 http://download.oracle.com/javase/7/docs/api/allclasses-frame.html (1)
	1aa31350 http://download.oracle.com/javase/7/docs/api/javax/swing/AbstractAction.html (1)
***********************
Updating the summary to be something closer to what is going on
Summary: Webprogress listener randomly stops receiving onLocationChange events for a browser inside content → Using anchor navigation or history.pushState in the main page breaks history tracking for inner frames
Posted patch WIP (obsolete) — Splinter Review
Ok here is an attempt to fix this. It currently only solves the problem for pushState but only because I need to work out how to detect that a load is an anchor navigation, I've tested and it will also solve that case too. I've simply added a new function to clone all the children of one shentry into another and then used then in AddToSessionHistory in the right case.

I also made it so the state for a session gets cloned on inner-frame navigations. This means if you pushState, then load inner frame, then pushState, going back properly sends popState for the old state because it now appears on the middle history entry.

These two things make my test case pass (mostly, see below) and with a separate patch for bug 507274 it makes the navigation in the add-ons manager work perfectly as near as I can tell.

Two issues with the testcase. The getURL function has to use the document's location property, for some reason the frame's src property doesn't get updated properly on history changes. This seems to happen regardless of pushState or my code changes or anything so it is either a different issue or something that isn't meant to happen.

The final failure is where the state turns out to be null rather than START. This is an odd one, it seem the first time the test sets the src property on the frame it loads the new url and replaces the existing history entry rather than creating a new one. Maybe this is intentional, I'm not sure.

I think this patch is mostly ready for review but I'd appreciate feedback and if there is a quick way to detect a load as being an anchor navigation then that'd save me time.
Attachment #506066 - Attachment is obsolete: true
Attachment #506067 - Attachment is obsolete: true
Attachment #507277 - Flags: feedback?(bzbarsky)
Ugh.  Now that I think about it, I think we may even have bugs about that anchor+iframe thing, though I can't find them right now.

The .src of an <iframe> is the value of the DOM "src" attribute.  It doesn't change as the document inside is navigated.  I'd prefer you use documentURI instead of location, though.

As far as detecting anchor navigation, the simplest way to handle that is probably to add an argument to SetNewURI and AddToSessionHistory and propagate that information down from the caller.  But I'd be tempted to just fix pushstate for 2.0, not change anchor behavior in the interests of keeping down regression risk, and file a folllowup to fix that part right after 2.0 ships.  Thoughts?
blocking2.0: --- → betaN+
Whiteboard: [softblocker]
Oh, and I need to stare at the code a bit more than I can manage before bedtime tonight to say anything intelligent about the core part of the patch here....
(In reply to comment #22)
> Ugh.  Now that I think about it, I think we may even have bugs about that
> anchor+iframe thing, though I can't find them right now.
> 
> The .src of an <iframe> is the value of the DOM "src" attribute.  It doesn't
> change as the document inside is navigated.  I'd prefer you use documentURI
> instead of location, though.

Makes sense, I'll switch that after you've taken a closer look.

> As far as detecting anchor navigation, the simplest way to handle that is
> probably to add an argument to SetNewURI and AddToSessionHistory and propagate
> that information down from the caller.  But I'd be tempted to just fix
> pushstate for 2.0, not change anchor behavior in the interests of keeping down
> regression risk, and file a folllowup to fix that part right after 2.0 ships. 
> Thoughts?

Yeah regressions were one thing that concerned me, makes total sense to leave the anchor navigation alone since I've just checked and that was broken in 3.6 too (ever so slightly different results though). The main bit I care about for 4.0 is getting thing working sanely in the add-ons manager.
(In reply to comment #21)
> The final failure is where the state turns out to be null rather than START.
> This is an odd one, it seem the first time the test sets the src property on
> the frame it loads the new url and replaces the existing history entry rather
> than creating a new one. Maybe this is intentional, I'm not sure.

I figured this out, because the test loads the new url into the frame during the main page's load event listener the history entry gets replaced (see http://mxr.mozilla.org/mozilla-central/source/docshell/base/nsDocShell.cpp#1296)
I couldn't find a specific bug for anchor navigation so for now I filed bug 629559 to track it.
Assignee: nobody → dtownsend
Summary: Using anchor navigation or history.pushState in the main page breaks history tracking for inner frames → Using history.pushState in the main page breaks history tracking for inner frames
Posted patch patch rev 1 (obsolete) — Splinter Review
I think this is ready for review now
Attachment #507277 - Attachment is obsolete: true
Attachment #507726 - Flags: review?(bzbarsky)
Attachment #507277 - Flags: feedback?(bzbarsky)
Whiteboard: [softblocker] → [softblocker][has patch][needs review bz]
Comment on attachment 507726 [details] [diff] [review]
patch rev 1

You should probably add some comments to CloneAndReplaceData about |cloneID| only being meaningful if |replaceEntry| is non-null.

>-        dest->SetIsSubFrame(PR_TRUE);
>+        if (replaceEntry)
>+            dest->SetIsSubFrame(PR_TRUE);

Why that change?

> +    gFrame.removeEventListener("load", arguments.callee, true);

Given that support of arguments.callee is being phased out, I'd rather not use it here.  Just create a new function?

  var f = function() { ...removeEventListener("load", f, true); }

and so forth.

It'd be nice to actually use the mochitest template for the test.  It includes links back to the bug, etc.

Is there a reason you're using capturing load listeners on gFrame?  Also, I'd prefer we run the callback async from the load listener in waitForLoad, because as you noticed history does weird things if you navigate inside onload....

>+  window.history.back();
>+  window.history.back();
>+
>+  is(gState, "START", "State should be correct");

Worth checking getURL and getContent there too.

The rest looks fine, but I'd like to understand the SetIsSubframe change.
Comment on attachment 507726 [details] [diff] [review]
patch rev 1

Will address some of the comments, also apparently the test is timing out on the tinderboxen but not locally :(
Attachment #507726 - Flags: review?(bzbarsky)
Posted patch patch rev 2Splinter Review
Needed a little extra work since I realised that the previous patch only covered the case of pushState in the top level document. If in an inner frame (as the test would run) it would still fail.

So instead of the previous changes to CloneAndReplace I've made it support the option to deep clone even the replaced entry and used that in both cases. Also made the same test run once inside a frame and once in its own window.

The one change is that now isSubFrame will be true for these history entries. I previously stopped that because I assumed it was only meant to be true when an inner frame had navigated, now I'm not sure so for now it will be true for any pushState
Attachment #507726 - Attachment is obsolete: true
Attachment #508121 - Flags: review?(bzbarsky)
Comment on attachment 508121 [details] [diff] [review]
patch rev 2

r=me, but the two if branches in CloneAndReplaceChild are looking more and more similar.  Perhaps a followup bug to refactor that a bit to share the code?
Attachment #508121 - Flags: review?(bzbarsky) → review+
Whiteboard: [softblocker][has patch][needs review bz] → [softblocker][has patch]
Landed: http://hg.mozilla.org/mozilla-central/rev/4b90fd0c1c4d

Filed bug 631269 on the cleanup
Flags: in-testsuite+
Flags: in-litmus-
Whiteboard: [softblocker][has patch] → [softblocker]
Target Milestone: --- → mozilla2.0b12
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Depends on: 685782
No longer depends on: 685782
You need to log in before you can comment on or make changes to this bug.