Closed Bug 462076 Opened 11 years ago Closed 9 years ago

Dynamically inserted iframes on refresh sometimes trade places

Categories

(Core :: Document Navigation, defect, major)

defect
Not set
major

Tracking

()

RESOLVED FIXED
Tracking Status
blocking2.0 --- betaN+

People

(Reporter: erinelf, Assigned: smaug)

References

(Depends on 1 open bug, Blocks 2 open bugs)

Details

Attachments

(5 files, 5 obsolete files)

User-Agent:       Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; en-US; rv:1.9.0.3) Gecko/2008092414 Firefox/3.0.3
Build Identifier: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; en-US; rv:1.9.0.3) Gecko/2008092414 Firefox/3.0.3

On a page with several dynamically inserted iframes, on refresh of the page, the src attributes of the frames do not match the value of contentWindow.location, causing the frames to display the wrong content

Reproducible: Always

Steps to Reproduce:
1. Unzip the attached archive to your filesystem
2. Load index.html from the filesystem
3. Refresh the page

Actual Results:  
At the top of the page is a log describing what happened on the page.  It prints errors when it detects the errant values. On refresh, you will notice the errors.  The iframes load the wrong content.  

If it doesn't happen on the first refresh, refresh again until the behavior occurs.  This is usually very fast.

Expected Results:  
The iframes' locations should be identical to their src attributes, and the correct content should be loaded.

The randomization in the test case appears to make this problem happen with much greater frequency.  We see this in our webapp when we can't predict the iframe load order on the page.
Attached file Test case
This is the test case to reproduce the bug.
Component: Session Restore → General
QA Contact: session.restore → general
Olli was able to reproduce with this test case, so confirming. Erin: is Olli's modified reduced testcase included in that zip file?

I'm predicting that as more sites try to optimize for mobile browsers like the iPhone, this sort of pattern might become more common. See #2 in this article for details: 

http://code.flickr.com/blog/2008/10/27/lessons-learned-while-building-an-iphone-site/

and from bz-over-email:

> Fundamentally, it seems to me that the problem is that we add session
> history entries and docshells to their relevant trees in addition order,
> wheareas to make this testcase work we'd need to add them in DOM order.

taking a guess at -->Core::Document Navigation
Status: UNCONFIRMED → NEW
Component: General → Document Navigation
Ever confirmed: true
OS: Mac OS X → All
Product: Firefox → Core
QA Contact: general → docshell
Hardware: Macintosh → All
#2 in that article is only a problem for us if the dynamically grabbed content includes iframes in it.

And if so, then us not removing history entries for iframes that are removed from the DOM would be a much bigger problem than this bug...
Yes, confirmed, I included Olli's modification to my index.html in the attached
testcase, as well as my original firebug-dependent index-firebug.html.
re comment #3 - ok, please ignore my red herring there; didn't mean to conflate!
Attached patch wip (obsolete) — Splinter Review
This is just a wip, but fixes the testcase.
I need to think and test how this works with pages in bfcache vs. not in bfcache etc.
Comment on attachment 345699 [details] [diff] [review]
wip

This is basically the same as https://bugzilla.mozilla.org/attachment.cgi?id=317292&action=edit which never landed because at that time less risky patch was more appropriate.
Comment on attachment 345699 [details] [diff] [review]
wip

This unfortunately makes our history data way too large.
A different approach under investigation...
Flags: blocking1.9.1?
Erin's nominated this for blocking, which sounds sensible, considering that attachment 317292 [details] [diff] [review] (the one that didn't land in favour of a less risky approach) was for bug 426646 which was a 1.9 blocker and identified as a regression. While bug 426646 is marked FIXED, looks like it wasn't completely FIXED, so re-adding the regression keyword here and suggesting that we do consider this a blocking issue.

Put more simply, and in Erin's words: this is "kinda really bad."
Keywords: regression
But we have had pretty similar session history behavior for ages.
So what part of this is a regression? Bug 426646 was a bit different thing, AFAIK.
Yeah, if there's actually a regression here, please file a separate bug with a testcase that demonstrates the regression.  As far as I know, this code has behaved the way it does going back to Gecko 1.7 at least (and possibly earlier).
I may have over interpreted what Smaug was saying in comment 7.

If it's truly the case that this behaviour is unaltered since Gecko 1.7, then this isn't a regression (Erin: can you confirm that this behaves the same in Firefox 2 as it does in Firefox 3?), it's just kind of crappy that if you dynamically insert an iframe and the page refreshes, those iframes can switch places.
Keywords: regression
Firefox 2 exhibits the same behavior.
Just to note, firebug may have an effect on this problem as it seems to have on: https://bugzilla.mozilla.org/show_bug.cgi?id=464064

Make sure to test out in firefox safemode with ad-ons disabled.
I tested that before I filed the bug (I am ordinarily suspicious of firebug) -- it definitely happens in a vanilla installation of Firefox.
Flags: blocking1.9.1? → blocking1.9.1-
> "it's just kind of **** that if you dynamically insert an iframe and the page refreshes, those iframes can switch places"

It's even worse, the test case on bug 473436 (which attachment 345699 [details] [diff] [review] fixes for me) is a repeatable bug where one (or more) iframe(s) is (are) overwritten with the contents of a single iframe - thus you lose all but one iframe on the page (the 'winner' overwrites all dynamically created iframes).
This still exists in 3.0.10
Blocks: 492947
Attached file Simplified testcase
Simplified static testcase which only needs a shift+reload to trigger the assertion.
Assignee: nobody → Olli.Pettay
Status: NEW → ASSIGNED
This is probably a dup of bug 363840 which has in turn been marked a dup of bug 279048
Blocks: 508537
Depends on: 527897
Attached patch v12 (obsolete) — Splinter Review
Backup copy of the current patch.

Needs still automatic tests, lots of them. I need to mochitestify my
current tests.
Should fix this and bug 307241, bug 314457, bug 344216, bug 396836,
bug 464064 and bug 508537. And some non-filed bugs.
Most of the other bugs are *not* dupes of this one.

Tries to copy IE's behavior when possible (since IE has the least buggy session
history atm). But doesn't copy IE's bugs, like when going back to a with dynamically added iframes, web pages get loaded to wrong iframes.
(Webkit's session history is generally worse than Gecko's and Opera has some bugs too.)
Attached patch v13 (obsolete) — Splinter Review
Attachment #421057 - Attachment is obsolete: true
Attached patch v14 (obsolete) — Splinter Review
Doesn't break the tests for 255820 and 445004
Attachment #421062 - Attachment is obsolete: true
Attached patch v16 + tests (obsolete) — Splinter Review
This might be ready for the first review.
Will ask after tryserver tests.
Attachment #345699 - Attachment is obsolete: true
Attachment #421138 - Attachment is obsolete: true
Ok, seems like there is at least one left-over
//NS_ENSURE_TRUE(aChild, NS_ERROR_FAILURE);
That should be just removed.
Attachment #421284 - Flags: review?(bzbarsky)
Comment on attachment 421284 [details] [diff] [review]
v16 + tests

Let's see what Boris says about this approach;
the idea is to handle dynamic changes a bit differently. It *seems* that others do that too.

nsSHEntry keeps a list of child entries. The start of the list is for static entries and dynamic entries are appended to the end.
Dynamic entries don't get used for (re)loading when their parent is (re)loaded. 

To detect "dynamic changes", I added InDocumentWriteOrInnerHTML (so that dynamic parser usage can be detected) and InUnlinkOrDeletion() for document destroying.

onload handling in
nsDocShell::OnStateChangeis there because of pretty similar case in nsDocShell::LoadURI. OnStateChange just handles document.write.

nsSHistory::CompareFrames is changed so that something is loaded to an nsIDocshell only if docshell/historyIDs match.

To prevent exponential nsSHEntry count when there are dynamic changes, history tree needs to be cleaned up when there are useless entries. That is the reason for RemoveEntries.
If or when bug 254144 is fixed, unbinding (i)frame may need to store the related
nsSHEntries somewhere and restore them if (i)frame is bound back to document.

Shouldn't be too difficult to implement.
Boris, do you want me to update the patch? Though the code shouldn't have been
changed a lot.
Don't worry about it; I'll just review what's here.
OK, so I've been trying to understand the InDocumentWriteOrInnerHTML flag.  Is the idea to flag frames that were "in the original source"?  A problem with the implementation in the patch is that something like:

  document.write("<" + "script src='something'></" + "script>");
  document.write("<iframe></iframe>");

won't create the HTMLFrameElement for the <iframe> until the <script> is loaded.  At that point, InDocumentWriteOrInnerHTML will be false.  Is that ok?  That is, is this meant to be a very simple heuristic that only catches certain cases?

There are of course other issues involving partial tag writes, etc, but those are even more esoteric than the above....
Aha, seems like I need to hack parser a bit, so that it could set the
flag when it knows when it is processing document.write or something.
Depends on: 569538
Comment on attachment 421284 [details] [diff] [review]
v16 + tests

I'm updating this patch to use bug 569538.
Attachment #421284 - Attachment is obsolete: true
Attachment #451002 - Flags: review?(bzbarsky)
Attachment #421284 - Flags: review?(bzbarsky)
blocking2.0: --- → ?
Blocks: 527897
No longer depends on: 527897
blocking2.0: ? → betaN+
Comment on attachment 451002 [details] [diff] [review]
using the new FromParser flags

>+++ b/content/base/src/nsDocument.cpp

Do we want to also set mInUnlinkOrDeletion when we're unbinding our kids in ResetToURI?

And around the DestroyContent stuff in nsDocument::Destroy?

>+++ b/content/base/src/nsObjectLoadingContent.h

Might be good to document the meaning of mNetworkCreated (and same in the frameloader.

>+++ b/content/html/content/src/nsGenericHTMLElement.h
>+  PRPackedBool            mNetworkCreated;

And here.

In particular, the fact that it can become false after being true is non-obvious.

>+++ b/docshell/base/nsDocShell.cpp

I wonder whether there's a sane way to refactor the HasDynamicallyAddedChild(&dynamic) bits.... probably not.  :(

Plus it's slightly diffeent in the two callsites (one checks oshe, one does not).  Document the difference?  And document that the child offset will be -1 in various cases (e.g. if any other sibling is dynamic)?

>+nsDocShell::RemoveFromSessionHistory()
>+    nsTArray<PRUint64> ids;

nsAutoTArray, maybe?

>+nsDocShell::ClearFrameHistory(nsISHEntry* aEntry)
>+  nsTArray<PRUint64> ids;

Auto array here too.

>@@ -5575,16 +5667,36 @@ nsDocShell::OnStateChange(nsIWebProgress

Document why the change here is needed?

>@@ -9022,16 +9138,38 @@ nsDocShell::OnNewURI(nsIURI * aURI, nsIC
>+        // Since we're shift-reloading, clear all the sub frame history.

force-reloading.

>+        ClearFrameHistory(mLSHE);
>+        ClearFrameHistory(mOSHE);

Does it make sense to use the same index into overall session history for both of those?  I guess so.....

>+++ b/docshell/base/nsDocShell.h
>+    PRPackedBool               mDynamicallyCreated;

Document?

>+++ b/docshell/base/nsIDocShell.idl
>+   * The ID of the dochshell in the session history.

"docshell".

Maybe put this at the end of the interface?

>+++ b/docshell/shistory/src/nsSHEntry.cpp
>@@ -107,16 +107,17 @@ nsSHEntry::nsSHEntry()

Do you need to init mDocShellID here?

> nsSHEntry::AddChild(nsISHEntry * aChild, PRInt32 aOffset)
> {
>-  NS_ENSURE_TRUE(aChild, NS_ERROR_FAILURE);
>+  //NS_ENSURE_TRUE(aChild, NS_ERROR_FAILURE);

Just delete instead.

>+RemoveFromSessionHistoryContainer(nsISHContainer* aContainer,
>+        aContainer->RemoveChild(child);

Need to do --i, --childCount here or something, no?  Except that RemoveChild doesn't always change the array length...

The safe thing to do here would be to iterate the array backwards, I think.

>+nsSHistory::RemoveEntries(nsTArray<PRUint64>& aIDs, PRInt32 aStartIndex)
>+{
>+  PRInt32 index = aStartIndex;
>+  while(index >= 0 && RemoveChildEntries(this, --index, aIDs));
>+  index = aStartIndex;
>+  while(index >= 0 && RemoveChildEntries(this, index++, aIDs));
>+}

I don't understand this method.  If aIDs has multiple IDs and we stop as soon as we find some id to remove... then how is this supposed to work?

>+nsSHistory::RemoveDynEntries(PRInt32 aOldIndex, PRInt32 aNewIndex)
>+        for (PRUint32 i = 0; i < originalDynDocShellIDs.Length(); ++i) {
>+          if (!newDynDocShellIDs.Contains(originalDynDocShellIDs[i])) {

This could conceivably get really slow.  Worth sorting the arrays before comparing, maybe?

>+nsSHistory::LoadNextPossibleEntry(PRInt32 aNewIndex, long aLoadType, PRUint32 aHistCmd)
>+    return LoadEntry(aNewIndex - 1, aLoadType, aHistCmd);
>+  } else if (aNewIndex > mIndex) {

Nix the else after return?

>@@ -1213,43 +1341,59 @@ nsSHistory::LoadEntry(PRInt32 aIndex, lo
>+      PRUint32 prevID = 0;
>+      PRUint32 nextID = 0;
>+      prevEntry->GetID(&prevID);
>+      nextEntry->GetID(&nextID);
>+      if (prevID == nextID) {
>+        return LoadNextPossibleEntry(aIndex, aLoadType, aHistCmd);

Why is this check and call here?  Please document, at least!  I have no idea what this is doing and why.

r=bzbarsky with the above issues dealt with.  I'm really really sorry this took so long.  :(
Attachment #451002 - Flags: review?(bzbarsky) → review+
I'll approve this once the nits are addressed ...
(why would this need approval?)
(In reply to comment #36)
> Comment on attachment 451002 [details] [diff] [review]
> using the new FromParser flags
> 
> >+++ b/content/base/src/nsDocument.cpp
> 
> Do we want to also set mInUnlinkOrDeletion when we're unbinding our kids in
> ResetToURI?
This should be ok.

> 
> And around the DestroyContent stuff in nsDocument::Destroy?
And definitely here.
(In reply to comment #38)
> (why would this need approval?)

(In reply to comment #39)
> It wouldn't.

Oh, right. :)

I'm not sure that we want to land this until after we freeze for Beta 4, though - feels like a pretty invasive and risky change that could benefit from nightly testing before we ship to half a million beta users. Agree?
(In reply to comment #36)

> The safe thing to do here would be to iterate the array backwards, I think.
I do this.


> >+nsSHistory::RemoveEntries(nsTArray<PRUint64>& aIDs, PRInt32 aStartIndex)
> >+{
> >+  PRInt32 index = aStartIndex;
> >+  while(index >= 0 && RemoveChildEntries(this, --index, aIDs));
> >+  index = aStartIndex;
> >+  while(index >= 0 && RemoveChildEntries(this, index++, aIDs));
> >+}
> 
> I don't understand this method.  If aIDs has multiple IDs and we stop as soon
> as we find some id to remove... then how is this supposed to work?
We loop until we find a history entry from which we can't remove any of the
IDs.
In other words, we iterate history first to back and remove all the IDs and
then
forward and remove all the entries. And in both cases we stop when
we can't find any IDs from the current session history index.

> >+nsSHistory::RemoveDynEntries(PRInt32 aOldIndex, PRInt32 aNewIndex)
> >+        for (PRUint32 i = 0; i < originalDynDocShellIDs.Length(); ++i) {
> >+          if (!newDynDocShellIDs.Contains(originalDynDocShellIDs[i])) {
> 
> This could conceivably get really slow.  Worth sorting the arrays before
> comparing, maybe?
It could be slow if there are lots of dynamically added frames.
I'll write some test and profile this a bit.
If it is a perf problem, I'll file a followup and patch.

> (In reply to comment #39)
> I'm not sure that we want to land this until after we freeze for Beta 4, though
> - feels like a pretty invasive and risky change that could benefit from nightly
> testing before we ship to half a million beta users. Agree?
I agree. I was and am going to land this asap the tree is open again *after* b4.
Attached patch pushed to m-cSplinter Review
http://hg.mozilla.org/mozilla-central/rev/353da09ea0dd
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Depends on: 588077
Hey guys, any chance this caused the new crash bug 588643?
Possible. I'd be surprised it this wouldn't cause regressions ;)
Depends on: 589207
Depends on: 588643
Did this more or less fix bug 293417?
Blocks: 293417
Depends on: 598345
Duplicate of this bug: 608382
Duplicate of this bug: 617369
Depends on: 618031
Depends on: 623829
Depends on: 632835
Depends on: 646036
Depends on: 697718
You need to log in before you can comment on or make changes to this bug.