Closed Bug 599584 Opened 9 years ago Closed 9 years ago

Snapple.com pages keep reloading / refreshing due to meta refresh in innerHTML content

Categories

(Core :: HTML: Parser, defect, P2, critical)

defect

Tracking

()

VERIFIED FIXED
mozilla2.0b7
Tracking Status
blocking2.0 --- betaN+

People

(Reporter: stephend, Assigned: hsivonen)

References

()

Details

(Keywords: regression)

Attachments

(2 files)

Build: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0b7pre) Gecko/20100924 Firefox/4.0b7pre

Summary: Snapple.com pages keep reloading / refreshing (HTTP 301)

STR:

1. Load http://www.snapple.com/products/#/drink-type/diet/ using Minefield
2. Click on any drink

Expected:

Page loads

Actual:

Page continually reloads/refreshes/redirects
Attached file NSPR/HTTP log
Is this a regression?  If so, from when?
(In reply to comment #2)
> Is this a regression?  If so, from when?

Happens as early as Firefox 4 beta 1:  Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:2.0b1) Gecko/20100630 Firefox/4.0b1
OK, what about 3.6?
3.6.10 is fine on Linux, Windows, Mac.
Ok, do you have time to find a regression range?  Or should I?
(In reply to comment #6)
> Ok, do you have time to find a regression range?  Or should I?

I'm heading out for most of the rest of the day, and Monday's pretty booked, so if you have the time, please do -- I suspect, of course, Electrolosys, but that's a huge place to start looking.
Regression range:

http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=83c887dff0da&tochange=d6bb0f9e9519

Disabling the HTML5 parser fixes the problem on trunk.
blocking2.0: --- → ?
Component: Networking → HTML: Parser
QA Contact: networking → parser
Oh, and a URI that shows the problem directly:

  http://www.snapple.com/products/#/diet-peach-green-tea

This looks like the problem stack to me:

#0  nsDocShell::RefreshURI (this=0x1218ba930, aURI=0x126ea9010, aDelay=0, aRepeat=0, aMetaRefresh=1) at /Users/bzbarsky/mozilla/vanilla/mozilla/docshell/base/nsDocShell.cpp:5142
#1  0x0000000100fa14a3 in nsDocShell::SetupRefreshURIFromHeader (this=0x1218ba930, aBaseURI=0x126c3f680, aHeader=@0x7fff5fbfa620) at /Users/bzbarsky/mozilla/vanilla/mozilla/docshell/base/nsDocShell.cpp:5515
#2  0x0000000100632c39 in nsDocument::SetHeaderData (this=0x106b22600, aHeaderField=0x10650c2b0, aData=@0x7fff5fbfa970) at /Users/bzbarsky/mozilla/vanilla/mozilla/content/base/src/nsDocument.cpp:3222
#3  0x00000001005d71f9 in nsContentSink::ProcessHeaderData (this=0x126cb6ab0, aHeader=0x10650c2b0, aValue=@0x7fff5fbfa970, aContent=0x126ea8fa0) at /Users/bzbarsky/mozilla/vanilla/mozilla/content/base/src/nsContentSink.cpp:485
#4  0x00000001005d79c6 in nsContentSink::ProcessMETATag (this=0x126cb6ab0, aContent=0x126ea8fa0) at /Users/bzbarsky/mozilla/vanilla/mozilla/content/base/src/nsContentSink.cpp:841
#5  0x0000000100b0da61 in nsHtml5TreeOperation::Perform (this=0x12be9b1e8, aBuilder=0x126cb6ab0, aScriptElement=0x7fff5fbfb098) at /Users/bzbarsky/mozilla/vanilla/mozilla/parser/html/nsHtml5TreeOperation.cpp:680
#6  0x0000000100b11062 in nsHtml5TreeOpExecutor::FlushDocumentWrite (this=0x126cb6ab0) at /Users/bzbarsky/mozilla/vanilla/mozilla/parser/html/nsHtml5TreeOpExecutor.cpp:590
#7  0x0000000100ab7013 in nsHtml5Parser::ParseFragment (this=0x126cb69f0, aSourceBuffer=@0x7fff5fbfb2f0, aTargetNode=0x126a72360, aContextLocalName=0x106506490, aContextNamespace=3, aQuirks=0) at /Users/bzbarsky/mozilla/vanilla/mozilla/parser/html/nsHtml5Parser.cpp:477
#8  0x000000010076e0e9 in nsGenericHTMLElement::SetInnerHTML (this=0x126a72360, aInnerHTML=@0x7fff5fbfb2f0) at /Users/bzbarsky/mozilla/vanilla/mozilla/content/html/content/src/nsGenericHTMLElement.cpp:749

We probably shouldn't be doing that from under the innerHTML setter.
Unless there's a pending document.write that actually flushes out the meta tag or something?  In any case, this stack keeps being hit over and over.
The string passed to the innerHTML setter seems to start with:

  "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtm"

and in particular comes from an XHR result...  and does include a meta refresh.

Just breakpointing on SetInnerHTML and then dumping the JS stack when it's hit will show the full string, since it's an argument to one of the jquery methods on the stack.
Summary: Snapple.com pages keep reloading / refreshing (HTTP 301) → Snapple.com pages keep reloading / refreshing due to meta refresh in innerHTML content
Priority: -- → P2
Assignee: nobody → hsivonen
blocking2.0: ? → betaN+
AFAICT, not showing *any* <meta>s to nsContentSink is the fix that replicates the old parser most closely.

Filed bug 603306, because I suspect <link> handling is similarly wrong.
Attachment #482222 - Flags: review?(bzbarsky)
Status: NEW → ASSIGNED
> AFAICT, not showing *any* <meta>s to nsContentSink is the fix that replicates
> the old parser most closely.

Yes.

Does this need a spec change?
Comment on attachment 482222 [details] [diff] [review]
Don't show <meta>s to nsContentSink when parsing a fragment

Fix the {BUGNUMBER} thing in the <a>?

I wish we had a way to not wait for 1500 ms here, but I'm not seeing one.  :(
Attachment #482222 - Flags: review?(bzbarsky) → review+
(In reply to comment #13)
> > AFAICT, not showing *any* <meta>s to nsContentSink is the fix that replicates
> > the old parser most closely.
> 
> Yes.
> 
> Does this need a spec change?

Yes. Filed http://www.w3.org/Bugs/Public/show_bug.cgi?id=11013

(In reply to comment #14)
> Fix the {BUGNUMBER} thing in the <a>?

OK. Thanks.
This is a pretty unfortunate change. I'd really like to move the refresh logic out of the parser and into the DOM, like we've done with many other things.

Do we know of any other sites that are breaking over this? If this is the only one I'd prefer to contact them and let them know how to fix it and when we're shipping.
(In reply to comment #16)
> This is a pretty unfortunate change. I'd really like to move the refresh logic
> out of the parser and into the DOM, like we've done with many other things.
> 
> Do we know of any other sites that are breaking over this? If this is the only
> one I'd prefer to contact them and let them know how to fix it and when we're
> shipping.

Is that really a worthwhile endeavor when WebKit and IE8 (didn't test 9) don't seem to put the Refresh logic on the DOM side, either:
http://software.hixie.ch/utilities/js/live-dom-viewer/saved/666

(I don't know of other sites at this time.)
Yes, I do think that moving things from the parser to the DOM is a goal worth pursuing. Not only does that give us for free an API to both schedule new autorefreshes as well as cancel existing ones, it also means we only have to have one implementation (the DOM), rather than the 3 or so that we have now in the various sinks.
I landed this anyway, since this was a reviewed blocker assigned to me and the state m-c was in was clearly bogus:
http://hg.mozilla.org/mozilla-central/rev/e8c3b085a9c8

I think the move to the DOM side merits a separate bug, though I think it won't fly, since the behavior would be incompatible with Firefox 3.6, WebKit and IE.
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla2.0b8
Verified FIXED using Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0b8pre) Gecko/20101013 Firefox/4.0b8pre.

Thanks, all!
Status: RESOLVED → VERIFIED
Target Milestone: mozilla2.0b8 → mozilla2.0b7
You need to log in before you can comment on or make changes to this bug.