Closed Bug 585819 Opened 14 years ago Closed 14 years ago

Using createContextualFragment can cause Mozilla to create a new <body> inside the existing <body>

Categories

(Core :: DOM: Core & HTML, defect, P2)

defect

Tracking

()

RESOLVED FIXED
Tracking Status
blocking2.0 --- betaN+

People

(Reporter: linda167, Assigned: hsivonen)

Details

Attachments

(2 files, 1 obsolete file)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/533.4 (KHTML, like Gecko) Chrome/5.0.375.125 Safari/533.4
Build Identifier: 4.0b2

This is a regression in behavior in Firefox 4. Using the specific javascript calls in the attached repro page will produce another body in the existing body. 

Specifically, the line "oRange.setStartBefore(oBody);" seems to be the cause of this behavior. Although this may not be the best call to make in this situation, it should still not create another body element in Firefox. This is a regression in behavior, as Firefox 3.6.8 simply appends the element inside the existing body.

In a larger site, the existence of another body is causing the entire site to break.

Reproducible: Always

Steps to Reproduce:
Open attached repro page
1. Click on "Insert Document Fragment"
2. Examine the DOM
Actual Results:  
A new body is created inside the existing body. This causes the entire site to break.

Expected Results:  
No new body is created. The "Hello World" element should be appended in the existing body.
Attached file Repro page
Does turning off the HTML5 parser change the behavior?  From what I can tell, the HTML5 spec basically requires what you observe....
Could bug 512639 be this bug? See <http://code.google.com/p/fbug/issues/detail?id=2260#c10>. (Also, technically, the HTML5 spec doesn't require anything about createContextualFragment...)
createContextualFragment and innerHTML have always used the same algorithm (and code), and HTML5 defines the latter.  Furthermore, createContextualFragment and HTML paste use the same algorithm (by definition; that's the whole point of createContextualFragment).  HTML5 does specify paste; see section 10.4 of the specification.

No idea about bug 512639.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Additional Note:
This bug is causing Outlook Web Access 2010 to break completely after receiving a new mail notification.
blocking2.0: --- → ?
Chris, do we have contacts with the OWA folks?
Boris, the reporter is an OWA person.
Ah, that's not obvious from either the address or name or anything that was said in the bug...

The point is, this will likely need to be fixed in OWA, as far as I can tell.  We're certainly not planning to back out the HTML5 parser, and OWA was depending on a bug in the old parser which we fixed.
We've confirmed the behavior is correct?
It is as far as I can tell, at least on the testcase: the testcase sets the context point to be a child of the <html> tag, parses a string in that context (which creates a <body> around the string, just like it would if you just stuck the string into an <html> in a static HTML document, which is what createContextualFragment is supposed to do), then takes the resulting DOM fragment (now rooted at the new <body> and inserts it as a child of the <body>.  I have no idea why this used to behave differently with the old parser; if it did, the old parser was clearly buggy.

In general, if the page plans to insert the fragment as a child of the <body>, it should also put the context point to be inside the <body>...

I did ask Henri to also double-check that there isn't anything I'm missing here, of course.
Henri, please either confirm that this is invalid (seems likely), or if not, fix it for betaN.
Assignee: nobody → hsivonen
blocking2.0: ? → betaN+
Yes, this is the same issue as bug 512639.

createContextualFragment isn't part of any spec. (It's an early Geckoism created for Composer that was simply exposed to the Web and got cloned by other browsers.) However, the intended relationship of createContextualFragment and the HTML fragment parsing algorithm is unambiguous (when the context node is an HTML element--if it's from another namespace, things aren't quite so clear).

As Boris said, here the script makes the root element ("html") be the context node (since oRange.setStartBefore(oBody); sets the parent of oRange to the parent of oBody). Once the context node is "html", the rest of the behavior follows from the HTML fragment parsing algorithm as defined in HTML5.

So the behavior here is correct per spec.

Now, the question is if the spec needs changing. The problem is that making any change here would make the fragment parsing behave illogically when the context node is "html". Currently, fragment parsing with "html" as the context behaves the same way as the tree builder behaves when a full document is parsed as the "html" node has been pushed onto the tree builder stack. So special-casing "html" context to behave like a "body" context in the fragment parsing algorithm itself would break the innerHTML setter on the root element. So if a hack were placed somewhere, a better place would be making createContextualFragment special-case "html" as the context node. However, it would be confusing for createContextualFragment not to have the same context node behavior as the innerHTML setter.

If at all feasible, I think it would be better for the script to be fixed in OWA to set up oRange such that oBody becomes its parent than to enshrine some approximation of the quirk of Gecko's old fragment sink as a special case in createContextualFragment forever. Does Microsoft have the ability to push such a patch to OWA customers?
Safari 5.0.1 and the latest WebKit nightly don't create an implicit body when trying the test case. Opera 10.61 creates another body. IE9 PP doesn't support creating the range.

CCing abarth in the hope of discovering whether the WebKit behavior is deliberate for compat or an accident like Gecko's old behavior.

Google Code Search suggests that createContextualFragment is very rarely used outside the Prototype framework. I'm guessing that Prototype isn't relying on the old bug, because we haven't gotten Prototype-related bugs traceable to the HTML5 parser.
> CCing abarth in the hope of discovering whether the WebKit behavior is
> deliberate for compat or an accident like Gecko's old behavior.

createContextualFragment is sadness.  AFAIK, there's no spec for what it should do.  We tried to preserve the existing semantics as much as possible.  It does its madness by performing surgery on the DOM it gets back from the parser.  :(
(In reply to comment #14)
> > CCing abarth in the hope of discovering whether the WebKit behavior is
> > deliberate for compat or an accident like Gecko's old behavior.
> 
> createContextualFragment is sadness.  AFAIK, there's no spec for what it should
> do.  We tried to preserve the existing semantics as much as possible.  It does
> its madness by performing surgery on the DOM it gets back from the parser.  :(

Is there surgery on cases other than "html" as the context node? I only noticed deliberate failures when using a void element as the context node.
These are good questions for Eric.
I found http://trac.webkit.org/changeset/65845 and http://trac.webkit.org/changeset/65372

Both look like more complex ways of achieving the outcome of using "body" as the context for the fragment parsing algorithm when "html" is passed as the context for createContextualFragment.

I think the next step is to write a patch that does that special-casing in Gecko, since this quirk has been exposed to the Web for a long time and OWA might not be the only app depending on it.
Status: NEW → ASSIGNED
Priority: -- → P2
OS: Windows 7 → All
Hardware: x86_64 → All
Version: unspecified → Trunk
Attached patch Approximate the old quirk (obsolete) — Splinter Review
Attachment #469861 - Flags: review?(bzbarsky)
Comment on attachment 469861 [details] [diff] [review]
Approximate the old quirk, with typo fixed in commit message

I'd prefer using IsHTML() to doing the namespace check.

And maybe we should consider an IsHTML() signature taking an atom (to mean "is this an HTML _____?").  Separate bug for that, though.

r=bzbarsky with IsHTML used.
Attachment #469861 - Flags: review?(bzbarsky) → review+
(In reply to comment #20)
> Comment on attachment 469861 [details] [diff] [review]
> Approximate the old quirk, with typo fixed in commit message
> 
> I'd prefer using IsHTML() to doing the namespace check.

Why? The HTMLness of the Document has already been established (http://mxr-test.konigsberg.mozilla.org/mozilla-central/source/content/base/src/nsContentUtils.cpp#3864) if execution reaches the namespace check.
(In reply to comment #21)
> Why? The HTMLness of the Document has already been established
> (http://mxr-test.konigsberg.mozilla.org/mozilla-central/source/content/base/src/nsContentUtils.cpp#3864)
> if execution reaches the namespace check.

Oops. Different IsHTML(). I'll fix.
Blocks: 588462
Pushed http://hg.mozilla.org/mozilla-central/rev/f40c03aed78f

Filed bug 593762 about nsIContent::IsHTML(nsIAtom* aLocalName)
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
No longer blocks: 588462
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: