Closed Bug 562013 Opened 14 years ago Closed 14 years ago

Consider caching the documentFragment for innerHTML, or at least cache the nodeinfo, or parse directly to the destination node

Tracking

()

Status:

RESOLVED FIXED

People

(Reporter: smaug, Assigned: smaug)

References

(Blocks 1 open bug)

Details

Attachments

(8 files, 2 obsolete files)

Something like this 14 years ago Olli Pettay [:smaug][bugs@pettay.fi] 7.49 KB, patch		Details \| Diff \| Splinter Review
a test (innerHTML = "some hml, not only plain text") 14 years ago Olli Pettay [:smaug][bugs@pettay.fi] 914 bytes, text/html		Details
a bit better patch 14 years ago Olli Pettay [:smaug][bugs@pettay.fi] 12.46 KB, patch	sicking : feedback+	Details \| Diff \| Splinter Review
Don't use document fragment at all with html5 parser 14 years ago Olli Pettay [:smaug][bugs@pettay.fi] 4.18 KB, patch	sicking : review+ hsivonen : feedback+	Details \| Diff \| Splinter Review
form testcase 14 years ago Olli Pettay [:smaug][bugs@pettay.fi] 4.18 KB, application/octet-stream		Details
er, this is the form test 14 years ago Olli Pettay [:smaug][bugs@pettay.fi] 351 bytes, text/html		Details
This breaks 14 years ago Olli Pettay [:smaug][bugs@pettay.fi] 353 bytes, text/html		Details
Make the parser not associate elements with the form pointer in the fragment mode 14 years ago Henri Sivonen (:hsivonen) 4.63 KB, patch	sicking : review+	Details \| Diff \| Splinter Review
both patches 14 years ago Olli Pettay [:smaug][bugs@pettay.fi] 8.81 KB, patch		Details \| Diff \| Splinter Review
without XML parsing error reporting problem 14 years ago Olli Pettay [:smaug][bugs@pettay.fi] 8.82 KB, patch		Details \| Diff \| Splinter Review

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Description

•

14 years ago

This shows up in the profiles. Pretty high even with .innerHTML = "some simple string"

Caching documentFragment safely can be tricky.

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Comment 1

•

14 years ago

Or actually, if there are no mutation event listeners, caching of document fragment should be doable.

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Comment 2

•

14 years ago

Attached patch Something like this — Details — Splinter Review

The patch is still a bit ugly.
There shouldn't be any need to add the extra QI.

But even this seems to cut 5% of https://bugzilla.mozilla.org/attachment.cgi?id=289857 (which tests setting plain text to innerHTML)

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Comment 3

•

14 years ago

Attached file a test (innerHTML = "some hml, not only plain text") — Details

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Comment 4

•

14 years ago

Attached patch a bit better patch — Details — Splinter Review

With the patch testcase runs faster in Gecko than in trunk webkit.

https://bugzilla.mozilla.org/attachment.cgi?id=289857 is still faster in
Webkit, but it cheats and works against the spec.

The patch could be probably a bit simpler if html5 was enabled always.

Attachment #441851 - Flags: feedback?(jonas)

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Comment 5

•

14 years ago

And we should eventually convert all the callers to use nsINode, never
nsIDOMXXX.

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 6

•

14 years ago

I'd say simplify and always create a docfragment when we're creating a document.

Though ideally I think we should parse directly into the target node, and avoid using a document fragment altogether. That should also save us a bunch of calls to BindToTree/UnbindFromTree.

Might be easier to do only for the HTML5 parser. (and fall back to a slower path when the html5 parser isn't enabled for now, until we kill the old parser)

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 7

•

14 years ago

If you do go for the parse-directly-to-the-destination path, then just fire mutation events manually after the parsing. The same way that the insert-doc-fragment code does.

Jonas Sicking (:sicking) No longer reading bugmail consistently

Updated

•

14 years ago

Attachment #441851 - Flags: feedback?(jonas) → feedback+

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Comment 8

•

14 years ago

Parsing directly to target node might be a good option for html5
but mutation event handling would be a bit ugly.
So for now using fragment could be ok.

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Comment 9

•

14 years ago

Or Henri, what do you think about parsing directly to the destination?

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Comment 10

•

14 years ago

(Btw, seems like fragment's mutation event handling doesn't dispatch
events in all the cases when it should :( )

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 11

•

14 years ago

Given that mutation events is something that should be phased out, I don't think we should architect too much of our code around their existence.

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Comment 12

•

14 years ago

which again would be in favor of using document fragment.
I'd rather not write yet another case for dispatching mutation events.
Document fragment got it already wrong.

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Comment 13

•

14 years ago

...which I'll fix.

Henri Sivonen (:hsivonen)

Comment 14

•

14 years ago

(In reply to comment #9)
> Or Henri, what do you think about parsing directly to the destination?

I already considered it. See the thread starting at
http://lists.w3.org/Archives/Public/public-html/2008Nov/0192.html

However, I decided not to try anything new in order to avoid having to emulate the right mutation events. (The form pointer and <base> also came up, but I'm not sure if those would still be actual problems.)

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 15

•

14 years ago

I think what's needed is just making sure that we don't bind controls that aren't inside a <form> to the last <form>. Instead they should be bound to any ancestor <form> of the node which we're setting .innerHTML on.

As for the mutation events, they seem like a silly reason to not do this perf optimization. We can likely share the code with the fragment-insertion code (once that's fixed). Eventually that code should go away anyway.

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 16

•

14 years ago

<base> shouldn't be a problem any more now that it's self-managed and doesn't require any content sink support.

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Comment 17

•

14 years ago

I'm not going to implement the parse-directly-to-the-destination in this bug.

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Comment 18

•

14 years ago

hmm, or let me re-think. The current setup is nicer, but slower...

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Updated

•

14 years ago

Depends on: 562652

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Comment 19

•

14 years ago

Attached patch Don't use document fragment at all with html5 parser — Details — Splinter Review

This depends on bug 562652.
Needs still some testing.

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Comment 20

•

14 years ago

Comment on attachment 442401 [details] [diff] [review]
Don't use document fragment at all with html5 parser

Henri, would this be all wrong?

Attachment #442401 - Flags: feedback?(hsivonen)

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 21

•

14 years ago

Yay!!

This doesn't deal with ensuring that orphaned <input>s end up in the correct <form>, does it? I.e. for markup like

<form id=A>
  <div id=target></div>
</form>

$("target").innerHTML = "<p><form id=B></p><input id=orphan>"
is($("orphan").form, $("A"))

should pass

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Comment 22

•

14 years ago

Attached file form testcase (obsolete) — Details

HTML5 parser changes behavior in testcase anyway.
The patch in this bug keeps the existing html5 parser behavior.

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Updated

•

14 years ago

Attachment #442448 - Attachment is obsolete: true

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Comment 23

•

14 years ago

Attached file er, this is the form test — Details

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 24

•

14 years ago

Oh, the <form> closes the <p> in the HTML5 parser. The test needs to be

<form id=A>
  <div id=target></div>
</form>

$("target").innerHTML = "<div><form id=B></div><input id=orphan>"
is($("orphan").form, $("A"))


Or possibly even

$("target").innerHTML = "<table><tr><td><form id=B></table><input id=orphan>"

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Comment 25

•

14 years ago

Attached file This breaks — Details

I have no idea how to fix this case. I'd rather not hack the parser.

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Comment 26

•

14 years ago

Or, I don't understand  why the parser or something else does what it does.

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 27

•

14 years ago

What is needed is to set a flag in the parser not perform the steps here:

http://mxr.mozilla.org/mozilla-central/source/parser/html/nsHtml5TreeOperation.cpp#453

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Comment 28

•

14 years ago

Henri, could you perhaps tell how to do that? Or perhaps even provide the patch.

Henri Sivonen (:hsivonen)

Comment 29

•

14 years ago

Comment on attachment 442401 [details] [diff] [review]
Don't use document fragment at all with html5 parser

>   // Remove childnodes
>   nsContentUtils::SetNodeTextContent(this, EmptyString(), PR_FALSE);
[...]
>+    PRInt32 oldChildCount = GetChildCount();

Isn't oldChildCount always zero here? Or more to the point, if it isn't zero, isn't that an at least assertion-worthy problem?

>   if (scripts_enabled) {
>     // If we disabled scripts, re-enable them now that we're
>     // done. Don't fire JS timeouts when enabling the context here.
> 
>     loader->SetEnabled(PR_TRUE);
>   }

Is this script disabling and re-enabling for <script> elements or also for something else? The HTML5 parser in the fragment mode already takes care of marking <script>s already executed.

feedback+ suggesting those points to be reconsidered.

(In reply to comment #28)
> Henri, could you perhaps tell how to do that? Or perhaps even provide the
> patch.

I can provide a patch if we first figure out if we want to make the parser not associate elements with a form ever in the fragment case (including createContextualFragment on ranges) or if we want to special-case the innerHTML setter only. Do we care about supporting the case where someone calls createContextualFragment with a malformed form and proceeds to submit the form directly from the fragment before inserting it elsewhere first?

Attachment #442401 - Flags: feedback?(hsivonen) → feedback+

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 30

•

14 years ago

> I can provide a patch if we first figure out if we want to make the parser not
> associate elements with a form ever in the fragment case (including
> innerHTML createContextualFragment on ranges) or if we want to special-case
> the setter only. Do we care about supporting the case where someone calls
> createContextualFragment with a malformed form and proceeds to submit the form
> directly from the fragment before inserting it elsewhere first?

I don't think it's terribly important if we special case .innerHTML, or if we do it for all fragment parsing. Simplicity seems like the more important goal. Until proven otherwise, I think it's safe to guess that the web doesn't depend on any particular behavior for createContextualFragment.

However all else being equal, I would say that the explicit form-binding thing is fairly evil, and we should limit it as much as possible.

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Comment 31

•

14 years ago

(In reply to comment #29)
> Isn't oldChildCount always zero here? Or more to the point, if it isn't zero,
> isn't that an at least assertion-worthy problem?

Because of mutation events, oldChildCount may not be zero and IMO
there isn't anything to assert because we handle the case.

I don't change scriptloader handling in the patch, but if there is something
to change, I could file a followup bug.





> 
> >   if (scripts_enabled) {
> >     // If we disabled scripts, re-enable them now that we're
> >     // done. Don't fire JS timeouts when enabling the context here.
> > 
> >     loader->SetEnabled(PR_TRUE);
> >   }
> 
> Is this script disabling and re-enabling for <script> elements or also for
> something else? The HTML5 parser in the fragment mode already takes care of
> marking <script>s already executed.
> 
> feedback+ suggesting those points to be reconsidered.
> 
> (In reply to comment #28)
> > Henri, could you perhaps tell how to do that? Or perhaps even provide the
> > patch.
> 
> I can provide a patch if we first figure out if we want to make the parser not
> associate elements with a form ever in the fragment case (including
> createContextualFragment on ranges) or if we want to special-case the innerHTML
> setter only. Do we care about supporting the case where someone calls
> createContextualFragment with a malformed form and proceeds to submit the form
> directly from the fragment before inserting it elsewhere first?

Henri Sivonen (:hsivonen)

Comment 32

•

14 years ago

Attached patch Make the parser not associate elements with the form pointer in the fragment mode — Details — Splinter Review

(In reply to comment #30)
> > I can provide a patch if we first figure out if we want to make the parser not
> > associate elements with a form ever in the fragment case (including
> > innerHTML createContextualFragment on ranges) or if we want to special-case
> > the setter only. Do we care about supporting the case where someone calls
> > createContextualFragment with a malformed form and proceeds to submit the form
> > directly from the fragment before inserting it elsewhere first?
> 
> I don't think it's terribly important if we special case .innerHTML, or if we
> do it for all fragment parsing. Simplicity seems like the more important goal.
> Until proven otherwise, I think it's safe to guess that the web doesn't depend
> on any particular behavior for createContextualFragment.
> 
> However all else being equal, I would say that the explicit form-binding thing
> is fairly evil, and we should limit it as much as possible.

OK. I've attached a patch that makes the fragment case in the parser core not associate elements with the form pointer. (I'm assuming that we'll request this change in the spec, so I've put the change in the cross-language part of the parser instead of putting it into the Gecko integration layer.)

(In reply to comment #31)
> (In reply to comment #29)
> > Isn't oldChildCount always zero here? Or more to the point, if it isn't zero,
> > isn't that an at least assertion-worthy problem?
> 
> Because of mutation events, oldChildCount may not be zero and IMO
> there isn't anything to assert because we handle the case.

OK.

> I don't change scriptloader handling in the patch, but if there is something
> to change, I could file a followup bug.

I think it's worthwhile to investigate, as a follow-up, if the scriptloader part can be optimized away.

Attachment #443061 - Flags: review?(jonas)

Henri Sivonen (:hsivonen)

Comment 33

•

14 years ago

The patch doesn't handle Document.innerHTML = "foo" case, since that case doesn't have a context node, but then this bug isn't about optimizing that case anyway.

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Comment 34

•

14 years ago

There is document.innerHTML? Gecko doesn't support that.
I'll file a new bug for that.

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Updated

•

14 years ago

Depends on: 563322

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Updated

•

14 years ago

Attachment #442401 - Flags: review?(jonas)

Jonas Sicking (:sicking) No longer reading bugmail consistently

Updated

•

14 years ago

Attachment #443061 - Flags: review?(jonas) → review+

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Updated

•

14 years ago

Assignee: nobody → Olli.Pettay

Summary: Consider caching the documentFragment for innerHTML, or at least cache the nodeinfo → Consider caching the documentFragment for innerHTML, or at least cache the nodeinfo, or parse directly to the destination node

Jonas Sicking (:sicking) No longer reading bugmail consistently

Updated

•

14 years ago

Attachment #442401 - Flags: review?(jonas) → review+

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Comment 35

•

14 years ago

Attached patch both patches (obsolete) — Details — Splinter Review

I'll push this once it has passed tests on tryserver

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Comment 36

•

14 years ago

Attached patch without XML parsing error reporting problem — Details — Splinter Review

Attachment #443101 - Attachment is obsolete: true

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Comment 37

•

14 years ago

http://hg.mozilla.org/mozilla-central/rev/d660bc26395e

Status: NEW → RESOLVED

Closed: 14 years ago

Resolution: --- → FIXED

Mike Shaver (:shaver -- probably not reading bugmail closely)

Comment 38

•

14 years ago

This is great stuff -- will it show up on any of our existing benchmarks, or should we add something to track?

Olli Pettay [:smaug][bugs@pettay.fi]

Assignee

Comment 39

•

14 years ago

Dromaeo tests some innerHTML, though just activating html5 should have given
significant speed up there.

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 40

•

13 years ago

For what it's worth, mutation events should no longer be a concern for parsing directly to the destination. As long as a update batch spans the whole function, all mutation events will be delayed until that update batch is ended.

Nobody; OK to take it and work on it

Updated

•

5 years ago

Component: DOM → DOM: Core & HTML

You need to log in before you can comment on or make changes to this bug.