The document encoder needs to be more flexible




18 years ago
18 years ago


(Reporter: adamlock, Assigned: adamlock)



Windows NT

Firefox Tracking Flags

(Not tracked)



(2 attachments)



18 years ago
I'm trying to clone an HTML document.

It should be possible to do this by querying the document for the nsIDOMNode 
interface and calling cloneNode() method on it. The implementation of cloneNode 
in nsHTMLDocument does nothing at all except return NS_OK.


18 years ago
Blocks: 46574

Comment 1

18 years ago
Adding the embed keyword.  This bug is deemed important by Jud's team.
Keywords: embed

Comment 2

18 years ago
adam, can you ellaborate on the context in which this is necessary? I believe
the issue is that you're trying to recreate a document for save as and printing;

Comment 3

18 years ago
I am writing code to save a DOM document and all attached elements (images, css, 
js) etc. to local storage. The persistence stuff works fine, but I need to fix 
up the document before it is saved so that the image, js & css links all point 
to the local files. This means walking through the DOM, fixing up the elements 
as I go. I want to do the fixup to a copy of the document that I can throw away 
when I'm done with it so that the original is left unchanged.

HTML documents expose the nsIDOMNode interface so calling cloneNode() should be 
the proper way to make a copy. The implementation in HTML document class calls 
through to it's base class and this does nothing at all. All other HTML elements 
also need to implement this method but I haven't got as far as discovering 
whether they work or not.

HTML document's implementation of cloneNode.

And the base class implementation:

Adding a few more keywords
Keywords: api, correctness
Ok, well, I'd like to propose a different approach to writing code that does
what you want. Cloning a docuemnt is more complicated that one would expect
since the document holds on to so many external things that also need to either
be cloned or we'd need to figure out what to do with things that we can't really
clone (say document observers, presshells, event listeners...), also, if we do
implement cloneNode() on the document the code will be hard to maintain since
any time you add a member variable to a document you need to make sure the
cloneNode() method does the right thing with the new member...

The solution I'd like to suggest is that you would in stead of cloning the
document only to modify a few attributes before you serialize it to a file you
should write your own subclass of one of the content serializers (or possibly
add a flag to the existing ones) in mozilla, then you could simply check in the
serializer when you're writing out the value of a HREF, SRC, ... attribute if it
needs changing and do the change for the output only and leave the in-memory
document as is. This will be a lot faster, less bloaty and it'll probably be
less code too, all you need to do is to write a class that overrides a few
methods in nsHTMLContentSerializer (in layout/base/src) and hook that up to the
document encoder (layout/base/src/nsDocumentEncoder.{cpp,h}), or you add a flag
for this...

Would that solve your problem in an acceptable way?

Comment 5

18 years ago
I'll try it the other way though I still believe cloneNode would be the correct
way assuming we knew what we meant by it...

Perhaps I should even be thinking of changing the document encoder because this
is what feeds the elements into the serializer in the first place. If I could
register a callback with this class, it could ask the callback to fixup an
element before passing it to the serializer.

Comment 6

18 years ago
To elaborate on my last point, if I added a new method to nsIDocumentEncoder, 
such as SetNodeFixupCallback, I could have the encoder test for this during the 
when it's walking the DOM with calls to nsDocumentEncoder::SerializeNodeStart. 
If it the callback were set, then SerializeNodeStart would call a method on it, 
e.g. OnFixupNode() which could if it wanted return a new node that would be 
serialized instead of the original. This approach would still require that 
cloneNode worked (and in the way the spec says, copying attribute nodes as well 
even on a non-deep copy), but I wouldn't need cloning to work on the document.

Once the callback code was in place, I would have the relatively trivial task of  
writing my own version of nsDocument::SaveFile which set up the callback. 

Does this sound reasonable?

Comment 7

18 years ago
Created attachment 19069 [details] [diff] [review]
Sample patch for 'hooking' the encoder serialization mechanism with my fixed up nodes

Comment 8

18 years ago
The last patch is how I intend to get the functionality from the encoder that I 

I've created a new nsIDocumentEncoderNodeFixup interface that I can implement 
in my own code and pass to the document encoder via a new method 
nsIDocumentEncoder::SetNodeFixup(). The node serialization code has been changed 
so that if there is a fixup object, it 
calls nsIDocumentEncoderNodeFixup::FixupNode() method during serialization and 
uses whatever node comes out of it. It defaults to the old behaviour if there is 
no fixup object or if the call to FixupNode returns nsnull.

Comment 9

18 years ago
Created attachment 19587 [details] [diff] [review]
Slightly updated version of the patch

Comment 10

18 years ago
Last patch is submitted for review. Johnny, can you do the module owner's review 
I'm not sure I'm the module owner for the document encoder, in fact I don't know
if there is a dedicated module owner for that code but the editor team delas
with that code so I guess they are the owner of that code. You should probably
run the changes by jfrancis.

As for the general idea I like your approach a lot more than having to clone the
whole document only to modify a small number of the attributes in the document,
and your solution alows for pluggable nsIDocumentEncoderNodeFixup thingies,
flexibility is good. I think a more optimal way to achieve the same result is
possible without even having to clone nodes when doing the fixup but I don't see
an easy way to do that in the current document encoder code.

r=jst for the second patch, and since you're all over this I'm reassigning this
bug to you (and changing the summary), if there should be a separate bug on
cloneNode() not working on HTML document lets open up a new one...
Assignee: jst → adamlock
Summary: DOM node method cloneNode() does not work on HTML document nodes → The document encoder needs to be more flexible

Comment 12

18 years ago
Adding Joe Francis to the cc: list. I'd like to get his r= before putting in my sr=.

Comment 13

18 years ago
sr=vidur. Fix checked in. Comments are still welcomed from Joe.
Last Resolved: 18 years ago
Resolution: --- → FIXED

Comment 14

18 years ago
sorry for delay, i was away on vacation.  this all seems orthogonal to any editor 
usage of the encoders, so i don't have anything to add.  thanks for cc'ing me 


18 years ago
Keywords: api
You need to log in before you can comment on or make changes to this bug.