Change the way output is constructed in Transformiix

VERIFIED FIXED in mozilla0.9.9



17 years ago
10 years ago


(Reporter: peterv, Assigned: peterv)


(Blocks: 1 bug, {perf})

Dependency tree / graph

Firefox Tracking Flags

(Not tracked)



(1 attachment, 7 obsolete attachments)



17 years ago
I have rewritten the way the output is constructed in Transformiix. As part of
the rewrite I'm fixing bug 94921, bug 59912, bug 53030 and bug 88198. I'll try
to attach the patch tomorrow, so the reviews can begin. I'm trying to get this
into 0.9.4.


17 years ago
Depends on: 53030, 59912, 88198, 94921


17 years ago
OS: Mac System 8.5 → All
Priority: -- → P1
Target Milestone: --- → mozilla0.9.4


17 years ago
No longer depends on: 53030, 59912, 88198, 94921


17 years ago
Blocks: 53030, 59912, 88198, 94921

Comment 1

17 years ago
Good work Peter, thanks very much :) -mike


17 years ago
Blocks: 53944

Comment 2

17 years ago
Moving out :(.
Target Milestone: mozilla0.9.4 → mozilla0.9.5

Comment 3

17 years ago
Created attachment 48419 [details] [diff] [review]
This is not a black hole

Comment 4

17 years ago
Don't try or review this patch, it's just work-in-progress, to show that there
is (some) progress.


17 years ago
Blocks: 99032

Comment 5

16 years ago
All these missed the bus/train/plane/boat/whatever. Sad.
Target Milestone: mozilla0.9.5 → mozilla0.9.6

Comment 6

16 years ago
based on the experience in bug 103659, do we need the output directory?
I don't see the txAtomList.* in base, either. Those are XSLT special, so they
should go into source/xslt, imho. Maybe this should be txXSLTAtoms, even?
See bug 76070, the naming should be consistent.


Comment 7

16 years ago
Created attachment 56708 [details] [diff] [review]
Incomplete patch

Comment 8

16 years ago
Patch still misses a few files, but this part should be reviewable. I'm going to
move this and dependent bugs to 0.9.7. No way this will be ready and reviewed
for 0.9.6. It'll probably go in soon after 0.9.7 opens.

Comment 9

16 years ago
+            PRInt32 nsID = node->getNamespaceID();
+            if (!nsID == kNameSpaceID_XSLT)
nsID is typedef'd in nsID.h.
nsID != kNameSpaceID_XSLT.

- * Returns the namespace URI for the given namespace prefix, this method should
+ * Returns the namespace ID for the given namespace prefix, this method should
you don't.

+            if (XMLDOMUtils::getNamespaceURI(aPrefix, (Element*)node,

+        if (element->getAttr(txXSLTAtoms::defaultSpace, kNameSpaceID_XSLT,
default-space isn't part of the spec (at least I can't find "default-space"),
and if, it wouldn't be in the XSLT namespace.
(and it could use a newline)
My vote is pruning it.

+                    if (element->getAttr(txXSLTAtoms::omitXmlDeclaration,
wrong namespace.

    if (!xslType)
        return XSLType::LITERAL;
that should error, unknown element in XSLT namespace.

+        const String& nodeName = aXsltAction->getNodeName();
you only need that for LITERALs

more to come (I guess), I'm off to the movies.

Comment 10

16 years ago
The patch for 96647 is nearly done, but it's a bit big and we need time for
reviews, unless something serious comes up this should land early in 0.9.7.
Target Milestone: mozilla0.9.6 → mozilla0.9.7
First of all, this is a VERY nice architectural change. Thanks for doing it!

Since you're fixing it up anyway, could you make XMLDOMUtils::getNodeValue use 
the same code for attributes, textnodes, comments and PIs. Simply append the 

+PRInt32 Document::namespaceURIToID(const String& aNamespaceURI)
+    PRInt32 nsID = kNameSpaceID_Unknown;
+    if (nsNSManager)
+        nsNSManager->GetNameSpaceID(aNamespaceURI.getConstNSString(), nsID);
+    return nsID;

should you use nsNSManager->RegisterNameSpace?

+    if (!aPrefix || aPrefix == txXMLAtoms::_empty)
+        aPrefix = txXMLAtoms::xmlns;

why do you need to do this?

-nsSyncLoader::LoadDocument(nsIURI* documentURI, nsIDocument *aLoader, 
nsIDOMDocument **_retval)
+nsSyncLoader::LoadDocument(nsIURI* documentURI, nsIDocument *aLoader, 
nsIDOMDocument **aResult)

could we keep _retval? result isn't really an argument and the rest of moz uses 

+    if ((mMethod == eMethodNotSet) &&
+        (aOutputFormat.mMethod != eMethodNotSet))
+        mMethod = aOutputFormat.mMethod;

You don't actually have to check (aOutputFormat.mMethod != eMethodNotSet)

+    // Merges in the values of aOutputFormat, members that have no
+    // value in aOutputFormat will not be changed.
+    void merge(txOutputFormat& aOutputFormat);

Could you mention that values that are set in *this* format will not be changed

+    while ((qName = (txQualifiedName*) {
+        mCDATASectionElements.add(qName);
+        iter.remove();
+    }

mmmmm, I like that loop :). But IMHO you could let the txList destructor take 
care of the deletion, that should also save a few cycles.

+ enum txOutputMethod { eMethodNotSet = 0, eXMLOutput, eHTMLOutput, 
eTextOutput };

Pike is gonna get you for the length of that one ;-)
More comments:

+enum txThreeState { eNotSet = -1, eFalse = MB_FALSE, eTrue = MB_TRUE };

are you sure that PR_TRUE isn't defined as -1 on any platform?

+                        String namePart;
+                        XMLUtils::getPrefix(token, namePart);
+                        txAtom* nameAtom = TX_GET_ATOM(namePart);
+                        PRInt32 nsID = element->lookupNamespaceID(nameAtom);

unprefixed means nullnamespace, not default namespace.

+   format.mOmitXMLDeclaration = attValue.isEqual(YES_VALUE) ? eTrue : eFalse;

IMHO we should get rid of YES_VALUE (and a lot of other static strings), would 
using "yes" be ok?

-                    //-- unknown
+                {
+                    // unknown
+               }

please fix indentation of the last '}'

+    // Create wrapper for the source document.
+    nsCOMPtr<nsIDOMDocument> sourceDOMDocument;
+    aSourceDOM->GetOwnerDocument(getter_AddRefs(sourceDOMDocument));
+    if (!sourceDOMDocument)
+        sourceDOMDocument = do_QueryInterface(aSourceDOM);

That shouldn't work if aSourceDOM is the document node. So I don't understand 
how it can work now??

In XSLTProcessor::initializeHandlers, shouldn't you merge the output formats 
before using |aPs->getOutputFormat()->mMethod|?

In XSLTProcessor::TransformDocument, are you sure that ps will be deleted 
before sourceDocument, xslDocument and resultDocument? If it's not, bad things 
will happen.

more to come :)
+ txAtom* nameAtom = TX_GET_ATOM(name);
+ if (nameAtom == txXMLAtoms::xmlns) {

Isn't |name.IsEqual("xmlns")| faster?

+ if ((prefixAtom != txXMLAtoms::_empty) && (prefixAtom != txXMLAtoms::xmlns))
+     resultNsID = actionElement->lookupNamespaceID(prefixAtom);

This'll create attributes with prefix "xmlns" in the null-namespace.

Could you clean up the handling of text/cdata nodes in the stylesheet, we don't 
really need that |String textValue| IMHO

Do we really need ::cdata and ::entityReference in the output handlers?
in XSLTProcessor::processVariable
+        return (NodeSet*)rtf;

Why cast to NodeSet? Shouldn't txResultTreeFragment also be an ExprResult once 
we have it?

+        XMLUtils::normalizeAttributeValue(value);

should this really be done for module?

I don't quite like that the output formats are merged in 
XSLTProcessor::initializeHandlers. It would be nice if we had some
::initializeStylesheet (i thought of a better name yesterday, but i can't 
remember it now) function that is called after all ::processTopLevel are done, 
but before we start the actual transform is started.
Not sure if the function should live in ProcessorState or XSLTProcessor, but 
i'm ok with either way. In that funciton we could do some stuff like merge 
output formats, evaluate global variables and some other foo.

+     XMLUtils::normalizePIValue(value);

cut'n'paste error?

(note to self: file a bug on PI-target-name check)

+        if (!value.isEmpty()) {
+            // XXX Need to handle disable-output-escaping
+            NS_ASSERTION(mResultHandler, "mResultHandler must not be NULL!");
+            mResultHandler->characters(value);
+        }

IMHO there's no need to check for |!value.isEmpty()|

mmmmmm.. i like the new code for LREs

+        StackIterator attributeSets((txList*)&mAttributeSetStack);

is this really safe?

In XSLTProcessor::processDefaultTemplate
+            String value = node->getNodeValue();
+            NS_ASSERTION(mResultHandler, "mResultHandler must not be NULL!");
+            mResultHandler->characters(value);

+            NS_ASSERTION(mResultHandler, "mResultHandler must not be NULL!");
+            mResultHandler->characters(node->getNodeValue());
should save you a string-copy

in XSLTProcessor::copyNode:
CDATA_SECTION_NODEs should IMHO not call mResultHandler->cdata but rather 
mResultHandler->characters, so you could use the same code as for TEXT_NODE; 
just |mResultHandler->characters(aNode->getNodeValue());|

XSLTProcessor::xslCopyOf should sort nodesets in document order before copying 

Comment 15

16 years ago
to build this on unix, both source/xslt/ and build/ need
an additional REQUIRES on unicharutil, and build/ needs
  ../source/xslt/txRtfHandler.$(OBJ_SUFFIX) \
  ../source/xslt/txTextHandler.$(OBJ_SUFFIX) \
  ../source/xslt/txMozillaTextOutput.$(OBJ_SUFFIX) \
  ../source/xslt/txMozillaXMLOutput.$(OBJ_SUFFIX) \
Blocks: 107219
Blocks: 109774
oh, txQualifiedName should really be called txExpandedName

Comment 17

16 years ago
Created attachment 58526 [details] [diff] [review]
Removed files
Attachment #48419 - Attachment is obsolete: true
Attachment #56708 - Attachment is obsolete: true

Comment 18

16 years ago
Created attachment 58527 [details] [diff] [review]

Still no standalone serializers. They should be flying in any moment now :(.

Comment 19

16 years ago
Attr::Attr (standalone) shows that we don't do  XMLUtils::getLocalPart vs.
finding : ourselves consistently. Fix this one? (The only one you add)

+    mAttributes.setNamedItem(temp);
as that iterates over the attributes again.

ProcessorState::ProcessorState constructors could get some :mFoo(aFoo), IMHO

catch the namespace early? Maybe add a 
// XXX Error report warning
for a warning about deprecated namespace and wrong top level element with
XSLT namespace?

Add a comment for the {} around the ProcessorStates, it looks kinda bogus, and
the destructor order is worth mentioning.

:1130 has
                    const String& mode =
could you getAttr that one, too?

txXMLEventHandler.h includes iostream.h twice. I doubt that you need it for

hitting commit now, I regenerated all but one of these comments allready. *rant*

Comment 20

16 years ago
Oh, did anybody say? This makes us twice as fast. (tested on veiligheid/2)
I didn't measure the bloat, but if I saw this right, similar numbers there too.
Keywords: perf
please remove the
+  if (this == &data)
+    return MB_TRUE;
it'll probably never happen anyway

+  const UNICODE_CHAR* otherBuffer = data.toUnicode();
looks better as
+  const UNICODE_CHAR* otherBuffer = data.strBuffer;

+    if (!aPrefix || aPrefix == txXMLAtoms::_empty)
+        aPrefix = txXMLAtoms::xmlns;

I don't quite like that |!aPrefix|, IMHO it should be an error and not need any 

standalone Attr::Attr:
+  if (aNamespaceURI.isEmpty()) {
+    Attr(aName, aOwner);
+  }

That will not behave according to the DOM-spec, since a prefix in the name will 
be looked up. IMHO it would be nicer to always do what's in the |else| and just 
specialcase the namespace setting for empty aNamespaceURI

standalone Element::Element:
+  if (!aNamespaceURI.isEmpty())
+      mNamespaceID = txNamespaceManager::getNamespaceID(aNamespaceURI);

IMHO if aNamespaceURI is empty then mNamespaceID should be set to 

I would've rather that you left cleanups like this for a separate patch, but 
since you do it...
-    attValue = element->getAttribute(DECIMAL_SEPARATOR_ATTR);
+    element->getAttr(txXSLTAtoms::decimalSeparator, kNameSpaceID_None,
+                     attValue);
     if (attValue.length() == 1)
         format->mDecimalSeparator = attValue.charAt(0);
     else if (!attValue.isEmpty())
         success = MB_FALSE;

should be

-    attValue = element->getAttribute(DECIMAL_SEPARATOR_ATTR);
-    if (attValue.length() == 1)
-        format->mDecimalSeparator = attValue.charAt(0);
-    else if (!attValue.isEmpty())
-        success = MB_FALSE;
+    if (element->getAttr(txXSLTAtoms::decimalSeparator, kNameSpaceID_None,
+                         attValue) {
+      if (attValue.length() == 1)
+          format->mDecimalSeparator = attValue.charAt(0);
+      else
+        success = MB_FALSE;
+    }


-    attValue = element->getAttribute(INFINITY_ATTR);
+    element->getAttr(txXSLTAtoms::infinity, kNameSpaceID_None,
+                     attValue);
     if (!attValue.isEmpty())

should be

-    attValue = element->getAttribute(INFINITY_ATTR);
-    if (!attValue.isEmpty())
+    if (element->getAttr(txXSLTAtoms::infinity, kNameSpaceID_None,
+                         attValue))

/me runs for cover

+        if (!importFrame.hasNext())
+            // XXX ErrorReport: out of memory
+            return;

would be nicer as

+        if (!
+            // XXX ErrorReport: out of memory
+            return;


XSLTProcessor::processAction case XSLType::APPLY_TEMPLATES:
+                if (!exprResult)
+                    // XXX ErrorReport: out of memory
+                    break;

This can be other errors then "out of memory". Just do the |break| IMHO. The 
errorhandling should be already taken care of once we get this far. Same for 

XSLTProcessor::processAction case XSLType::ATTRIBUTE:
+                txAtom* nameAtom = TX_GET_ATOM(name);
+                if (nameAtom == txXMLAtoms::xmlns) {

I still think that |name.isEqual("xmlns")| is faster?

+      (prefixAtom != txXMLAtoms::xmlns))

I really don't see that doing <xsl:attribute name="xmlns:foo"/> should create 
an attribute in the null namespace. IMHO the specialcasing for xmlns should be 
removed and ::lookupNamespaceID should return kNameSpaceID_Unknown for "xmlns". 
You could leave the latter for another bug if you want.

XSLTProcessor::processAction case XSLType::COMMENT:
The '-' loop will stop too soon if there is a '-' with an non-'-' following. 
You'll have to always loop all the way to the end, and then check if the last 
char is '-'.

XSLTProcessor::copyNode case Node::ATTRIBUTE_NODE:
+            Attr* attr = (Attr*)aNode;

Why not use aNode directly?

That's about it except for the output handlers, but you gotta explain those to 

Comment 22

16 years ago
I vote against changing the prefixAtom. I see a good benefit in having both
the empty atom and a null atom referring to an empty prefix.
I want to keep the relation between xmlns prefix and the xmlns namespace, too.

> I really don't see that doing <xsl:attribute name="xmlns:foo"/> should create
> an attribute in the null namespace.
reread the spec: if namespace is given, that is used. Otherwise the prefix is
resolved, unless it's xmlns, that is dropped.
Which brings us to the point that the result attribute must not have the prefix
xmlns. Peter, you should cut the xmlns: from the name.
no, xmlns can be dropped for *output*, not when deciding what namespace the attr has.

So for module this isn't a problem at all, but standalone *should* remove "xmlns:" prefixes (probably in txStandaloneOutputHandler::attribute since other ways of creating attributes could also result in attributes with "xmlns:" prefixes). But IMHO that has "lets do that later" written all over it :-) (unless peterv wanna do it now of course)

The reason that IMHO Node::lookupNamespaceID("xmlns") is that neither xslt  ( or xml-infoset ( says that the xmlns prefix maps to the xmlns namespace.

Comment 24

16 years ago
*** Bug 112172 has been marked as a duplicate of this bug. ***


16 years ago
Blocks: 113962


16 years ago
Blocks: 114414


16 years ago
Target Milestone: mozilla0.9.7 → mozilla0.9.8
Blocks: 116534
No longer blocks: 116534
No longer blocks: 88198

Comment 25

16 years ago
Created attachment 63680 [details] [diff] [review]

New year, new patch.
Attachment #58527 - Attachment is obsolete: true
What is the change to the nsXMLContentSink?

in the new attr constructor:
+  if (aNamespaceURI.isEmpty()) {
+    Attr(aName, aOwner);
+  }
I don't think this will init the the namespace correctly if aName is prefixed. 
How about just:
  if (aNamespaceURI.isEmpty())
    mNamespaceID = kNameSpaceID_None;
    mNamespaceID = txNamespaceManager::getNamespaceID(aNamespaceURI);

  specified = MB_TRUE;
  String localPart;
  XMLUtils::getLocalPart(nodeName, localPart);
  mLocalName = TX_GET_ATOM(localPart);
  mNamespaceID = txNamespaceManager::getNamespaceID(aNamespaceURI);
oh, or how you do it for elements?

Woohooo! Support for disable-output-escape! (Calm down people, it's only for 
standalone, not for mozilla).

In ElementAvailableFnCall and SystemPropertyFunctionCall you resolve prefix-
>namespace using wrong node, you use a node from the source document, but 
should use the element from the xsl-stylesheet. I'd say only 
change ::getNameSpace -> ::getPrefix in this patch and we'll do the right thing 

That's all except for the output handlers and the new member-variables to the 
XSLTProcessor, but we'll take that on irc
hrm.. remove the last line from my code-snippet...
don't assert |NS_ASSERTION(element, "No element to add the attribute to.");|
in txMozillaXMLOutput::attribute. That's not our fault but the fault of the 
stylesheet author.

In txMozillaXMLOutput::characters, shouldn't you always mText.Append? So that 
the <title> get a real childnode? (or will it anyway)

still more to come...
The list of empty elements isn't the same as the one in the XSLT spec, 

txXMLOutput::characters outputs too many CDATA start and end "tags". But that 
could wait for later IMHO.
ok, that's it except for the stuff i said on irc (just nsAutoString and <title> 


16 years ago
Blocks: 119854

Comment 31

16 years ago
I miss the Element::setAttributeNS hunk from

For standalone, you lack to set mOut in XSLTProcessor.cpp:2205, which causes
a crasher on stylesheets defaulting to HTML output. You need to copy the mOut
from the old mOutputHandler to the new one.
Get rid of
     ostream* mOutputStream;
in txXMLOutput.h, AFAICT, it ain't used.

The rest is fine with me.

Comment 32

16 years ago
source/xslt/ needs
@@ -38,15 +38,27 @@
                  content_xsl \
                  widget \
                  necko \
+                 unicharutil \

Comment 33

16 years ago
Created attachment 65618 [details] [diff] [review]

Hopefully the last patch. I'll move this bug to the next milestone. Reviews and
superreview are done and will need to be marked after final control of the new
patch. I'll file follow-up bugs for a couple of things that are wrong/need
Attachment #58526 - Attachment is obsolete: true
Attachment #63680 - Attachment is obsolete: true

Comment 34

16 years ago
Comment on attachment 65618 [details] [diff] [review]

Module is fine, but a few changes in
standalone need some tweaks.

make Element a friend of Attr:
class Attr : public NodeDefinition
    // AttrMap and Element need to be friends to be able to update the
    // ownerElement
    friend class AttrMap;
    friend class Element;

(setting is more :-))
#ifdef TX_EXE
	    ostream* out;
	    delete mOutputHandler;
	    mOutputHandler = new txHTMLOutput();
	    NS_ASSERTION(mOutputHandler, "Setting mResultHandler to NULL!");
	    mResultHandler = mOutputHandler;

    mHTMLEmptyTags.put(txHTMLAtoms::col, txHTMLAtoms::col);
or add col to the txHTMLAtoms.
the standalone output handlers all need
getOutputStream, I factored those into
I'll attach a patch for those, that's faster.

with those,
Attachment #65618 - Flags: review+

Comment 35

16 years ago
Created attachment 65641 [details] [diff] [review]
patch against peterv's tree, factor mOut to txStreamXMLEventHandler

Here is the promised patch. Sounds like this is the right solution to the issue

with setOutputStream. getOutputStream was duplicated code, too.
You get a warning (which i think should be a builderror) on
     StackIterator attributeSets((txList*)&mAttributeSetStack);
change it to
     StackIterator *attributeSets = mAttributeSetStack.iterator();
     if (!attributeSets)
(don't forget to delete)

typo in source/xslt/, change |unicharutils| to |unicharutil|

You still have a few |if (mOutputFormat.mIndent)| in txXMLOutput.cpp

with that and Pikes stuff, r=sicking
oh, and you add quite a few tabs to 

Document* XSLTProcessor::process(Document& xmlDocument,
                                 Document& xslDocument)

Actually, IMHO it would be ok to remove that function all together. I'm 
presonally not even convinced that we want it, and if/when we do it'll be easy 
to add. Your call...

Comment 39

16 years ago
Created attachment 66002 [details] [diff] [review]
Attachment #65618 - Attachment is obsolete: true
Attachment #65641 - Attachment is obsolete: true

Comment 40

16 years ago
Patch for bug 96647 should be final, so these will be fixed early in 0.9.9
Target Milestone: mozilla0.9.8 → mozilla0.9.9
Comment on attachment 66002 [details] [diff] [review]

/me cheers

Attachment #66002 - Flags: review+
Comment on attachment 66002 [details] [diff] [review]

/me cheers

Comment on attachment 66002 [details] [diff] [review]

/me cheers


Comment 44

16 years ago
Comment on attachment 66002 [details] [diff] [review]

Comment 45

16 years ago
Comment on attachment 66002 [details] [diff] [review]

Marking jst's superreview.
Attachment #66002 - Flags: superreview+
Here's a few nits I found while looking through this patch:

- What's the deal with String::isEqualIgnoreCase()? It deals with unicode, it's
only case-insensitive for ASCII characters.

- In source/xml/dom/standalone/Document.cpp, inconsistent indentation.

- In txMozillaTextOutput::setOutputDocument(), more error checking needed.

- In txMozillaXMLOutput.cpp:

+nsresult txMozillaXMLOutput::getRootContent(nsIContent** aReturn)

This is an internal method, right? If so, no need for the NS_ENSURE_ARG_POINTER,
just don't pass null to this method.

- In txOutputFormat.cpp:

+void txOutputFormat::reset()
+    mMethod = eMethodNotSet;
+    mVersion.clear();
+    if (mEncoding.isEmpty())
+    mOmitXMLDeclaration = eNotSet;

no indentation after if?

Comment 47

16 years ago
Fix checked in. I'll be filing the follow-up bugs soon.
Last Resolved: 16 years ago
Resolution: --- → FIXED

Comment 48

16 years ago

Orange juice to peterv :-)
You need to log in before you can comment on or make changes to this bug.