Closed
Bug 96647
Opened 23 years ago
Closed 23 years ago
Change the way output is constructed in Transformiix
Categories
(Core :: XSLT, defect, P1)
Core
XSLT
Tracking
()
VERIFIED
FIXED
mozilla0.9.9
People
(Reporter: peterv, Assigned: peterv)
References
Details
(Keywords: perf)
Attachments
(1 file, 7 obsolete files)
401.65 KB,
patch
|
sicking
:
review+
peterv
:
superreview+
|
Details | Diff | Splinter Review |
I have rewritten the way the output is constructed in Transformiix. As part of the rewrite I'm fixing bug 94921, bug 59912, bug 53030 and bug 88198. I'll try to attach the patch tomorrow, so the reviews can begin. I'm trying to get this into 0.9.4.
Assignee | ||
Updated•23 years ago
|
Assignee | ||
Updated•23 years ago
|
OS: Mac System 8.5 → All
Priority: -- → P1
Target Milestone: --- → mozilla0.9.4
Comment 1•23 years ago
|
||
Good work Peter, thanks very much :) -mike
Assignee | ||
Comment 3•23 years ago
|
||
Assignee | ||
Comment 4•23 years ago
|
||
Don't try or review this patch, it's just work-in-progress, to show that there is (some) progress.
Assignee | ||
Comment 5•23 years ago
|
||
All these missed the bus/train/plane/boat/whatever. Sad.
Target Milestone: mozilla0.9.5 → mozilla0.9.6
Comment 6•23 years ago
|
||
based on the experience in bug 103659, do we need the output directory? I don't see the txAtomList.* in base, either. Those are XSLT special, so they should go into source/xslt, imho. Maybe this should be txXSLTAtoms, even? See bug 76070, the naming should be consistent. Axel
Assignee | ||
Comment 7•23 years ago
|
||
Assignee | ||
Comment 8•23 years ago
|
||
Patch still misses a few files, but this part should be reviewable. I'm going to move this and dependent bugs to 0.9.7. No way this will be ready and reviewed for 0.9.6. It'll probably go in soon after 0.9.7 opens.
Comment 9•23 years ago
|
||
ProcessorState.cpp: + PRInt32 nsID = node->getNamespaceID(); + if (!nsID == kNameSpaceID_XSLT) nsID is typedef'd in nsID.h. nsID != kNameSpaceID_XSLT. - * Returns the namespace URI for the given namespace prefix, this method should + * Returns the namespace ID for the given namespace prefix, this method should you don't. + if (XMLDOMUtils::getNamespaceURI(aPrefix, (Element*)node, aNamespaceURI)) newline? + if (element->getAttr(txXSLTAtoms::defaultSpace, kNameSpaceID_XSLT, value) default-space isn't part of the spec (at least I can't find "default-space"), and if, it wouldn't be in the XSLT namespace. (and it could use a newline) My vote is pruning it. XSLTProcessor.cpp: + if (element->getAttr(txXSLTAtoms::omitXmlDeclaration, wrong namespace. if (!xslType) return XSLType::LITERAL; that should error, unknown element in XSLT namespace. XSLTProcessor::processAction + const String& nodeName = aXsltAction->getNodeName(); you only need that for LITERALs more to come (I guess), I'm off to the movies.
Assignee | ||
Comment 10•23 years ago
|
||
The patch for 96647 is nearly done, but it's a bit big and we need time for reviews, unless something serious comes up this should land early in 0.9.7.
Target Milestone: mozilla0.9.6 → mozilla0.9.7
First of all, this is a VERY nice architectural change. Thanks for doing it! Since you're fixing it up anyway, could you make XMLDOMUtils::getNodeValue use the same code for attributes, textnodes, comments and PIs. Simply append the nodeValue. +PRInt32 Document::namespaceURIToID(const String& aNamespaceURI) +{ + PRInt32 nsID = kNameSpaceID_Unknown; + if (nsNSManager) + nsNSManager->GetNameSpaceID(aNamespaceURI.getConstNSString(), nsID); + return nsID; +} should you use nsNSManager->RegisterNameSpace? + if (!aPrefix || aPrefix == txXMLAtoms::_empty) + aPrefix = txXMLAtoms::xmlns; + why do you need to do this? -nsSyncLoader::LoadDocument(nsIURI* documentURI, nsIDocument *aLoader, nsIDOMDocument **_retval) +nsSyncLoader::LoadDocument(nsIURI* documentURI, nsIDocument *aLoader, nsIDOMDocument **aResult) could we keep _retval? result isn't really an argument and the rest of moz uses _retval. + if ((mMethod == eMethodNotSet) && + (aOutputFormat.mMethod != eMethodNotSet)) + mMethod = aOutputFormat.mMethod; You don't actually have to check (aOutputFormat.mMethod != eMethodNotSet) + // Merges in the values of aOutputFormat, members that have no + // value in aOutputFormat will not be changed. + void merge(txOutputFormat& aOutputFormat); Could you mention that values that are set in *this* format will not be changed + while ((qName = (txQualifiedName*)iter.next())) { + mCDATASectionElements.add(qName); + iter.remove(); + } mmmmm, I like that loop :). But IMHO you could let the txList destructor take care of the deletion, that should also save a few cycles. + enum txOutputMethod { eMethodNotSet = 0, eXMLOutput, eHTMLOutput, eTextOutput }; Pike is gonna get you for the length of that one ;-)
More comments: +enum txThreeState { eNotSet = -1, eFalse = MB_FALSE, eTrue = MB_TRUE }; are you sure that PR_TRUE isn't defined as -1 on any platform? + String namePart; + XMLUtils::getPrefix(token, namePart); + txAtom* nameAtom = TX_GET_ATOM(namePart); + PRInt32 nsID = element->lookupNamespaceID(nameAtom); unprefixed means nullnamespace, not default namespace. + format.mOmitXMLDeclaration = attValue.isEqual(YES_VALUE) ? eTrue : eFalse; IMHO we should get rid of YES_VALUE (and a lot of other static strings), would using "yes" be ok? - //-- unknown + { + // unknown break; + } please fix indentation of the last '}' + // Create wrapper for the source document. + nsCOMPtr<nsIDOMDocument> sourceDOMDocument; + aSourceDOM->GetOwnerDocument(getter_AddRefs(sourceDOMDocument)); + if (!sourceDOMDocument) + sourceDOMDocument = do_QueryInterface(aSourceDOM); That shouldn't work if aSourceDOM is the document node. So I don't understand how it can work now?? In XSLTProcessor::initializeHandlers, shouldn't you merge the output formats before using |aPs->getOutputFormat()->mMethod|? In XSLTProcessor::TransformDocument, are you sure that ps will be deleted before sourceDocument, xslDocument and resultDocument? If it's not, bad things will happen. more to come :)
+ txAtom* nameAtom = TX_GET_ATOM(name); + if (nameAtom == txXMLAtoms::xmlns) { Isn't |name.IsEqual("xmlns")| faster? + if ((prefixAtom != txXMLAtoms::_empty) && (prefixAtom != txXMLAtoms::xmlns)) + resultNsID = actionElement->lookupNamespaceID(prefixAtom); This'll create attributes with prefix "xmlns" in the null-namespace. Could you clean up the handling of text/cdata nodes in the stylesheet, we don't really need that |String textValue| IMHO Do we really need ::cdata and ::entityReference in the output handlers?
in XSLTProcessor::processVariable + return (NodeSet*)rtf; Why cast to NodeSet? Shouldn't txResultTreeFragment also be an ExprResult once we have it? in XSLType::ATTRIBUTE: + XMLUtils::normalizeAttributeValue(value); should this really be done for module? I don't quite like that the output formats are merged in XSLTProcessor::initializeHandlers. It would be nice if we had some ::initializeStylesheet (i thought of a better name yesterday, but i can't remember it now) function that is called after all ::processTopLevel are done, but before we start the actual transform is started. Not sure if the function should live in ProcessorState or XSLTProcessor, but i'm ok with either way. In that funciton we could do some stuff like merge output formats, evaluate global variables and some other foo. in XSLType::COMMENT: + XMLUtils::normalizePIValue(value); cut'n'paste error? (note to self: file a bug on PI-target-name check) in XSLType::VALUE_OF: + if (!value.isEmpty()) { + // XXX Need to handle disable-output-escaping + NS_ASSERTION(mResultHandler, "mResultHandler must not be NULL!"); + mResultHandler->characters(value); + } IMHO there's no need to check for |!value.isEmpty()| mmmmmm.. i like the new code for LREs + StackIterator attributeSets((txList*)&mAttributeSetStack); is this really safe? In XSLTProcessor::processDefaultTemplate + String value = node->getNodeValue(); + NS_ASSERTION(mResultHandler, "mResultHandler must not be NULL!"); + mResultHandler->characters(value); + NS_ASSERTION(mResultHandler, "mResultHandler must not be NULL!"); + mResultHandler->characters(node->getNodeValue()); should save you a string-copy in XSLTProcessor::copyNode: CDATA_SECTION_NODEs should IMHO not call mResultHandler->cdata but rather mResultHandler->characters, so you could use the same code as for TEXT_NODE; just |mResultHandler->characters(aNode->getNodeValue());| XSLTProcessor::xslCopyOf should sort nodesets in document order before copying them
Comment 15•23 years ago
|
||
to build this on unix, both source/xslt/Makefile.in and build/Makefile.in need an additional REQUIRES on unicharutil, and build/Makefile.in needs ../source/xslt/txRtfHandler.$(OBJ_SUFFIX) \ ../source/xslt/txTextHandler.$(OBJ_SUFFIX) \ ../source/xslt/txMozillaTextOutput.$(OBJ_SUFFIX) \ ../source/xslt/txMozillaXMLOutput.$(OBJ_SUFFIX) \
Blocks: 107219
Blocks: 109774
oh, txQualifiedName should really be called txExpandedName
Assignee | ||
Comment 17•23 years ago
|
||
Attachment #48419 -
Attachment is obsolete: true
Attachment #56708 -
Attachment is obsolete: true
Assignee | ||
Comment 18•23 years ago
|
||
Still no standalone serializers. They should be flying in any moment now :(.
Comment 19•23 years ago
|
||
Attr::Attr (standalone) shows that we don't do XMLUtils::getLocalPart vs. finding : ourselves consistently. Fix this one? (The only one you add) standalone/Element.cpp Element::setAttributeNS don't + mAttributes.setNamedItem(temp); as that iterates over the attributes again. ProcessorState::ProcessorState constructors could get some :mFoo(aFoo), IMHO XSLTProcessor::processStylesheet catch the namespace early? Maybe add a // XXX Error report warning for a warning about deprecated namespace and wrong top level element with XSLT namespace? Add a comment for the {} around the ProcessorStates, it looks kinda bogus, and the destructor order is worth mentioning. :1130 has const String& mode = actionElement->getAttribute(MODE_ATTR); could you getAttr that one, too? txXMLEventHandler.h includes iostream.h twice. I doubt that you need it for module. hitting commit now, I regenerated all but one of these comments allready. *rant*
Comment 20•23 years ago
|
||
Oh, did anybody say? This makes us twice as fast. (tested on veiligheid/2) I didn't measure the bloat, but if I saw this right, similar numbers there too.
Keywords: perf
String::isEqualIgnoreCase: please remove the + if (this == &data) + return MB_TRUE; it'll probably never happen anyway + const UNICODE_CHAR* otherBuffer = data.toUnicode(); looks better as + const UNICODE_CHAR* otherBuffer = data.strBuffer; IMHO Node::lookupNamespaceID: + if (!aPrefix || aPrefix == txXMLAtoms::_empty) + aPrefix = txXMLAtoms::xmlns; + I don't quite like that |!aPrefix|, IMHO it should be an error and not need any specialcasing standalone Attr::Attr: + if (aNamespaceURI.isEmpty()) { + Attr(aName, aOwner); + } That will not behave according to the DOM-spec, since a prefix in the name will be looked up. IMHO it would be nicer to always do what's in the |else| and just specialcase the namespace setting for empty aNamespaceURI standalone Element::Element: + if (!aNamespaceURI.isEmpty()) + mNamespaceID = txNamespaceManager::getNamespaceID(aNamespaceURI); IMHO if aNamespaceURI is empty then mNamespaceID should be set to kNameSpaceID_None ProcessorState::addDecimalFormat: I would've rather that you left cleanups like this for a separate patch, but since you do it... - attValue = element->getAttribute(DECIMAL_SEPARATOR_ATTR); + element->getAttr(txXSLTAtoms::decimalSeparator, kNameSpaceID_None, + attValue); if (attValue.length() == 1) format->mDecimalSeparator = attValue.charAt(0); else if (!attValue.isEmpty()) success = MB_FALSE; should be - attValue = element->getAttribute(DECIMAL_SEPARATOR_ATTR); - if (attValue.length() == 1) - format->mDecimalSeparator = attValue.charAt(0); - else if (!attValue.isEmpty()) - success = MB_FALSE; + if (element->getAttr(txXSLTAtoms::decimalSeparator, kNameSpaceID_None, + attValue) { + if (attValue.length() == 1) + format->mDecimalSeparator = attValue.charAt(0); + else + success = MB_FALSE; + } and - attValue = element->getAttribute(INFINITY_ATTR); + element->getAttr(txXSLTAtoms::infinity, kNameSpaceID_None, + attValue); if (!attValue.isEmpty()) format->mInfinity=attValue; should be - attValue = element->getAttribute(INFINITY_ATTR); - if (!attValue.isEmpty()) + if (element->getAttr(txXSLTAtoms::infinity, kNameSpaceID_None, + attValue)) format->mInfinity=attValue; /me runs for cover XSLTProcessor::process: + if (!importFrame.hasNext()) + // XXX ErrorReport: out of memory + return; + importFrame.next(); would be nicer as + if (!importFrame.next()) + // XXX ErrorReport: out of memory + return; IMHO XSLTProcessor::processAction case XSLType::APPLY_TEMPLATES: + if (!exprResult) + // XXX ErrorReport: out of memory + break; This can be other errors then "out of memory". Just do the |break| IMHO. The errorhandling should be already taken care of once we get this far. Same for FOR_EACH and IF XSLTProcessor::processAction case XSLType::ATTRIBUTE: + txAtom* nameAtom = TX_GET_ATOM(name); + if (nameAtom == txXMLAtoms::xmlns) { I still think that |name.isEqual("xmlns")| is faster? + (prefixAtom != txXMLAtoms::xmlns)) I really don't see that doing <xsl:attribute name="xmlns:foo"/> should create an attribute in the null namespace. IMHO the specialcasing for xmlns should be removed and ::lookupNamespaceID should return kNameSpaceID_Unknown for "xmlns". You could leave the latter for another bug if you want. XSLTProcessor::processAction case XSLType::COMMENT: The '-' loop will stop too soon if there is a '-' with an non-'-' following. You'll have to always loop all the way to the end, and then check if the last char is '-'. XSLTProcessor::copyNode case Node::ATTRIBUTE_NODE: + Attr* attr = (Attr*)aNode; Why not use aNode directly? That's about it except for the output handlers, but you gotta explain those to me
Comment 22•23 years ago
|
||
I vote against changing the prefixAtom. I see a good benefit in having both
the empty atom and a null atom referring to an empty prefix.
I want to keep the relation between xmlns prefix and the xmlns namespace, too.
> I really don't see that doing <xsl:attribute name="xmlns:foo"/> should create
> an attribute in the null namespace.
reread the spec: if namespace is given, that is used. Otherwise the prefix is
resolved, unless it's xmlns, that is dropped.
Which brings us to the point that the result attribute must not have the prefix
xmlns. Peter, you should cut the xmlns: from the name.
no, xmlns can be dropped for *output*, not when deciding what namespace the attr has. So for module this isn't a problem at all, but standalone *should* remove "xmlns:" prefixes (probably in txStandaloneOutputHandler::attribute since other ways of creating attributes could also result in attributes with "xmlns:" prefixes). But IMHO that has "lets do that later" written all over it :-) (unless peterv wanna do it now of course) The reason that IMHO Node::lookupNamespaceID("xmlns") is that neither xslt (http://www.w3.org/TR/xslt#section-Expressions) or xml-infoset (http://www.w3.org/TR/xml-infoset/#infoitem.element) says that the xmlns prefix maps to the xmlns namespace.
Comment 24•23 years ago
|
||
*** Bug 112172 has been marked as a duplicate of this bug. ***
Assignee | ||
Updated•23 years ago
|
Target Milestone: mozilla0.9.7 → mozilla0.9.8
Blocks: 116534
No longer blocks: 116534
No longer blocks: 88198
Assignee | ||
Comment 25•23 years ago
|
||
New year, new patch.
Attachment #58527 -
Attachment is obsolete: true
What is the change to the nsXMLContentSink?
in the new attr constructor:
+ if (aNamespaceURI.isEmpty()) {
+ Attr(aName, aOwner);
+ }
I don't think this will init the the namespace correctly if aName is prefixed.
How about just:
{
if (aNamespaceURI.isEmpty())
mNamespaceID = kNameSpaceID_None;
else
mNamespaceID = txNamespaceManager::getNamespaceID(aNamespaceURI);
specified = MB_TRUE;
String localPart;
XMLUtils::getLocalPart(nodeName, localPart);
mLocalName = TX_GET_ATOM(localPart);
mNamespaceID = txNamespaceManager::getNamespaceID(aNamespaceURI);
}
oh, or how you do it for elements?
Woohooo! Support for disable-output-escape! (Calm down people, it's only for
standalone, not for mozilla).
In ElementAvailableFnCall and SystemPropertyFunctionCall you resolve prefix-
>namespace using wrong node, you use a node from the source document, but
should use the element from the xsl-stylesheet. I'd say only
change ::getNameSpace -> ::getPrefix in this patch and we'll do the right thing
later.
That's all except for the output handlers and the new member-variables to the
XSLTProcessor, but we'll take that on irc
hrm.. remove the last line from my code-snippet...
don't assert |NS_ASSERTION(element, "No element to add the attribute to.");| in txMozillaXMLOutput::attribute. That's not our fault but the fault of the stylesheet author. In txMozillaXMLOutput::characters, shouldn't you always mText.Append? So that the <title> get a real childnode? (or will it anyway) still more to come...
The list of empty elements isn't the same as the one in the XSLT spec, intentional? txXMLOutput::characters outputs too many CDATA start and end "tags". But that could wait for later IMHO.
ok, that's it except for the stuff i said on irc (just nsAutoString and <title> iirc)
Comment 31•23 years ago
|
||
I miss the Element::setAttributeNS hunk from http://bugzilla.mozilla.org/show_bug.cgi?id=96647#c19 For standalone, you lack to set mOut in XSLTProcessor.cpp:2205, which causes a crasher on stylesheets defaulting to HTML output. You need to copy the mOut from the old mOutputHandler to the new one. Get rid of ostream* mOutputStream; in txXMLOutput.h, AFAICT, it ain't used. The rest is fine with me.
Comment 32•23 years ago
|
||
source/xslt/Makefile.in needs @@ -38,15 +38,27 @@ content_xsl \ widget \ necko \ + unicharutil \ $(NULL) endif
Assignee | ||
Comment 33•23 years ago
|
||
Hopefully the last patch. I'll move this bug to the next milestone. Reviews and superreview are done and will need to be marked after final control of the new patch. I'll file follow-up bugs for a couple of things that are wrong/need improvement.
Attachment #58526 -
Attachment is obsolete: true
Attachment #63680 -
Attachment is obsolete: true
Comment 34•23 years ago
|
||
Comment on attachment 65618 [details] [diff] [review] v4 Module is fine, but a few changes in standalone need some tweaks. xml/dom/standalone/dom.h: make Element a friend of Attr: class Attr : public NodeDefinition { // AttrMap and Element need to be friends to be able to update the // ownerElement friend class AttrMap; friend class Element; public: xslt/XSLTProcessor.cpp: ::startElement (setting is more :-)) #ifdef TX_EXE ostream* out; mOutputHandler->getOutputStream(&out); delete mOutputHandler; mOutputHandler = new txHTMLOutput(); NS_ASSERTION(mOutputHandler, "Setting mResultHandler to NULL!"); mOutputHandler->setOutputStream(out); mResultHandler = mOutputHandler; #endif xslt/txHTMLOutput.cpp: remove mHTMLEmptyTags.put(txHTMLAtoms::col, txHTMLAtoms::col); or add col to the txHTMLAtoms. the standalone output handlers all need getOutputStream, I factored those into txStreamXMLEventHandler. I'll attach a patch for those, that's faster. with those, r=axel@pike.org
Attachment #65618 -
Flags: review+
Comment 35•23 years ago
|
||
Here is the promised patch. Sounds like this is the right solution to the issue with setOutputStream. getOutputStream was duplicated code, too.
You get a warning (which i think should be a builderror) on StackIterator attributeSets((txList*)&mAttributeSetStack); change it to StackIterator *attributeSets = mAttributeSetStack.iterator(); if (!attributeSets) return; (don't forget to delete) typo in source/xslt/makefile.win, change |unicharutils| to |unicharutil| You still have a few |if (mOutputFormat.mIndent)| in txXMLOutput.cpp with that and Pikes stuff, r=sicking
oh, and you add quite a few tabs to Document* XSLTProcessor::process(Document& xmlDocument, Document& xslDocument) :(
Actually, IMHO it would be ok to remove that function all together. I'm presonally not even convinced that we want it, and if/when we do it'll be easy to add. Your call...
Assignee | ||
Comment 39•23 years ago
|
||
Attachment #65618 -
Attachment is obsolete: true
Attachment #65641 -
Attachment is obsolete: true
Assignee | ||
Comment 40•23 years ago
|
||
Patch for bug 96647 should be final, so these will be fixed early in 0.9.9 (really!).
Target Milestone: mozilla0.9.8 → mozilla0.9.9
Comment on attachment 66002 [details] [diff] [review] v5 /me cheers r=sicking
Attachment #66002 -
Flags: review+
Comment on attachment 66002 [details] [diff] [review] v5 /me cheers r=sicking
Comment on attachment 66002 [details] [diff] [review] v5 /me cheers r=sicking
Comment 44•23 years ago
|
||
Comment on attachment 66002 [details] [diff] [review] v5 r=axel@pike.org
Assignee | ||
Comment 45•23 years ago
|
||
Comment on attachment 66002 [details] [diff] [review] v5 Marking jst's superreview.
Attachment #66002 -
Flags: superreview+
Comment 46•23 years ago
|
||
Here's a few nits I found while looking through this patch: - What's the deal with String::isEqualIgnoreCase()? It deals with unicode, it's only case-insensitive for ASCII characters. - In source/xml/dom/standalone/Document.cpp, inconsistent indentation. - In txMozillaTextOutput::setOutputDocument(), more error checking needed. - In txMozillaXMLOutput.cpp: +nsresult txMozillaXMLOutput::getRootContent(nsIContent** aReturn) +{ + NS_ENSURE_ARG_POINTER(aReturn); This is an internal method, right? If so, no need for the NS_ENSURE_ARG_POINTER, just don't pass null to this method. - In txOutputFormat.cpp: +void txOutputFormat::reset() +{ + mMethod = eMethodNotSet; + mVersion.clear(); + if (mEncoding.isEmpty()) + mOmitXMLDeclaration = eNotSet; no indentation after if?
Assignee | ||
Comment 47•23 years ago
|
||
Fix checked in. I'll be filing the follow-up bugs soon.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•