Closed Bug 94471 Opened 23 years ago Closed 23 years ago

document order info should move from DOMHelper to Node

Categories

(Core :: XSLT, defect, P1)

defect

Tracking

()

VERIFIED FIXED
mozilla0.9.6

People

(Reporter: axel, Assigned: sicking)

References

()

Details

(Keywords: perf)

Attachments

(4 files, 2 obsolete files)

We take a great hit in the performance jimmy by having order info being separate
from the node. They gotta be one object.
Order info needs to conform to this:
Documents are ordered consistently, see last paragraph in
http://www.w3.org/TR/xslt.html#function-document.
This could be done with comparing the document urls, but we may live with just
comparing the pointers.
http://www.w3.org/TR/xpath#dt-document-order says that document order goes
element - namespace nodes - attributes - children.
The order info should support that. Attributes have an order given by the 
document. Though Jonas would like to postpone that. (Please don't)
See bug 94456 on ramblings about attribute nodes.

Axel
Keywords: perf
Note to self: "attribute::*" needs to return a sorted nodeset
Status: NEW → ASSIGNED
Blocks: 95249
Severity: normal → critical
Keywords: mozilla1.0
Priority: -- → P1
Target Milestone: --- → mozilla0.9.4
adding some dependencies. For standalone I need some way of walking from an 
attribute node to it's ownerElement and that code currently lives in the patch 
for bug 76070
Blocks: 88964
Depends on: 76070
pushing
Target Milestone: mozilla0.9.4 → mozilla0.9.5
Depends on: 98704
No longer depends on: 76070
why put txOrderInfo in XMLDOMUtils? I'd prefer to have them in dom.h and the 
node impls. (XMLDOMUtils should be just as evil as DOMHelper.)
Is the function signature of txOrderInfo::compareTo really good?
How about method CompareOrderInfo(txOrderInfo* first, txOrderInfo* second), or
even Nodes? DOM3 has greetings, compareTreePosition.
txOrderInfo* Node::getOrderInfo() is way to malloc happy. You create and delete
the array for each recursion. Get the nodes on a stack, and allocate the 
PRUint32 array once.
The actual node sorting algorithm shouldn't rely on whether we cache the order
info, or not. Get that routine close to DOM3, at least in steps, if at all 
possible. 

Axel
> why put txOrderInfo in XMLDOMUtils? I'd prefer to have them in dom.h and the
> node impls. (XMLDOMUtils should be just as evil as DOMHelper.)

The reason i didn't want to put it in there is that we need the same code for 
both standalone and module. However we will hopefully in the future be able to 
use some moz-functions to do documentorder so i agree it's the way to go.

> Is the function signature of txOrderInfo::compareTo really good?
> How about method CompareOrderInfo(txOrderInfo* first, txOrderInfo* second), or
> even Nodes? DOM3 has greetings, compareTreePosition.

Good idea. I created a compareDocumentPosition(Node* aOther) function (didn't 
want to use the same as DOM3 since we don't quite do the same thing). So the 
OrderInfo class is now just a private struct in Node/NodeDefinition. This 
should make it very easy to switch to a mozilla implementation once one exist.

> txOrderInfo* Node::getOrderInfo() is way to malloc happy. You create and
> delete the array for each recursion. Get the nodes on a stack, and allocate
> the PRUint32 array once.

Actually i just create the array for each recursion, they are not deleted until 
the node goes away. Consider if we create OrderInfo for 100 siblings, instead 
of getting the orderinfo for the ancestors of those siblings 100 times i only 
do it once. This is the usual speed vs. memory tradeoff problem. I would say 
that for now we should go with speed, but once we don't rely on orderinfo so 
heavily we could reevaluate (actually then we might even be able to get rid of 
orderinfo entierly...)

> The actual node sorting algorithm shouldn't rely on whether we cache the order
> info, or not. Get that routine close to DOM3, at least in steps, if at all 
> possible. 

Done. I now just do node1->compareDocumentOrder(node2)
this missed the train
I said "this missed the train" damnit!
Target Milestone: mozilla0.9.5 → mozilla0.9.6
spawned bug 104219 for discussion of mozilla attribute nodes. That 
Attr::getChildIndex code looks bad, and we should be able to get that easily
without doing some extra bad foo in the wrappers.
PRUint32 Node::getChildIndex(Node* aParent)
could be 
PRUint32 Node::getChildIndex(), right?
The non-nsIContent fallback in module Node::getChildIndex deals with which 
nodes?
Didn't apply yet.
The |Node* aParent| argument is for performance. We do an |getParentNode| 
in ::getOrderInfo, so i didn't wanna do it again in ::getChildIndex.

The non-nsIContent fallback is for children of attribute nodes, I.E. in 
practice it'll probobly never happen. It's one of those things that can be 
improved once nsIAttr exists.
"fold that function" was the answer to #1, IIRC.
How about a NS_ASSERTION for that code we shouldn't call?
folding done. Didn't add the assertion since it's no illigal, only unlikly that 
it happens. It'll only happen for expressions like "foo/@bar/text()"
the source/xml/dom/mozImpl/MozillaAttr.cpp chunk is not supposed to be there,
right? (Looks like a nsdoom merge foo)

Could we get two constructors for OrderInfo? one for no parent order, and one
with parent order, using , mSize(aSize) syntax? That gives less cycles, IIRC.

Node::OrderInfo* Node::getOrderInfo() could use early returns instead of breaks
probobly->probably
I doubt that we can do foo/@bar/text() anyway. At least for module.

Don't
for (i = 0; i < parent->getChildNodes()->getLength(); ++i) 
instead do the sibling iterator dance.

For the record, on my machine, the veiligheid testcase of bug 95249 speeds up
by a factor of 10 for standalone. *bang*
> the source/xml/dom/mozImpl/MozillaAttr.cpp chunk is not supposed to be there,
> right? (Looks like a nsdoom merge foo)

err, sorry about that, leftovers from ::getChildIndex()

> Could we get two constructors for OrderInfo? one for no parent order, and one
> with parent order, using , mSize(aSize) syntax? That gives less cycles, IIRC.

Actually OrderInfo is just a struct without any constructors these days, so no 
cycles are lost.

> Node::OrderInfo* Node::getOrderInfo() could use early returns instead of
> breaks

hrm, to be honest i don't like either one, but I like breaks better for some 
reason, can't really say why... Would it be ok to keep the breaks?

> probobly->probably

Done

> I doubt that we can do foo/@bar/text() anyway. At least for module.

Actually, it works in both standalone and module. Which makes us fail two xalan 
axes tests, since xalan seems to assume that attributes don't have children.

> Don't
> for (i = 0; i < parent->getChildNodes()->getLength(); ++i) 
> instead do the sibling iterator dance.

Done

> For the record, on my machine, the veiligheid testcase of bug 95249 speeds up
> by a factor of 10 for standalone. *bang*

YAO!! I didn't think that the speedgain would be that big since peterv found 
those swaped-arguments-on-hashing bug. But this is GREAT news!
Comment on attachment 55339 [details] [diff] [review]
address Axels comments

please do the early returns, other than that
r=axel@pike.org
Attachment #55339 - Flags: review+
Comment on attachment 55339 [details] [diff] [review]
address Axels comments

>+/*
>+ * Compares document position of this node relative to another node
>+ */
>+PRInt32 Node::compareDocumentPosition(Node* aOther)
>+{
>+  OrderInfo* myOrder = getOrderInfo();
>+  OrderInfo* otherOrder = aOther->getOrderInfo();
>+  if (!myOrder || !otherOrder)
>+      return -1;

I'd do

+  if (!mOrderInfo)
+      getOrderInfo();
+  OrderInfo* otherOrder = aOther->getOrderInfo();
+  if (!mOrderInfo || !otherOrder)
+      return -1;

to avoid the getOrderInfo() in the most common case.
Your call though, not a showstopper.

>+    mOrderInfo->mSize = parentOrder->mSize+1;
Spaces around +

>+    for (n = 0; n < parentOrder->mSize; n++)
>+        mOrderInfo->mOrder[n] = parentOrder->mOrder[n];
You could just memcpy

>+                mOrderInfo->mOrder[n] = index + kTxChildIndexOffset;
Maybe kTxChildIndexOffset + index?

>+                sorted.add(k+1, node);
k + 1

r=peterv with Pike's and my comments fixed.
Attachment #55339 - Attachment is obsolete: true
Attachment #55042 - Attachment is obsolete: true
I didn't do the

> +  if (!mOrderInfo)
> +      getOrderInfo();

or

> Maybe kTxChildIndexOffset + index?

since i like the current way better. Everything else is done though
Blocks: 92652
Comment on attachment 55487 [details] [diff] [review]
address remaining comments

NodeDefinition.cpp includes string.h, that shouldn't happen.
other than that, r=me
yeah, right, it should.
r=me
- In Node::~Node(), and in NodeDefinition::~NodeDefinition(), in stead of
manually deleting mebemrs of members before deleting the member itself, why not
add a destructor to the member and let it deal with tearing itself down?

- In Node::OrderInfo* Node::getOrderInfo(), you declare a local variable named
'nsID', which is also a the name of a struct in mozilla (nsIID is typedefed to
nsID), so you *might* get warnings from this.

Other than that, sr=jst
Attachment #55487 - Flags: superreview+
checked in... Thanks for reviews people!
er, i guess i should mark it fixed too
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
we didn't verify for a long time.
I really checked, so VERIFIED.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: