Note: There are a few cases of duplicates in user autocompletion which are being worked on.
Bug 208172 (optim_xpath)

Implement optimizable XPath

RESOLVED FIXED

Status

()

Core
XSLT
RESOLVED FIXED
14 years ago
11 years ago

People

(Reporter: sicking, Assigned: peterv)

Tracking

(Blocks: 2 bugs, {perf})

Trunk
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

(URL)

Attachments

(2 attachments, 12 obsolete attachments)

3.57 KB, text/plain
Details
73.99 KB, patch
peterv
: review+
Details | Diff | Splinter Review
This bug is mainly for doing the initial work of optimized XPath. I plan to do
the infrastructure and a couple of easy optimizations to start it off. Some of
the optimizations I plan to do are listed in step 4 of

http://lxr.mozilla.org/mozilla/source/extensions/transformiix/docs/optimized-xpath.html

But I don't plan to do all of them in this bug (and some of them I'm not sure
should be done at all).

Comment 1

14 years ago
What is the timeframe to having a rev1 patch?
hopefully sometime next week
Created attachment 126317 [details] [diff] [review]
patch v1

This patch mainly adds the infrastructure for optimizing expressions and
patterns, not many optimizations are actually implemented yet. I did implement
the following optimizations mainly to test that the infrastructure was working:


1. Optimize expressions like "@foo" to either call 'getAttrNode', 'hasAttr' or
   'getAttr' (returning either a NodeSet, a StringResult or a BooleanResult)
   depending on what what kind of result we're interested in.

2. Optimize expressions like "@foo = 'hi'" to just call getAttr and then
compare
   without even building a StringResult

3. Optimize patterns like 'foo[bar]' to not contain a "real" predicate but
   instead embed the predicate into the txNodeTest.

The last optimization makes opening the testcase in bug 197956 take 30 seconds
instead of 3 minutes (yes folks, that's a factor 6).

Comment 4

14 years ago
More good stuff. Here is how we stand in recent XSLT performance fixes 
on my favorite customer testcase. Time in sec for doTransform(). Linux 
on 800 MHz w/ 1 GB memory. (Hope the table/data formats okay)

Description			Time	%improved
Original			2.59	0%
203229(*) Reuse strings		2.38	8%
205703(*) Reuse ExprResults	2.19	15%
205703(*) Recycle ExprResults	1.90	27%
206338(*) txStack tweak		1.87	28%
205810 LocalName lookup		1.74	33%
207377 endHTMLElement		1.62	37%
208172 Xpath1			1.53	41%

Comment 5

14 years ago
Comment on attachment 126317 [details] [diff] [review]
patch v1

Getting familiar with this. From the first glimpse, I know that I would like to
get my Parser/Lexer stuff done first. And let you handle the merge conflicts,
cause there are going to be a few. Looking at our patches, we should get a bug
for "review xpath again". We definitly need to get that done, IMHO.
I like the mods of giving the resulttype to the expr parsing.
I'm not so sure about the unprivatising of alot of members. Gotta have an
indepth look, but friend may be enough.

TX_DECL_TXAXPATHWALKCALLBACK is unreadable ;-). I guess I don't like
walkedExpr, I'd favour handleExpr and friends, I guess.
The stuff in MozillaElement.cpp looks really odd.

re the numbers, which part has that impact on the pretty print stylesheet?
What means (*)? landed? Guess so, is the non-landed stuff incremental or each
patch alone against the trunk?

Comment 6

14 years ago
Axel, yes, (*) is landed. Only this bug/patch (208172) provides huge 
improvement against pretty-print (197956). The times in my table are for the 
large bank application testcase that I've been working. All the times are 
incremental moving down through the list. Since the patches are independant 
from one another, subtraction gives good estimate of each patch alone. 
Microsoft XSLT does the testcase in about 0.6 sec, so althought good progress 
has been made, much more still needs to be done. (and if I read the 
pretty-print data right, we're still about 2X slower on that one too).
Sam
Upps, ment to cut out part of the stuff in MozillaElement.cpp, basically there
are two ways to implement Element::getAttrNode and I wanted to test which is
faster. I'll consider removing the #if 0-ed part the first reviewcomment ;-)

I don't like using friends since that'll defeate the purpose of having a generic
callback. We'd also end up adding a new friend to a pile of classes every now
and then.

Filing a separate bug for making XPath members into nsAutoPtr (and possibly
prefixing them with 'm') seems like a good way to get rid of the conflicts, I'll
whip up a patch right away.

why is TX_DECL_TXAXPATHWALKCALLBACK unreadable? What don't you like about
walkedExpr?

The part that affects the prettyprint stylesheet is my #3 optimization in
comment 3. That makes us take the |isEmpty()| shortcut in
txStepPattern::matches, which saves us a ton of time.
Status: NEW → ASSIGNED

Comment 8

14 years ago
> why is TX_DECL_TXAXPATHWALKCALLBACK unreadable? What don't you like about
> walkedExpr?

TXAXPATHWALKCALLBACK consists of 5 different mnemonics or words (counting both
tx and A, as both hold information, IMHO), without an indication of where the
boundaries are in the macro. TX_DECL_A_XPATH_WALK_CALLBACK is much more like it.
(I contracted the two tx into one.)

iterateSubItems(txAXPathWalkCallback* aCallback) is the signature of the entry
point. You make the Exprs implement some kind of walker, but this should be clear.
I don't like having a "walked" as it is indicating an order of walking and 
calling. You call a WalkCallback after you walked it. Sounds like a bug.
Anyway, the method name and it's callback should fit. I guess finding a good
name is hard, but we should come up with one. There is no such thing as an
item in XPath, for example. I don't have a good idea right now. But I know that
there are at least two ends of that naming that I consider bad.

Note that the comment to that method in Expr.h should make clear what nsresults
are expected as well as who is supposed to call into the callback. Esp. who
calls the callback for this.

Is the original developer really IBM? It may be the original copyright holder,
but IMHO, the original developer is you. Your name doesn't show up at all, which
is a bad idea for people that don't know "blame sicking".

Comment 9

14 years ago
> The part that affects the prettyprint stylesheet is my #3 optimization in
> comment 3. That makes us take the |isEmpty()| shortcut in
> txStepPattern::matches, which saves us a ton of time.

Looking at the construction of the context nodeset I see quite some hefty
iteration over the siblings for stuff like
<xsl:template match="*[node()]">
(collect all children for all children, smells like O(n^2) just for this.)
Seems like we're not up for surprises, huh ;-)?
TX_DECL_TXAXPATHWALKCALLBACK follows the naming convention for all mozilla
interfaces so i'd prefer to stick to that.

I still don't understand what issues you have with
iterateSubItems/txAXPathWalkCallback, is it just the names of the
methods/interfaces?

The licenceplate is what IBM wants me to write, so neither of us have a saying
in that.
Depends on: 210528

Comment 11

14 years ago
ouchouch. Bad me.
I prolly should have reviewed the design doc before sicking had a chance to do
code.

Most of the optimization classes handle one case, the evaluation of a step does 
not depend on context node size.
In this case, we don't need to evaluate the complete nodeset of the nodetest.

(I see that the union case is different, maybe others, too. But I don't think
that any of that is in the current patch.)

Now, sicking tried to deal with it by creating a few classes that special case
specific versions of this situation, which, AFAICT, is not neccessary.
Instead, we should iterate over the axis, test the nodetest and conditionally
feed the node and it's proximity position into a handler. That handler should 
return (prolly an PRUint32) a bitmask indicating GoodOrBad and StopOrContinue. 
At least, that's the cases I see. This particular handling even works well for
piping within predicates. And it seems much more straight forward to me than
to search for special cases.
I need a bit of emphasis that predicates work like pipe or on a nodeset
dependent on their context sensitivity. LocationSteps should be able to act as
both. Within a PathExpr, they should prolly be pipe-things. So they work as 
pipe, and as soon as they end up with a predicate that needs the context-size,
they collect the full nodeset.
Note that the optimization to getAttributeNode is not helping more than a 
go-over-attribute-axis-and-stop-on-match, as that is what GetAttribute in
the mozilla DOM does anyway. (And so does standalone.)

So right now, I only see one optimized txOptimLocationStep that doesn't generate
nodesets but instead feeds the nodes and their proximity position to a handler.
Note that one of these handlers could be a nodeset, gathering up the content.

Another thought or two need to go to how the optimisation of creating multistep
simple paths like foo/var/bar would go. Maybe we could add a "blind" bit on
the resulting nodesetsink that just adds instead of appends.
That could be prolly done whenever the last step is a one-depth axis, eg. child
or attr. Right?


One the actual code:
The optimizer goes over the hierarchy too often, IMHO. You walk the complete 
tree for the sensitivity stuff and the optimization and for the optimizeStep.
AFAICT.
The sensitivity shouldn't be gotten completely, I'd say. So instead of having
virtual ContextSensitivity getContextSensitivity() = 0;
I'd prefer a 
virtual PR_Bool isContextSensitive(ContextSensitivity aMask) = 0;
That way, once we know we are sensitive, we don't have to check the rest, for
example in function arguments or predicates or so.

I'm not too happy with the way we hop over the parsed expression. Maybe we 
should do this more like in txStylesheetCompiler, with 
startExpr(aExpr, aType), endExpr() or so. Not sure how and if we wanna or need
to jump ship when going over to the nodetests. Though, AFAICT, we don't ever
need to optimize those if they just work as piping predicates. Which they seem
to be to me.

Sorry, Jonas. Really. I don't like to spoil the fun.
Regarding predicates:
The main optimization from predicates comes from the fact that once all
predicates are removed from a Step we can apply further optimizations on it.

For example the prettyprint stylesheet got 6x faster simply because when we do
matching in txStepPattern IsEmpty will return true and we'll return early.

We'll also be able to optimize UnionExprs and PathExprs containint '//' once
predicates are 'removed' from a step (both of these are explained in my
optimization doc).

What you are talking about is approximatly the "Predicates that are only
context-list-position sensitive and not context-list-size sensitive" class. I
havn't prioritezed that though since it would only help for predicates that
depend on the context-list which is pretty rare.


I like the getContextSensitivity vs. isContextSensitive suggestion, I'll look
into if that is doable.


I don't really understand the startExpr/endExpr suggestion? There are no
nodesets at this point, the execution of the stylesheet hasn't started yet.

Comment 13

14 years ago
> The main optimization from predicates comes from the fact that once all
> predicates are removed from a Step we can apply further optimizations on it.

This should be independent of how we implement an optimisation. If we create an
almost-a-step, or if we use a step-with-better-predicates should not make a 
difference.

Regarding "//", it might be a bad idea to have PathExpr implement that instead
of LocationStep. As axes and '//' overlap in their source tree usage, putting
them into separate objects could be regarded as design bug.
Predicates don't change the axis of the LocationStep, so they shouldn't make any
difference. I'm not sure that we are required to hork the context nodeset in 
this scenario, but, as Humptydumpty said, "I'd rather see that done on a piece
of paper."

From what I can tell, proximity-poisition-dependency is only relevant for 
patterns, which I hadn't looked at patterns at all then. Looking at it now,
creating the context nodeset tailored to the predicate in question looks
much more promising than the txPredicatedNodetest.

From a design perspective, I would much rather like to see the individual
ExprParser methods do the optimisations, or call into optimisation routines,
than to have a separate class walk all over the final beast and disassemble and
reassemble their innards.
(Which gets rid of the start- and endExpr foo, which was talking about nodetests
instead of nodesets. Which is the only way for me to make sense of your last
statement).

Comment 14

14 years ago
Permit me to add my perspective. When Jonas and I started on fixing 
XSLT performance, it was about 6X slower than Microsoft XSLT. With the 
changes he and I made, now it is only (cough, cough) twice as slow. If 
and when further improvements provide performance parity with the 
Microsoft engine, then 'design astetics', in itself, can have higher 
value. However, at this point in time, faster code is goodness. Designs 
or implementations resulting in slower code are unacceptable. Design or 
implementation changes which are performance-neutral, can be 
accomodated to the extent they do not impede further work and progress 
in XSLT performance improvement. 
Sam Emrick, Senior Software Engineer, IBM Corp
(Assignee)

Comment 15

14 years ago
Let me add my perspective then. We managed to improve performance by at least a
factor 13 since we started working on performance (and that is before Jonas
started working for IBM). I think it is very much the module owner, his peers
and other code contributors who decide what happens in a module, however
impatient you are. Code speaks louder than words, elegant code even more so and
until today I haven't seen a lot of either from you. Axel and Jonas on the other
hand have been very good at good design and at following it up with code that
improves performance, I don't see why they can't continue doing so and I hope
they won't give up the design discussions. We really don't need hyperboles about
aesthetics in a bug report.

Comment 16

14 years ago
Calm down, Peter, we 'are on the same side' here. (to help me in this 
regard, I took lunch before responding. I am glad that I did that.) 

I think we mostly in violent agreement here. I believe we all agree, 
the Transformiix XSLT engine used by Mozilla started life severely  
performance-challenged. Also we can agree, much work has been done, by
many persons, to improve that situation. I hope we also agree, although
much progress has been made, performance is still a problem, and more
performance improvement is still desirable. I hope we can agree that 
anything regressing performance should be carefully deliberated. I 
think there is nothing I said above that is cause for much debate, yes?

Standard corporate disclaimer needed here. I do not speak for nor 
represent the IBM corp. Personal opinion wise, we hired Jonas because 
we felt it was important to improve Mozilla XSLT performance, and this 
was a way for us to directly support that on-going effort. It afforded 
a key person in that effort the ability to more focus on it, and it 
brought that skill to where he has access to better performance tools 
and expert performance skill. That's all. There's no deeper agenda 
here.  

I don't do 'elegant code'. My skill is to tease out the critical bits 
of code most driving the performance equation. For any software bundle, 
in the gobs and gobs of code, only some small pieces of that are 
critical to performance. Against these bits, I produce ideas and code 
hacks that do better. Faced with improvement numbers and an ugly hack, 
then you 'elegant code guys' will go and do the right thing. Meanwhile 
I am off chasing the next culprit. I don't need to 'follow up', because 
I have 'handed off'... and I'm confident in y'all handling it after 
that. I apologize if that somehow felt to you as lack of follow-up.          

Right now, I'm off xslt, at least for the moment, working on fixing 
linux mozilla. Xslt is in reasonable shape there, but for reasons I am 
still ferreting-out, layout is in the dumpers. My goal is to produce 
some hacks with that too. The challenge will more be to find some 
people who care about linux mozilla... in the way, and as much as, that  
y'all care about xslt. Its been good working with you. Keep up the good 
work and make it even faster. Do not accept ideas that make slow code.




Comment 17

14 years ago
more comments on the patch:
+    Expr::ResultType getReturnType(); \

is it Result or Return type? Not sure if this should be IsOfType.

Please implement this as a baseclass whenever possible. 
class LocationStep : public PredicateList, public txNodeSetExpr 
instead of
class LocationStep : public PredicateList, public Expr 

That makes it more apparent and should reduce code size.

enum ExprType is not really a type, but the class.

Comment 18

14 years ago
On to more:
I like the concept of txPredicatedNodeTest. But it should be able to deal with 
more than just one Predicate. [@foo][achildnode] should work, too.
Something like
class txPredicateHelper
{
public:
    nsAutoPtr<Expr> mExpr;
    nsAutoPtr<txPredicateHelper> mNext;
}

class txPredicatedNodeTest : public txNodeTest, public (or private)
txPredicateHelper
{
}

Have a add method for the predicates, and don't use nsAutoPtr<>, only refs.
I wonder if that part would compile on gcc, as you use a pointer as argument to 
a nsAutoPtr<>. That does not work in function calls, I wonder if it would work
in constructors.

On the optimizer. There are optimisations that want to go bottom-to-top, like
the txPredicatedNodeTest, but there are other optimisations that want to go
top-to-bottom, like finding the topmost expression that is constant in a 
predicate, which would make sense to be evaluated upfront. Not sure if it is
able to handle both.
Something like foo[hi[contains(., substring($globalString,
string-length($globalVar))]] 
should probably factor the substringcall, but not the string-length call.
(Yikes example, I hope you get the idea)

Given the state of both your and my mind, could we rip out that part of the 
patch and just hardcode the predicate optimisation into ExprParser? I'd love 
that, it'd be cheaper and we could take a bit more time to raise a fuzz about
the other stuff.

oops ;-)
\ No newline at end of file

Two dreams are open for other patches, I guess.
1st, I'd like to have the txNodeTest be able to hint to stop further evaluation. 
That'd be soo cool for foo[1].
And I would like to be able to keep the proximity positions in the 
txPredicatedNodeTest. But that is a tricky lil' bastard.
> more comments on the patch:
> +    Expr::ResultType getReturnType(); \
> is it Result or Return type? Not sure if this should be IsOfType.

I'll rename it getResultType and check if i can go with something like IsOfType.

> enum ExprType is not really a type, but the class

Not sure what you mean with this, we call similar things type elsewhere. Would
you prefer ExprClass?

Yes, the optimizer can handle both bottom-to-top and top-to-bottom
optimizations, to do top-to-bottom you just perform the optimization on the
expresion before walking it's subexpressions.

In the tree I have here i've already changed the optimizer to deal with multiple
predicates, it's as easy as chaning a |if| to a |while|.
Created attachment 132238 [details] [diff] [review]
real patch v1

Ok, this one is ready for reviews. The following is different from the previous
patch:

* Implemented a few more optimizations:
  * Optimizes as many predicates as possible into the nodetest for steps
  * Don't evaluate righthandside expression of a RelationalExpr if the
    lefthandside is an empty nodeset
  * Optimize ".//foo" into "./descendant::foo"
  * Optimize ".//.[@bar]" into "./descendant-or-self::node()[@bar]"
      (probably not very common but it only required 4 extra lines of code)
  * Optimize "./*" into "*"

* Changed the flags in Expr a bit so that you don't need to test if an
  expression depends on either contextnode or contextdocument, a single
  value will do now. Same for context nodeset.
Attachment #126317 - Attachment is obsolete: true
Comment on attachment 126317 [details] [diff] [review]
patch v1

Oh, I also implemented two of Pikes suggestions, Expr now have |PRBool
isSensitiveTo(ContextSensitivity)| and |PRBool canReturnType(ResultType
aType)|.
Attachment #132238 - Flags: superreview?(peterv)
Attachment #132238 - Flags: review?(axel)
Alias: optim_xpath
btw, canReturnType should possibly be renamed canReturnResultType.

Comment 23

14 years ago
> * Implemented a few more optimizations:
>   * Optimizes as many predicates as possible into the nodetest for steps

with recursion, bah. That's hard to debug and you dive thru N virtual
function calls for N predicates without a single test.

>   * Don't evaluate righthandside expression of a RelationalExpr if the
>     lefthandside is an empty nodeset

That's independent of the essence of this bug, right?

>   * Optimize ".//foo" into "./descendant::foo"
>   * Optimize ".//.[@bar]" into "./descendant-or-self::node()[@bar]"
>       (probably not very common but it only required 4 extra lines of code)
>   * Optimize "./*" into "*"

I'm not convinced that fixing up stylesheet errors is worth the effort or
that it should be done in a browser.

txNamedAttributeStep should still go.

iterateSubItems is in no way related to its callback walkedExpr, that these
are paired is unnecessarily hard to find out and thus to maintain.

The optimisation that is real here is the txPredicatedNodeTest.
Walking over the complete expression to get this is wrong. Every other
optimisation that may or may not need that needs to show that it is worth
that cost for me first. In the context of all the nits we have perf-wise,
walking (or was it iterating?) over each part of all generated expressions
and patterns is just out of scope.

Updated

14 years ago
Attachment #132238 - Flags: review?(axel) → review-
> > * Implemented a few more optimizations:
> >   * Optimizes as many predicates as possible into the nodetest for steps
> with recursion, bah. That's hard to debug and you dive thru N virtual
> function calls for N predicates without a single test.

Since i've never seen a realworld stylesheet use more then one predicate i
didn't think it was very important to optimize for this. However since it's so
easy to change an |if| to a |while| i figured it was worth it.

> >   * Optimize ".//foo" into "./descendant::foo"
> >   * Optimize ".//.[@bar]" into "./descendant-or-self::node()[@bar]"
> >       (probably not very common but it only required 4 extra lines of code)
> >   * Optimize "./*" into "*"
> I'm not convinced that fixing up stylesheet errors is worth the effort or
> that it should be done in a browser.

I care about optimizing realworld stylesheets, nothing else. If people write
unoptimal stylesheets we'll have to live with it and still try to execute them
as fast as we can. And I would say that expressions like '//foo' is fairly
common. The "./*" is probably not too common though I've seen it on more then
one occation. However ".//foo" is fairly common which gets optimized to
"./descendant::foo" which is why i thought it was worth to add the '.'
optimization so that that gets optimized into just "descendant::foo"

> iterateSubItems is in no way related to its callback walkedExpr, that these
> are paired is unnecessarily hard to find out and thus to maintain.
I have no problem with renaming these if you have better names.

The reason i'm optimizing by walking over the expression is many:

First off I want to keep optimization separate from parsing to keep the code
clear and reduce the risk of regressions.

Second there are some optimizations we won't be able to do during parsing, such
as inline global variables.

Third, we will need the infrastructure to iterate over parsed expressions anyway
to be able to optimize variables, local as well as global.


To do the last to optimizations i'm going to move the call to
txXPathOptimizer::optimize out of the parser and into a
post-stylesheet-compilation step where we optimize all expressions as well as
the stylesheet itself (for example resolve crossreferences such as templatenames).
(Assignee)

Comment 26

14 years ago
I'll take a look at this soon and blurt out some module-owner opinions. In
general I like the idea of an external optimizer.

Comment 27

14 years ago
> Third, we will need the infrastructure to iterate over parsed expressions anyway
> to be able to optimize variables, local as well as global.

What is this about? I only find
http://lxr.mozilla.org/seamonkey/source/extensions/transformiix/docs/optimized-xpath.html#usecase18
All of the optimizations in the optimized xpath doc that requires knowledge of
what datatype a subexpression returns (for example 3), or what datatype is used
by a parentexpression (for example 10, 11, 12) can't be done for variables at
parsetime.
Blocks: 197956
Comment on attachment 132238 [details] [diff] [review]
real patch v1

rerequesting review since we're closing in on 1.6b. I also think that Pikes and
my optimizations are at least slighly orthogonal and should both be done.
Attachment #132238 - Flags: review- → review?(axel)
Created attachment 135960 [details] [diff] [review]
sync to tip

This syncs to tip and the post-walker world. I also made one modification: I
put back getReturnType but kept an inlined canReturnType. This way the code got
slightly cleaner and we have the ability to get the exact returntype in case we
need it.
Attachment #132238 - Attachment is obsolete: true
Attachment #132238 - Attachment is obsolete: false
Attachment #132238 - Flags: superreview?(peterv)
Attachment #132238 - Flags: review?(axel)
Attachment #132238 - Attachment is obsolete: true

Comment 31

14 years ago
re the .//bar -> descendant::bar, I investigated the docbook stylesheet a bit.
On the trunk, xsl:number is *not* the primary suspect, but .//bar is. I created
a stylesheet version of that optim and got a speedup by a factor of about 2,
for both module and standalone.

Appearantly, we should fix stupid stylesheets. Sigh.
Attachment #135960 - Flags: superreview?(peterv)
(Assignee)

Comment 32

14 years ago
(In reply to comment #28)
> All of the optimizations in the optimized xpath doc that requires knowledge of
> what datatype a subexpression returns (for example 3), or what datatype is used
> by a parentexpression (for example 10, 11, 12) can't be done for variables at
> parsetime.

Could you explain this better, preferably with an example?
XPath that requires knowledge of what datatype a subexpression returns:

In some cases we can't perform an optimization without knowing what datatype a
subexpression returns. One example of this is usecase 3;

a[@href]

In this case we want to be able to turn the predicate into a nodetest. That way
further optimizations can do faster patternmatching. I.e. the |if| statment at
http://lxr.mozilla.org/mozilla/source/extensions/transformiix/source/xslt/txXSLTPatterns.cpp#478
would return true.
However we can't do that if the subexpression might return a number. I.e the
pattern/expression. I.e. we need to check with the '@href' expression if it can
return a number. If it can we must keep it as a predicate. So for expressions like;

a[$var]

we won't know at parsetime what datatype '$var' is. This is especially true if
$var is a global variable.


XPath that requires knowledge of what datatype a parent expression uses:

Some expressions can be optimized if we know how the resulting value will be
used. For example in usecase 12

starts-with(processing-instruction(), 'hello')

We could evaluate 'processing-instruction()' Step faster if we knew that we only
care about the stringvalue of the result. We would only need to find the first
node that matches the NodeTest and get its stringvalue and return that.

Again, we can't do that at parsetime when variables are involved. For example:
<xsl:variable name="var" select="processing-instruction()"/>
<xsl:if test="starts-with($var, 'hello')"/>

while parsing the 'processing-instruction()' expression we won't have any idea
how it's used. This would have to be done after the entire stylesheet is parsed.


My point with comment 25 was that some optimizations can't be done at parsetime,
but needs to be done after the entire stylesheet has been parsed.

I don't remember if any of these optimizations are done in this patch (I don't
think they are), but that was one of the reasons i wrote the code the way I did.
actually, the code does perform both types of those optimizations, i.e.
optimizations where we need to know what type a subexpression returns or where
we need to know how a parentexpression will use the resulting object.

However i don't think that it does any optimizations where it can't figure out
those things at parsetime (albeit with some hassle). In other words it doesn't
do any of those optimizations on variables currently.

Comment 35

14 years ago
Created attachment 145090 [details]
alternative proposal

Peter asked me to attach this.
This lays out how I envision a simpler way to do xpath optim and why I think
that 
simpler is better.
(Assignee)

Comment 36

13 years ago
Created attachment 170954 [details] [diff] [review]
sync to tip
Attachment #135960 - Attachment is obsolete: true
(Assignee)

Comment 37

13 years ago
Created attachment 170957 [details] [diff] [review]
alternative v1
(Assignee)

Comment 38

13 years ago
* trunk

Size: 497964

chess-fo/chess.xsl	591.94 ms	20126 ms	13.43
docbook-xsl/html/docbook.xsl	6227.29 ms	43591 ms	58.87
jenitennison/page.xsl	44.68 ms	24439 ms	3.19
jenitennison/markme.xsl	46.38 ms	8627 ms	0.97
schematron/schematron-basic.xsl	15.27 ms	12445 ms	0.49
schematron/wai.xsl	4.03 ms	13901 ms	0.31
spec-html/xmlspec.xsl	749.61 ms	17241 ms	37.69
xsltdoc/xsltdoc.xsl	282.93 ms	36215 ms	2.23
mathml/mathmlc2p.xsl	24.58 ms	19173 ms	2.22
dsd/article2html.xsl	15.61 ms	14904 ms	1.30
nitf/nitf-stylized.xsl	10.12 ms	9942 ms	0.39
rdf/rdft.xsl	4.47 ms	10972 ms	0.50
recipes/recipes.xsl	14.56 ms	12139 ms	0.51
sp.xsl/sp.xsl	8.59 ms	21694 ms	0.58
topic-xtm/cogx2xtm.xsl	14.55 ms	10988 ms	0.64

Total: 276397 ms


* sicking

Size: 552960 (+11.04%)

chess-fo/chess.xsl	602.47 ms	20484 ms	17.13
docbook-xsl/html/docbook.xsl	3335.57 ms	23349 ms	46.93
jenitennison/page.xsl	44.37 ms	24269 ms	3.24
jenitennison/markme.xsl	46.58 ms	8664 ms	1.12
schematron/schematron-basic.xsl	14.74 ms	12015 ms	0.61
schematron/wai.xsl	3.46 ms	11947 ms	0.50
spec-html/xmlspec.xsl	794.61 ms	18276 ms	9.60
xsltdoc/xsltdoc.xsl	212.31 ms	27176 ms	3.30
mathml/mathmlc2p.xsl	22.05 ms	17196 ms	16.31
dsd/article2html.xsl	14.97 ms	14297 ms	0.54
nitf/nitf-stylized.xsl	10.33 ms	10141 ms	0.51
rdf/rdft.xsl	3.78 ms	9281 ms	0.45
recipes/recipes.xsl	13.36 ms	11139 ms	0.50
sp.xsl/sp.xsl	8.59 ms	21688 ms	0.65
topic-xtm/cogx2xtm.xsl	14.49 ms	10943 ms	0.64

Total: 240865 ms (-12.85%)


* alternative

Size: 548864 (+10.22%)

chess-fo/chess.xsl	584.91 ms	19887 ms	14.13
docbook-xsl/html/docbook.xsl	3075.86 ms	21531 ms	30.28
jenitennison/page.xsl	43.53 ms	23812 ms	3.00
jenitennison/markme.xsl	45.44 ms	8451 ms	0.78
schematron/schematron-basic.xsl	14.41 ms	11746 ms	0.51
schematron/wai.xsl	3.30 ms	11397 ms	0.46
spec-html/xmlspec.xsl	776.26 ms	17854 ms	173.92
xsltdoc/xsltdoc.xsl	200.17 ms	25622 ms	2.50
mathml/mathmlc2p.xsl	21.31 ms	16620 ms	15.12
dsd/article2html.xsl	14.51 ms	13858 ms	0.55
nitf/nitf-stylized.xsl	10.10 ms	9919 ms	0.39
rdf/rdft.xsl	3.38 ms	8306 ms	0.49
recipes/recipes.xsl	13.17 ms	10982 ms	0.45
sp.xsl/sp.xsl	8.47 ms	21379 ms	0.55
topic-xtm/cogx2xtm.xsl	13.85 ms	10460 ms	0.63

Total: 231824 ms (-16.12%)


I'll try out pretty-printing too.
I do like the use of evaluateToBoolean/evaluateToString, but couldn't it be done
separatly to this bug? Or does it need some of the stuff that are added to the
Expr 'interface'? I'm ok with doing it all in one though since it's all together
now.

I don't like inserting optimizor stuff in the parser though. IMHO it complicates
something that is complicated enough as it is, and it's going to get worse the
more optimizations we add. Also, some optimizations can't be done until
surrounding variables are parsed, for example constant expansion. I'd much
rather change the way that the txXPathOptimizer walks over the created
expressions since that's what it seemed like you guys disliked the most.

Updated

13 years ago
Blocks: 265212

Updated

13 years ago
Blocks: 284708
(Assignee)

Comment 40

12 years ago
Created attachment 205252 [details] [diff] [review]
sync to tip
Attachment #170954 - Attachment is obsolete: true
(Assignee)

Comment 41

12 years ago
Created attachment 205253 [details] [diff] [review]
alternative (sync to tip)
Attachment #170957 - Attachment is obsolete: true
Created attachment 205687 [details] [diff] [review]
New and improved
Attachment #205252 - Attachment is obsolete: true
Attachment #205687 - Flags: review?
Comment on attachment 205687 [details] [diff] [review]
New and improved

This implements what peterv and I have talked about.
Attachment #205687 - Flags: review? → review?(peterv)
(Assignee)

Updated

12 years ago
Attachment #135960 - Flags: superreview?(peterv)
Comment on attachment 205687 [details] [diff] [review]
New and improved

I just noticed i forgot to remove the Type enum from txNamedAttributeStep. It's unused though so I won't attach a new patch just to get it out of the header.
(Assignee)

Comment 45

12 years ago
Comment on attachment 205687 [details] [diff] [review]
New and improved

>Index: base/txList.cpp
>===================================================================

> /**
>+ * Replaces the Object last returned by the next() or previous() methods
>+ * with a new Object.
>+ * If the iterator is outside the list null is returned.
>+ * @param aObjPtr new Object to put in the list
>+ * @return the old Object pointer
>+ */

No need for this in the implementation file, it's in the header.

>+} //-- remove

Ugh, don't add more of these silly comments.

>Index: base/txList.h
>===================================================================

>+    inline PRBool isEmpty()
>+    {
>+        return !itemCount;

I prefer == 0.

>+    void* replace(void* aObjPtr);

Hmm, you could add replace(PRUint32, void*) in txList instead? (you always advance before replacing)

>Index: xpath/txExpr.h
>===================================================================

>+    enum ExprType {

>+        UNKNOWN_EXPR

Maybe OTHER_EXPR?

>+    virtual Expr::ResultType getReturnType() = 0;

s/Expr:://

>+     * Returns true if this expression is sensitive to *any* of
>+     * the requested flags.

"to any of the requested contexts in aContexts."

>+     */
>+    virtual PRBool isSensitiveTo(ContextSensitivity) = 0;

Give a name to the argument (eg aContexts).

>+#define TX_DECL_EXPR_BASE \
>+    nsresult evaluate(txIEvalContext* aContext, txAExprResult** aResult); \
>+    Expr::ResultType getReturnType(); \

s/Expr:://

>+    PRBool isSensitiveTo(ContextSensitivity)

Give a name to the argument (eg aContexts).

>+#define TX_DECL_OPTIMIZABLE_EXPR \
>+    TX_DECL_EXPR; \
>+    Expr::ExprType getType()

s/Expr:://

>+#define TX_IMPL_EXPR_STUBS_1(_class, _ReturnType, _Expr1)     \
>+TX_IMPL_EXPR_STUBS_BASE(_class, _ReturnType)                  \
>+Expr*                                                         \
>+_class::getSubExprAt(PRUint32 aPos)                           \
>+{                                                             \
>+    if (aPos == 0) {                                          \
>+      return _Expr1;                                          \

Four-space indentation. There's other changes in this file that need their indentation changed.

>+#define TX_IMPL_EXPR_STUBS_2(_class, _ReturnType, _Expr1, _Expr2) \
>+TX_IMPL_EXPR_STUBS_BASE(_class, _ReturnType)                  \
>+Expr*                                                         \
>+_class::getSubExprAt(PRUint32 aPos)                           \
>+{                                                             \
>+    switch(aPos) {                                            \
>+      case 0:                                                 \
>+        return _Expr1;                                        \
>+      case 1:                                                 \
>+        return _Expr2;                                        \
>+      default:                                                \
>+        break;                                                \
>+    }                                                         \
>+    return nsnull;                                            \
>+}                                                             \

      default:
        return nsnull;
      }
}

>+    PRBool argsSensitiveTo(Expr::ContextSensitivity);

Give a name to the argument (eg aContexts).
s/Expr:://


>@@ -227,54 +395,66 @@ public:

>+    /**
>+     * Returns the type of this expression.

nodetest?

>+     */
>+    enum NodeTestType {

>+    virtual NodeTestType getType()
>+    {
>+      return UNKNOWN_TEST;

OTHER_TEST?

> #define TX_DECL_NODE_TEST \
>-    TX_DECL_NODE_TEST_BASE; \
>-    void toString(nsAString& aDest)
>-#endif
>+    TX_DECL_TOSTRING \
>+    PRBool matches(const txXPathNode& aNode, txIMatchContext* aContext); \
>+    double getDefaultPriority(); \
>+    PRBool isSensitiveTo(Expr::ContextSensitivity aContext); \
> 

Drop the last \

>+    txPredicatedNodeTest(txNodeTest* aNodeTest,
>+                         Expr* aPredicate);

Put this on one line.

>+class txNamedAttributeStep : public Expr

>+private:
>+    PRInt32 mNsID;

mNamespace, be consistent with the other classes.

>+class txLocalNameFilter : public Expr
>+{
>+public:
>+    txLocalNameFilter(nsAutoPtr<Expr> aExpr, nsAutoPtr<Expr> aLocalName);

References please.

>Index: xpath/txFilterExpr.cpp
>===================================================================

>+PRBool
>+FilterExpr::isSensitiveTo(Expr::ContextSensitivity aContext)

s/Expr:://

>Index: xpath/txFunctionCall.cpp
>===================================================================

>+PRBool
>+FunctionCall::argsSensitiveTo(Expr::ContextSensitivity aContext)

s/Expr:://

>Index: xpath/txLiteralExpr.cpp
>===================================================================

>+Expr::ResultType
>+txLiteralExpr::getReturnType()
>+{
>+    switch (mValue->getResultType()) {
>+        case txAExprResult::NODESET:
>+            return NODESET_RESULT;
>+
>+        case txAExprResult::BOOLEAN:
>+            return BOOLEAN_RESULT;
>+
>+        case txAExprResult::NUMBER:
>+            return NUMBER_RESULT;
>+
>+        case txAExprResult::STRING:
>+            return STRING_RESULT;
>+
>+        case txAExprResult::RESULT_TREE_FRAGMENT:
>+            return RTF_RESULT;
>+    }
>+
>+    NS_NOTREACHED("how'd we get here?");
>+    return ANY_RESULT;
>+}

Would you be willing to switch this to a static array, using mValue->getResultType() as index (see my alternative patch)?

>+PRBool
>+txLiteralExpr::isSensitiveTo(Expr::ContextSensitivity aContext)

s/Expr:://

>Index: xpath/txLocationStep.cpp
>===================================================================

>+Expr::ExprType
>+LocationStep::getType()
>+{
>+  return mAxisIdentifier == ATTRIBUTE_AXIS ?
>+      LOCATIONSTEP_ATTRIBUTE_EXPR :
>+      LOCATIONSTEP_OTHER_EXPR;

Line this up correctly (with mAxisIdentifier).

>+PRBool
>+LocationStep::isSensitiveTo(Expr::ContextSensitivity aContext)

s/Expr:://

>Index: xpath/txMozillaXPathTreeWalker.cpp
>===================================================================

> PRBool
>+txXPathTreeWalker::moveToNamedAttribute(nsIAtom* aLocalName, PRInt32 aNSID)
>+{
>+    if (!mPosition.isContent()) {
>+        return PR_FALSE;
>+    }
>+
>+    PRUint32 total = mPosition.mContent->GetAttrCount();
>+    PRInt32 namespaceID;
>+    nsCOMPtr<nsIAtom> name, prefix;
>+    PRUint32 index;
>+    for (index = 0; index < total; ++index) {
>+        mPosition.mContent->GetAttrNameAt(index, &namespaceID,
>+                                          getter_AddRefs(name),
>+                                          getter_AddRefs(prefix));
>+
>+        if (namespaceID == aNSID && name == aLocalName) {

API changed, needs to become:

    PRUint32 total = mPosition.mContent->GetAttrCount();
    PRUint32 index;
    for (index = 0; index < total; ++index) {
        const nsAttrName* attrName = mPosition.mContent->GetAttrNameAt(index);
        if (attrName->Equals(aLocalName, aNSID)) {

>Index: xpath/txMultiplicativeExpr.cpp
>===================================================================

>+PRBool
>+MultiplicativeExpr::isSensitiveTo(Expr::ContextSensitivity aContext)

s/Expr:://

>Index: xpath/txNamedAttributeStep.cpp
>===================================================================

>+nsresult
>+txNamedAttributeStep::evaluate(txIEvalContext* aContext,
>+                               txAExprResult** aResult)
>+{
>+    *aResult = nsnull;
>+
>+    nsresult rv = NS_OK;
>+    const txXPathNode& node = aContext->getContextNode();

No need for this variable.

>+
>+    nsRefPtr<txNodeSet> nodes;
>+    rv = aContext->recycler()->getNodeSet(getter_AddRefs(nodes));

Declare rv here.

>+    NS_ENSURE_SUCCESS(rv, rv);
>+
>+    txXPathTreeWalker walker(node);
>+    if (walker.moveToNamedAttribute(mLocalName, mNsID)) {
>+        rv = nodes->append(walker.getCurrentPosition());
>+        NS_ENSURE_SUCCESS(rv, rv);
>+    }
>+    *aResult = nodes;
>+    NS_ADDREF(*aResult);

    NS_ADDREF(*aResult = nodes);

>+PRBool
>+txNamedAttributeStep::isSensitiveTo(Expr::ContextSensitivity aContext)

s/Expr:://

>Index: xpath/txNodeSetFunctionCall.cpp
>===================================================================

>+Expr::ResultType
>+NodeSetFunctionCall::getReturnType()
>+{
>+    switch (mType) {
>+        case COUNT:
>+        case LAST:
>+        case POSITION:
>+        {
>+            return NUMBER_RESULT;
>+        }
>+        case ID:
>+        {
>+            return NODESET_RESULT;
>+        }
>+        case LOCAL_NAME:
>+        case NAME:
>+        case NAMESPACE_URI:
>+        {
>+            return STRING_RESULT;
>+        }
>+    }
>+
>+    NS_NOTREACHED("how'd we get here?");
>+    return ANY_RESULT;
>+}

Static array and mType as index?

>+
>+PRBool
>+NodeSetFunctionCall::isSensitiveTo(Expr::ContextSensitivity aContext)

>Index: xpath/txNumberFunctionCall.cpp
>===================================================================

>+PRBool
>+NumberFunctionCall::isSensitiveTo(Expr::ContextSensitivity aContext)

s/Expr:://

>Index: xpath/txPathExpr.cpp
>===================================================================

> nsresult
> PathExpr::evaluate(txIEvalContext* aContext, txAExprResult** aResult)
> {
>     *aResult = nsnull;
> 
>-    nsRefPtr<txNodeSet> nodes;
>-    nsresult rv = aContext->recycler()->getNodeSet(aContext->getContextNode(),
>-                                                   getter_AddRefs(nodes));
>+    // Evaluate first step with current context

Add some comments explaining why we do all this for the first step.

>+    txListIterator iter(&expressions);
>+    PathExprItem* pxi = NS_STATIC_CAST(PathExprItem*, iter.next());
>+    nsRefPtr<txAExprResult> res;
>+    nsresult rv = pxi->expr->evaluate(aContext, getter_AddRefs(res));
>     NS_ENSURE_SUCCESS(rv, rv);
> 
>-    txListIterator iter(&expressions);
>-    PathExprItem* pxi;
>+    NS_ENSURE_TRUE(res->getResultType() == txAExprResult::NODESET,
>+                   NS_ERROR_XSLT_NODESET_EXPECTED);
>+
>+    nsRefPtr<txNodeSet> nodes = NS_STATIC_CAST(txNodeSet*,
>+                                               NS_STATIC_CAST(txAExprResult*,
>+                                                              res));

No need for an nsRefPtr (res is a strong ref).

>+Expr*
>+PathExpr::getSubExprAt(PRUint32 aPos)
>+{
>+    PathExprItem* pxi = NS_STATIC_CAST(PathExprItem*, expressions.get(aPos));
>+
>+    return pxi ? pxi->expr : nsnull;

You'll need to use pxi->expr.get().

>+PRBool
>+PathExpr::isSensitiveTo(Expr::ContextSensitivity aContext)

>+    Expr::ContextSensitivity context =
>+        aContext & ~(Expr::NODE_CONTEXT | Expr::NODESET_CONTEXT);
>+    if (!context) {

context == NO_CONTEXT

>Index: xpath/txPredicateList.cpp
>===================================================================

>+void
>+PredicateList::dropFirst()
>+{
>+    txListIterator iter(&predicates);
>+    iter.next();
>+    iter.remove();

I'd do predicates.remove(predicates.firstItem());

>+PRBool
>+PredicateList::isSensitiveTo(Expr::ContextSensitivity aContext)
>+{
>+    // We're creating a new node/nodeset so we can ignore those bits.
>+    Expr::ContextSensitivity context =
>+        aContext & ~(Expr::NODE_CONTEXT | Expr::NODESET_CONTEXT);
>+    if (!context) {

context == Expr::NO_CONTEXT

>Index: xpath/txPredicatedNodeTest.cpp
>===================================================================

>+PRBool
>+txPredicatedNodeTest::matches(const txXPathNode& aNode, txIMatchContext* aContext)

Long line

>Index: xpath/txRootExpr.cpp
>===================================================================

>+PRBool
>+RootExpr::isSensitiveTo(Expr::ContextSensitivity aContext)

s/Expr:://

>Index: xpath/txStringFunctionCall.cpp
>===================================================================

>+Expr::ResultType
>+StringFunctionCall::getReturnType()
>+{
>+    switch (mType) {
>+        case CONCAT:
>+        case NORMALIZE_SPACE:
>+        case STRING:
>+        case SUBSTRING:
>+        case SUBSTRING_AFTER:
>+        case SUBSTRING_BEFORE:
>+        case TRANSLATE:
>+        {
>+            return STRING_RESULT;
>+        }
>+        case CONTAINS:
>+        case STARTS_WITH:
>+        {
>+            return BOOLEAN_RESULT;
>+        }
>+        case STRING_LENGTH:
>+        {
>+            return NUMBER_RESULT;
>+        }
>+    }
>+
>+    NS_NOTREACHED("how'd we get here?");
>+    return ANY_RESULT;
>+}

Static array and mType as index?

>+PRBool
>+StringFunctionCall::isSensitiveTo(Expr::ContextSensitivity aContext)

s/Expr:://

>+{
>+    switch (mType) {
>+        case CONCAT:
>+        case CONTAINS:
>+        case STARTS_WITH:
>+        case SUBSTRING:
>+        case SUBSTRING_AFTER:
>+        case SUBSTRING_BEFORE:
>+        case TRANSLATE:
>+        {
>+            return argsSensitiveTo(aContext);
>+        }
>+        case NORMALIZE_SPACE:
>+        case STRING:
>+        case STRING_LENGTH:
>+        {
>+            if (params.isEmpty()) {
>+                return !!(aContext & NODE_CONTEXT);
>+            }
>+            return argsSensitiveTo(aContext);
>+        }
>+    }
>+
>+    NS_NOTREACHED("how'd we get here?");
>+    return PR_TRUE;
>+}

I'd just assert that mType is known and always do return argsSensitiveTo(aContext); at the end.

>Index: xpath/txVariableRefExpr.cpp
>===================================================================

>+PRBool
>+VariableRefExpr::isSensitiveTo(Expr::ContextSensitivity aContext)

s/Expr:://

>Index: xpath/txXFormsFunctionCall.cpp
>===================================================================

>+Expr::ResultType
>+XFormsFunctionCall::getReturnType()
>+{
>+    // It doesn't really matter what we return here, but it might
>+    // be a good idea to try to keep this as unoptimizable as possible
>+    return ANY_RESULT;

Maybe copy what I did in my patch?

>+PRBool
>+XFormsFunctionCall::isSensitiveTo(Expr::ContextSensitivity aContext)

s/Expr:://

>Index: xpath/txXPathOptimizer.cpp
>===================================================================

>+nsresult
>+txXPathOptimizer::optimize(Expr* aInExpr, Expr** aOutExpr)

>+    nsresult rv = NS_OK;

Declare rv where you use it: this is C++, not C.

>+        i++;

++i

>+    switch (aInExpr->getType()) {
>+        case Expr::LOCATIONSTEP_ATTRIBUTE_EXPR:
>+            return optimizeAttributeStep(aInExpr, aOutExpr);
>+
>+        case Expr::LOCATIONSTEP_OTHER_EXPR:
>+            return optimizeStep(aInExpr, aOutExpr);
>+
>+        default:
>+            break;

return NS_OK;

>Index: xpath/txXPathOptimizer.h
>===================================================================

>+class txXPathOptimizer

>+private:
>+    // Helpermethods for optimizing specific classes

Helper methods

>Index: xslt/txtxKeyFunctionCall.cpp
>===================================================================

>+GenerateIdFunctionCall::isSensitiveTo(Expr::ContextSensitivity aContext)

s/Expr:://

>Index: xslt/txKeyFunctionCall.cpp
>===================================================================

>+PRBool
>+txKeyFunctionCall::isSensitiveTo(Expr::ContextSensitivity aContext)

s/Expr:://

>+{
>+    return (aContext & DOCUMENT_CONTEXT) ||
>+           argsSensitiveTo(aContext);

Put this on one line

>Index: xslt/txPatternOptimizer.cpp
>===================================================================

>+nsresult
>+txPatternOptimizer::optimize(txPattern* aInPattern, txPattern** aOutPattern)

>+        i++;

++i

>+        i++;

++i

>Index: xslt/txPatternParser.cpp
>===================================================================

> txPattern* txPatternParser::createPattern(const nsAFlatString& aPattern,
>                                           txIParseContext* aContext)

>+    if (newPattern) {
>+      pattern = newPattern;
>+    }
>+    return pattern.forget();

    return newPattern ? newPattern : pattern.forget();

>Index: xslt/txXSLTPatterns.cpp
>===================================================================

>+txPattern*
>+txLocPathPattern::getSubPatternAt(PRUint32 aPos)
>+{
>+    Step* step = NS_STATIC_CAST(Step*, mSteps.get(aPos));
>+
>+    return step ? step->pattern : nsnull;

Need to use step->pattern.get()

>Index: xslt/txXSLTPatterns.h
>===================================================================

>+    enum Type {
>+        STEP_PATTERN,
>+        UNKNOWN_PATTERN

OTHER_PATTERN?

>+#define TX_IMPL_PATTERN_STUBS_NOEXPR(_class)                  \

NO_SUB_EXPR?

>+#define TX_IMPL_PATTERN_STUBS_NOPATTERN(_class)               \

NO_SUB_PATTERN?
> >+Expr::ResultType
> >+txLiteralExpr::getReturnType()
...
> Would you be willing to switch this to a static array, using
> mValue->getResultType() as index (see my alternative patch)?

It's a little brittle, but it's nice to save a few cycles I guess. Added a comment in txExprResult.h

> Add some comments explaining why we do all this for the first step.

Added:
    // We need to evaluate the first step with the current context since it
    // can depend on the context size and position. For example:
    // key('books', concat('book', position()))

> >+    nsRefPtr<txNodeSet> nodes = NS_STATIC_CAST(txNodeSet*,
> >+                                               NS_STATIC_CAST(txAExprResult*,
> >+                                                              res));
> 
> No need for an nsRefPtr (res is a strong ref).

|nodes| will point to other things then just what |res| points to.

> >+PredicateList::dropFirst()
...
> I'd do predicates.remove(predicates.firstItem());

firstItem() is private, used .get(0) instead. txList sucks.

> >+StringFunctionCall::isSensitiveTo(Expr::ContextSensitivity aContext)
...
> I'd just assert that mType is known and always do return
> argsSensitiveTo(aContext); at the end.

Didn't add assert, we don't do that anywhere else. The compiler should ensure typesafety.

> >Index: xpath/txXFormsFunctionCall.cpp
> Maybe copy what I did in my patch?

Lets not optimize xforms for now since that'll most likly break once we get the new extension system.

> >+    nsresult rv = NS_OK;
> 
> Declare rv where you use it: this is C++, not C.

I prefer to put rv in top scope since we always end up having to move it there anyway. Of course, if I didn't have to init it to NS_OK it would compile to the same thing ;)

> >+        i++;
> 
> ++i

Does it make a difference? (same further down)

> >+    switch (aInExpr->getType()) {
> >+        case Expr::LOCATIONSTEP_ATTRIBUTE_EXPR:
> >+            return optimizeAttributeStep(aInExpr, aOutExpr);
> >+
> >+        case Expr::LOCATIONSTEP_OTHER_EXPR:
> >+            return optimizeStep(aInExpr, aOutExpr);
> >+
> >+        default:
> >+            break;
> 
> return NS_OK;

Then I'll need an extra return to silence compiler warnings about the function ending without return.
Created attachment 208023 [details] [diff] [review]
address review comments
Attachment #205687 - Attachment is obsolete: true
Attachment #208023 - Flags: review?(peterv)
Attachment #205687 - Flags: review?(peterv)
(Assignee)

Comment 48

12 years ago
Created attachment 208077 [details] [diff] [review]
evaluateToX infrastructure v1
Attachment #208077 - Flags: review?(bugmail)
(Assignee)

Comment 49

12 years ago
Comment on attachment 208023 [details] [diff] [review]
address review comments

>Index: src/base/txList.cpp
>===================================================================

>+        i++;

There's probably no noticeable difference between ++i and i++ in these cases, except that you really do mean ++i (increment i) and not i++ (store i, increment it and return the stored value), so why write the latter?

>Index: src/base/txList.h
>===================================================================

>+    /**
>+     * Replaces the Object at the given index with a new Object.
>+     * If the given index is outside the list null is returned.
>+     * @param aIndex .

Add an explanation for aIndex.

>Index: src/xpath/txExpr.h
>===================================================================

>+    typedef PRUint16 ContextSensitivity;
>+    enum {
>+        NO_CONTEXT = 0x00,
>+        DOCUMENT_CONTEXT = 0x01,
>+        NODE_ONLY_CONTEXT = 0x02,
>+        NODE_CONTEXT = DOCUMENT_CONTEXT | NODE_ONLY_CONTEXT,
>+        POSITION_CONTEXT = 0x04,
>+        SIZE_CONTEXT = 0x08,
>+        NODESET_CONTEXT = POSITION_CONTEXT | SIZE_CONTEXT,
>+        VARIABLES_CONTEXT = 0x10,
>+        PRIVATE_CONTEXT = 0x20,
>+        UNKNOWN_CONTEXT = 0xFFFF

Either rename UNKOWN_ to ALL_ or ANY_ or drop it.

>Index: src/xpath/txExprParser.cpp
>===================================================================

>@@ -188,28 +189,38 @@ txExprParser::createExprInternal(const n

>+    if (newExpr) {
>+      expr = newExpr;
>+    }
>+    *aExpr = expr.forget();

*aExpr = newExpr ? newExpr : expr.forget(); ?

>Index: src/xslt/txPatternOptimizer.h
>===================================================================

>+    // Helpermethods for optimizing specific classes

Helper methods
Attachment #208023 - Flags: review?(peterv) → review+
Created attachment 209064 [details] [diff] [review]
Fix remaining comments

forwarding petervs r+
Attachment #145090 - Attachment is obsolete: true
Attachment #205253 - Attachment is obsolete: true
Attachment #208023 - Attachment is obsolete: true
Attachment #209064 - Flags: superreview?(jst)
Attachment #209064 - Flags: review+
Comment on attachment 209064 [details] [diff] [review]
Fix remaining comments

Grr.. i forgot to save after fixing the

*aExpr = newExpr ? newExpr : expr.forget();

comment, fixed locally.
Comment on attachment 209064 [details] [diff] [review]
Fix remaining comments

sr=jst
Attachment #209064 - Flags: superreview?(jst) → superreview+
Comment on attachment 209064 [details] [diff] [review]
Fix remaining comments

Checked in. Yay!!
Attachment #209064 - Attachment is obsolete: true
Comment on attachment 208077 [details] [diff] [review]
evaluateToX infrastructure v1

You havn't made any of the classes implement the new evaluateToX directly. I bet there are a few that could get faster implementaions already, at least txLiteralExpr. You want to do that separatly?

>Index: content/xslt/src/xpath/txBooleanResult.cpp
>+BooleanResult::stringValue(nsString& aResult)
>+{
>+    aResult.AppendASCII(value ? "true" : "false");
>+}

This might be worth a faster implementation that uses NS_LITERAL_STRING.

>Index: content/xslt/src/xpath/txExpr.h
...
>+    virtual nsresult evaluateToBool(txIEvalContext* aContext,
>+                                    PRBool& aResult)
>+    {
>+        nsRefPtr<txAExprResult> exprRes;
>+        nsresult rv = evaluate(aContext, getter_AddRefs(exprRes));
>+        NS_ENSURE_SUCCESS(rv, rv);
>+
>+        aResult = exprRes->booleanValue();
>+
>+        return NS_OK;
>+    }
>+
>+    virtual nsresult evaluateToString(txIEvalContext* aContext,
>+                                      nsString& aResult)
>+    {
>+        nsRefPtr<txAExprResult> exprRes;
>+        nsresult rv = evaluate(aContext, getter_AddRefs(exprRes));
>+        NS_ENSURE_SUCCESS(rv, rv);
>+
>+        exprRes->stringValue(aResult);
>+
>+        return NS_OK;
>+    }

It would probably be a good idea to stick these implementations inside a .cpp file. IIRC the linker on linux would produce multiple copies of these implementations otherwise.

It would be nice if evaluateToString was named such that it was more obvious that it's appending rather then setting... Can't think of a good name right now.

Why no evaluateToNumber?

>Index: content/xslt/src/xpath/txExprResult.h
>@@ -66,8 +66,9 @@ public:
>         RESULT_TREE_FRAGMENT
>     };
> 
>-    txAExprResult(txResultRecycler* aRecycler) : mRecycler(aRecycler) {}
>-    virtual ~txAExprResult() {};
>+    txAExprResult(txResultRecycler* aRecycler) : mRecycler(aRecycler)
>+    {
>+    }

I think you might want to keep the virtual dtor around. Don't remember if the default one will be virtual otherwise.

>Index: content/xslt/src/xpath/txStringFunctionCall.cpp
...
>@@ -174,11 +196,14 @@ StringFunctionCall::evaluate(txIEvalCont
>                 return NS_ERROR_XPATH_BAD_ARGUMENT_COUNT;
> 
>             nsAutoString resultStr;
>-            if (iter.hasNext())
>-                evaluateToString((Expr*)iter.next(), aContext, resultStr);
>-            else
>+            if (iter.hasNext()) {
>+                Expr* arg1Expr = NS_STATIC_CAST(Expr*, iter.next());
>+                arg1Expr->evaluateToString(aContext, resultStr);

This is issing an NS_ENSURE_SUCCESS

>@@ -241,58 +272,74 @@ StringFunctionCall::evaluate(txIEvalCont
...
> 
>-            PRInt32 idx = arg1.Find(arg2);
>-            if (idx == kNotFound) {
>+            nsAString::const_iterator start, end;
>+            arg1.BeginReading(start);
>+            arg1.EndReading(end);
>+            nsAString::const_iterator substringEnd(end);
>+            if (!FindInReadable(arg2, start, substringEnd)) {

This seems like the wrong type of iterator. nsString::const_char_iterator should be much faster.

>         case SUBSTRING_BEFORE:
>         {
>             if (!requireParams(2, 2, aContext))
>                 return NS_ERROR_XPATH_BAD_ARGUMENT_COUNT;
> 
>-            nsAutoString arg1, arg2;
>-            Expr* arg1Expr = (Expr*)iter.next();
>-            evaluateToString((Expr*)iter.next(), aContext, arg2);
>+            Expr* arg1Expr = NS_STATIC_CAST(Expr*, iter.next());
>+            Expr* arg2Expr = NS_STATIC_CAST(Expr*, iter.next());
>+
>+            nsAutoString arg2;
>+            arg2Expr->evaluateToString(aContext, arg2);

Missing NS_ENSURE_SUCCESS

>             if (arg2.IsEmpty()) {
>                 aContext->recycler()->getEmptyStringResult(aResult);
> 
>                 return NS_OK;
>             }
> 
>-            evaluateToString(arg1Expr, aContext, arg1);
>+            nsAutoString arg1;
>+            arg1Expr->evaluateToString(aContext, arg1);

Here too

>         case TRANSLATE:
>         {
>             if (!requireParams(3, 3, aContext))
>                 return NS_ERROR_XPATH_BAD_ARGUMENT_COUNT;
> 
>+            Expr* arg1Expr = NS_STATIC_CAST(Expr*, iter.next());
>+
>             nsAutoString src;
>-            evaluateToString((Expr*)iter.next(), aContext, src);
>+            arg1Expr->evaluateToString(aContext, src);

And here

>@@ -304,9 +351,13 @@ StringFunctionCall::evaluate(txIEvalCont
>             NS_ENSURE_SUCCESS(rv, rv);
> 
>             strRes->mValue.SetCapacity(src.Length());
>+
>+            Expr* arg2Expr = NS_STATIC_CAST(Expr*, iter.next());
>+            Expr* arg3Expr = NS_STATIC_CAST(Expr*, iter.next());
>+
>             nsAutoString oldChars, newChars;
>-            evaluateToString((Expr*)iter.next(), aContext, oldChars);
>-            evaluateToString((Expr*)iter.next(), aContext, newChars);
>+            arg2Expr->evaluateToString(aContext, oldChars);
>+            arg3Expr->evaluateToString(aContext, newChars);

And in both these

>Index: content/xslt/src/xpath/txXFormsFunctionCall.cpp

And in all callsites in this file. Which may not matter depending on when the extension stuff goes in.

>Index: content/xslt/src/xslt/txInstructions.cpp
...
>@@ -573,7 +565,7 @@ txLREAttribute::execute(txExecutionState
>                                    getter_AddRefs(exprRes));
>     NS_ENSURE_SUCCESS(rv, rv);
> 
>-    nsAString* value = exprRes->stringValuePointer();
>+    const nsString* value = exprRes->stringValuePointer();

Why wasn't this changed to use evaluateToString? I guess there are a lot of callsites that aren't converted?

Hmm.. tricky. In some cases using evaluateToString would be slower then the current code (specifically when the expression is a txLiteralExpr or a txVariableRef).

>Index: content/xslt/src/xslt/txXPathResultComparator.cpp
...
>+txResultStringComparator::createSortableValue(Expr *aExpr,
>+                                              txIEvalContext *aContext,
>+                                              txObject *&aResult)
>+{
>+    nsAutoPtr<StringValue> val(new StringValue);
>+    if (!val) {
>+        return NS_ERROR_OUT_OF_MEMORY;
>+    }
> 
> #ifdef TX_EXE
>-    aExprRes->stringValue(val->mStr);
>+    aExpr->evaluateToString(aContext, val->mStr);

Missing NS_ENSURE_SUCCESS

...
>     nsresult rv = mCollation->AllocateRawSortKey(nsICollation::kCollationCaseInSensitive,
>                                                  nsCaseKey,
>                                                  &val->mKey, 
>                                                  &val->mLength);
>     if (NS_FAILED(rv)) {
>         NS_ERROR("Failed to create raw sort key");
>-        delete val;
>-        return 0;
>+
>+        return rv;
>     }

Make this an NS_ENSURE_SUCCESS(rv, rv)

Like we talked about on irc it would be nice to get some perf numbers to make sure that the extra virtual calls (through the default implementation of evaluateToX) doesn't add more overhead then we expect to win. I'm not that worried though so if it's a lot of hassle i'm fine with this as is.

with the above addressed, r=me
Attachment #208077 - Flags: review?(bugmail) → review+
(Assignee)

Comment 55

12 years ago
Created attachment 209850 [details]
evaluateToX performance numbers

Not much difference, mostly faster.
Cool!
(Assignee)

Comment 57

12 years ago
Created attachment 214577 [details] [diff] [review]
evaluateToX infrastructure v1.1

Carrying forward r=sicking.
Assignee: bugmail → peterv
Attachment #208077 - Attachment is obsolete: true
Attachment #214577 - Flags: superreview?(jst)
Attachment #214577 - Flags: review+
Comment on attachment 214577 [details] [diff] [review]
evaluateToX infrastructure v1.1

sr=jst
Attachment #214577 - Flags: superreview?(jst) → superreview+
(Assignee)

Comment 59

12 years ago
Other optimizations will happen in separate bugs.
Status: ASSIGNED → RESOLVED
Last Resolved: 12 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.