Closed Bug 95779 Opened 23 years ago Closed 23 years ago

Don't cache XPath Expr and Pattern on string values

Categories

(Core :: XSLT, defect, P1)

defect

Tracking

()

VERIFIED FIXED
mozilla0.9.7

People

(Reporter: axel, Assigned: sicking)

References

Details

(Keywords: arch)

Attachments

(3 files)

Expr's and Pattern's should be cached per node, for two reasons:
String compares are expensive compared to pointers, and the same string
may evaluate to two different xpath expressions due to different namespace
mappings.

Here is a list of what to cache for XSLT:
[Expr]
apply-templates/@select
copy-of/@select
for-each/@select
if/@test
key/@use
number/@value
param/@select
sort/@select
value-of/@select
variable/@select
when/@test
with-param/@select
[Pattern]
key/@match
number/(@count|@from)
template/@match

key is already dealt with, it just needs to get rid of using the ProcessorState
hash.
template and param are top-level only, they don't show up during processing,
so they can be handled separately (like key is). Import Prec!

I propose using a hash for each localName, and cache the tupel of expr/patterns
per node. (That makes a struct of a Expr and two Patterns for one number node)
Separating the hashes is a good thing, cause then you don't search nodes that
can't match. And once you need the hash, you obviously know which hash to
look at.
Keywords: arch
Priority: -- → P2
note from bug 83651:
import precedence rulez
xsl:template
xsl:variable, xsl:param (just top level vars)

Axel
(Templates now store their @match expressions separatly)

Longterm i'd like us to store the expressions in the actual attributes. For
module that will be possible once attributes can store nsISupports, and for
standalone we can do whatever we want.

But for now:

I don't really think that one hash per elementtype is that neat since we'll
probobly end up with some bigassed switchblock that has a few lines for each
elementtype. And we only really accomplish the same type of speedup we would if
we used one hash and increased the number of buckets. The right fix for slow
hashes is IMHO to fix the hash, not use more of them.

I suggest that we have a separate hash for each attribute name and then define
some enum for each attribute name so that you can do:

expr = ps.getExpr(myIfElem, ProcessorState::SelectAttr);
and
pattern = ps.getPattern(myNumberElem, ProcessorState::FromAttr);

It should then be really simple to implement by just keeping a static array with
one hash for each attribute.
I should say that with "static" i mean that we do |Map exprHashes[4]| isntead 
of |txList exprHashes|
No longer blocks: 96410
This should basically work. There is a slight risk that it'll regress namespace 
resolution somewhat, but that should be fixed once we resolve namespaces during 
parsing (see comments in ProcessorState::get(Expr|Pattern)).

I also changed the loaderdocument for ProcessorState::retrieveDocument becuase 
i was a bit lazy. That can be fixed by adding another argument if we want, but 
I think we should have some discussion about what should be the loaderdocument 
and what possible securityrisks there are.

I also decided to stick with parsing stuff in the ProcessorState. I do agree 
that it's not how it ideally should be done, but I don't think it's worth the 
trouble to have it in XSLTProcessor. We have lot of things that should be moved 
from ProcessorState to XSLTProcessor, but IMHO we should leave that for after 
Transformiix1.0 since the work involved is much bigger then the gain.
+Expr* ExprParser::createPattern(const String& pattern)
should be
+Pattern* ExprParser::createPattern(const String& pattern)
(like in ExprParser.h)

String::isEmpty() has landed.

line lengths in Numbering.cpp

+    mExprHashes[0].setOwnership(Map::eOwnsItems);
+    mExprHashes[1].setOwnership(Map::eOwnsItems);
+    mExprHashes[2].setOwnership(Map::eOwnsItems);
+    mPatternHashes[0].setOwnership(Map::eOwnsItems);
+    mPatternHashes[1].setOwnership(Map::eOwnsItems);
use the enums.

Do we need getBaseURI as part of parse or eval context in bug 96410?

(Didn't apply and test, I'll guess this needs a new patch after all that clean
up that landed today)
I have the patch, so i guess it's only fair that I take the bug too
Assignee: peterv → sicking
Comment on attachment 53803 [details] [diff] [review]
address comments and sync with tip

>? transformiix.ilk
>? transformiix.pdb
>Index: source/xpath/ExprParser.cpp
>===================================================================
>RCS file: /cvsroot/mozilla/extensions/transformiix/source/xpath/ExprParser.cpp,v
>retrieving revision 1.19
>diff -u -r1.19 ExprParser.cpp
>--- ExprParser.cpp	2001/10/16 09:25:07	1.19
>+++ ExprParser.cpp	2001/10/16 18:21:22
>@@ -154,15 +154,16 @@
> 
> } //-- createAttributeValueTemplate
> 
>-Expr* ExprParser::createExpr(const String& pattern) {
>+Expr* ExprParser::createExpr(const String& pattern)
>+{
>     ExprLexer lexer(pattern);
>     return createExpr(lexer);
> } //-- createExpr
> 
>-Expr* ExprParser::createPatternExpr(const String& pattern) {
>+Pattern* ExprParser::createPattern(const String& pattern)
>+{
>     ExprLexer lexer(pattern);
>-    Expr* expr = createUnionExpr(lexer);
>-    return expr;
>+    return createUnionExpr(lexer);
> } //-- createPatternExpr
> 
>   //--------------------/
>Index: source/xpath/ExprParser.h
>===================================================================
>RCS file: /cvsroot/mozilla/extensions/transformiix/source/xpath/ExprParser.h,v
>retrieving revision 1.7
>diff -u -r1.7 ExprParser.h
>--- ExprParser.h	2001/07/03 00:36:57	1.7
>+++ ExprParser.h	2001/10/16 18:21:22
>@@ -51,8 +51,8 @@
>     **/
>     ~ExprParser();
> 
>-    Expr*          createExpr        (const String& pattern);
>-    Expr*          createPatternExpr (const String& pattern);
>+    Expr* createExpr(const String& pattern);
>+    Pattern* createPattern(const String& pattern);
> 
>     /**
>      * Creates an Attribute Value Template using the given value
>Index: source/xslt/Numbering.cpp
>===================================================================
>RCS file: /cvsroot/mozilla/extensions/transformiix/source/xslt/Numbering.cpp,v
>retrieving revision 1.5
>diff -u -r1.5 Numbering.cpp
>--- Numbering.cpp	2001/10/16 09:25:33	1.5
>+++ Numbering.cpp	2001/10/16 18:21:25
>@@ -42,7 +42,9 @@
>     String valueAttr = xslNumber->getAttribute(VALUE_ATTR);
>     //-- check for expr
>     if (!valueAttr.isEmpty()) {
>-        Expr* expr = ps->getExpr(valueAttr);
>+        Expr* expr = ps->getExpr(xslNumber, ProcessorState::ValueAttr);
>+        if (!expr)
>+            return;
>         nbrOfCounts = 1;
>         counts = new int[1];
>         ExprResult* result = expr->evaluate(context, ps);
>@@ -56,11 +58,15 @@
> 
>         String countAttr = xslNumber->getAttribute(COUNT_ATTR);
> 
>-        PatternExpr* countExpr = 0;
>+        Pattern* countPattern;
>+        MBool ownsPattern;
>+        
>         if (!countAttr.isEmpty()) {
>-            countExpr = ps->getPatternExpr(countAttr);
>+            countPattern = ps->getPattern(xslNumber, ProcessorState::CountAttr);
>+            ownsPattern = MB_FALSE;
>         }
>         else {
>+            // Actually, this code should probobly use NodeTests instead
>             switch(context->getNodeType()) {
>                 case Node::ATTRIBUTE_NODE:
>                     countAttr.append('@');
>@@ -83,8 +89,15 @@
>                     countAttr.append("node()[false()]"); //-- for now
>                     break;
>             }
>-            countExpr = ps->getPatternExpr(countAttr);
>+            ExprParser parser;
>+            countPattern = parser.createPattern(countAttr);
>+            ownsPattern = MB_TRUE;
Should we have a createPattern in ProcessorState for this?
I don't like more hashes, but still...

>+ProcessorState::ProcessorState()
>+{
>+    mSourceDocument = 0;
>+    xslDocument = 0;
>+    resultDocument = 0;
mXSLTDocument and mResultDocument?

>+        xmlDoc = xmlParser.getDocumentFromURI(docUrl, xslDocument, errMsg);
Right, I need to mail mstoltz about this.

The rest of it looks pretty sweet. We should get this in by 0.9.6. I'll re-read it tomorrow. I need sleep.
Oh, and I wonder whether we should have Element::hasAttribute. There's some
spots in this patch that could use it, and there's plenty more elsewhere I'm sure.
> >-            countExpr = ps->getPatternExpr(countAttr);
> >+            ExprParser parser;
> >+            countPattern = parser.createPattern(countAttr);
> >+            ownsPattern = MB_TRUE;
> Should we have a createPattern in ProcessorState for this?
> I don't like more hashes, but still...

What would we key it on though, node and string? Actually i don't really care 
about the xsl:number code. It's nonconforming and next to unusably slow.

> mXSLTDocument and mResultDocument?

Please don't make me do that. We have a gazillion variables that needs 
prefixing in ProcessorState.
Status: NEW → ASSIGNED
Priority: P2 → P1
Target Milestone: --- → mozilla0.9.6
Comment on attachment 53803 [details] [diff] [review]
address comments and sync with tip

>-Expr* ExprParser::createExpr(const String& pattern) {
>+Expr* ExprParser::createExpr(const String& pattern)
>-Expr* ExprParser::createPatternExpr(const String& pattern) {
>+Pattern* ExprParser::createPattern(const String& pattern)

pattern -> aPattern

>Index: source/xslt/Numbering.cpp

the first few lines of doNumbering look like a mess, could you whack them too?

>+            countPattern = ps->getPattern(xslNumber, ProcessorState::CountAttr);
give a newline here ;-)
and in Numbering::getAncestorsOrSelf, too, please

>Index: source/xslt/ProcessorState.cpp

> void ProcessorState::getNameSpaceURIFromPrefix(const String& prefix, String& nameSpaceURI) {
>+    if (mXPathParseContext)
>+        XMLDOMUtils::getNameSpace(prefix, mXPathParseContext, nameSpaceURI);
> } //-- getNameSpaceURI

Use lookupNamespace... directly

>Index: source/xslt/ProcessorState.h
>     List           mImportFrames;
txList?


>Index: source/xslt/XSLTProcessor.cpp
>+    mNodeExpr = parser.createExpr(node);
plug that leak
addressed all comments except:

> and in Numbering::getAncestorsOrSelf, too, please
IMHO it's not worth the trouble to whitespace cleanup code that doesn't work 
and will be rewritten anyway

> void ProcessorState::getNameSpaceURIFromPrefix(const String& prefix, ...
> Use lookupNamespace... directly
Pike and I agreed on irc not to do this since this function is going away 
anyway, and we don't have any good way of going from ID->URI

I also implemented Element::hasAttr and used it where appropriate. Had to add 
some uglyness to get hold of the select-atoms, but we can remove that once we 
have the static atomtables
Comment on attachment 55678 [details] [diff] [review]
ver 3

synch this with the landing of bug 105808
r=axel@pike.org
Attachment #55678 - Flags: review+
seems like we'll have to push this, too bad :(
Target Milestone: mozilla0.9.6 → mozilla0.9.7
Comment on attachment 55678 [details] [diff] [review]
ver 3

sr=jst
Attachment #55678 - Flags: superreview+
checked in...


woooohoooo!!
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
we didn't verify for a long time.
I really checked, so VERIFIED.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: