Status

()

VERIFIED FIXED
16 years ago
16 years ago

People

(Reporter: sicking, Assigned: sicking)

Tracking

Trunk
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 2 obsolete attachments)

Right now we have three levels of hashes for keys: a |txExpandedNameMap| of
|Map|s of |NamedMap|s. This means that for each key-lookup we do three
hash-lookups. It'd be better to merge this into a single hash keyed on the keys
from the current hashes.

I have a patch in the works (compiles but not tested) that implements this. It
will make us do a single hash-lookup for the common case.

The patch also paves the way for keys in the compiled-stylesheets-world where we
can't store the indexed values at the same place as the keys themself.
Attachment #110761 - Flags: superreview?(peterv)
Attachment #110761 - Flags: review?(axel)

Comment 2

16 years ago
I need more info on this to review this anytime soon.
I totally forgot what keys did, and I don't really have the time to find out what
they do now, either. Sorry.
Key-results are keyed on three things:

1. Key name (an expanded name)
2. Context document
3. Key value (a string)

The context document is decided by the contextnode when the key() function is
called, The key-name and key-value is given as arguments to the function.

We used to store this as a txExpandedNameMap of txXSLKeys, every txXSLKey
contained a Map (keyed on context-document) of NamedMaps. The value stored in
the NamedMap was the resulting nodeset. I've change this so that we only have a
single hash containing all key-result-nodesets.

There is no change in how or when keyvalues are calculated and there is no
change in how we decide on what hash-keys to use to store/retrieve the nodesets.
The only change is in how the values are stored/retrived. Basically the
functions ProcessorState::getKey and txXSLKey::getNodes are merged into
ProcessorState::getKeyValue

I did also fix a bug where we could add the same node more then once to a
resulting nodeset, however that is just a very small part of the patch:
+        if (entry->mNodeSet.isEmpty() ||
+            entry->mNodeSet.get(entry->mNodeSet.size()-1) != aNode) {
+            entry->mNodeSet.append(aNode);
         }
-        nodeSet->append(aNode);


There is however no hurry with this patch since it doesn't fix anything major.
The only reason i did this patch was that the current way things are dont can't
be carried over to the compiled-stylesheets world, and I wanted a separate patch
for this.
->me
Assignee: peterv → bugmail

Comment 5

16 years ago
Comment on attachment 110761 [details] [diff] [review]
patch to fix

>Index: source/xslt/ProcessorState.cpp
<...>
>@@ -821,11 +846,76 @@
> /**
>  * Adds the supplied xsl:key to the set of keys
>  * returns NULL if no such key exists
>-**/
>-txXSLKey* ProcessorState::getKey(txExpandedName& keyName)
>-{
>-    return (txXSLKey*)mXslKeys.get(keyName);
>+ */
>+nsresult ProcessorState::getKeyValue(const txExpandedName& aKeyName,
>+                                     Document* aDocument,
>+                                     const nsAString& aKeyValue,
>+                                     PRBool aIndexIfNotFound,
>+                                     NodeSet** aResult)
>+{
>+    *aResult = nsnull;
>+    txKeyValueHashKey valueKey(aKeyName, aDocument, aKeyValue);
>+    txKeyValueHashEntry* valueEntry = 
>+        NS_STATIC_CAST(txKeyValueHashEntry *,
>+                       PL_DHashTableOperate(&mKeyValues,
>+                                            &valueKey,
>+                                            PL_DHASH_LOOKUP));
>+    if (PL_DHASH_ENTRY_IS_BUSY(valueEntry)) {
>+        *aResult = &valueEntry->mNodeSet;
>+        return NS_OK;
>+    }
>+
>+    // we didn't find a value. This could either mean that that key has no
>+    // nodes with that value or that the key hasn't been indexed using this
>+    // document.
>+
>+    if (!aIndexIfNotFound) {
>+        // if aIndexIfNotFound is set then the caller knows this key is
>+        // indexed, so don't bother investigating
>+        return NS_OK;
>+    }
>+
>+    txIndexedKeyHashKey indexKey(aKeyName, aDocument);
>+    txIndexedKeyHashEntry* indexEntry = 
>+        NS_STATIC_CAST(txIndexedKeyHashEntry *,
>+                       PL_DHashTableOperate(&mIndexedKeys,
>+                                            &indexKey,
>+                                            PL_DHASH_ADD));
>+    NS_ENSURE_TRUE(indexEntry, NS_ERROR_OUT_OF_MEMORY);
>+
>+    if (indexEntry->mIndexed) {

why not 
PL_DHASH_ENTRY_IS_BUSY(indexEntry)
and get rid of mIndexed?
>+        // The key was indexed and apparently didn't contain this value so
>+        // return null
>+
>+        return NS_OK;
>+    }
>+
>+    // The key needs to be indexed
>+    txXSLKey* xslKey = (txXSLKey*)mXslKeys.get(aKeyName);
>+    if (!xslKey) {
>+        // The key didn't exist, so bail.

bail with a better error code. At least INVALID_ARG, even better 
would be a NS_ERROR_XSLT_NO_SUCH_KEY or so.

<...>
>Index: source/xslt/functions/txKeyFunctionCall.cpp
<...>
>+                if (entry->mNodeSet.isEmpty() ||
>+                    entry->mNodeSet.get(entry->mNodeSet.size()-1) != aNode) {
>+                    entry->mNodeSet.append(aNode);
>                 }
>-                nodeSet->append(aNode);
>             }
didn't get this chunk and it's brother. What's the path to add one
node twice?

peterv said he's dreaming of nsDoubleHashtable, not sure if that's all
he will have. sr interruptus ;-)
NEw patch coming for this one?
Comment on attachment 110761 [details] [diff] [review]
patch to fix

yeah, i'll make one using nsDoubleHashtable.h
Attachment #110761 - Attachment is obsolete: true
Attachment #110761 - Flags: superreview?(peterv)
Attachment #110761 - Flags: review?(axel)
Created attachment 114186 [details] [diff] [review]
version 2

> why not 
> PL_DHASH_ENTRY_IS_BUSY(indexEntry)
> and get rid of mIndexed?

Since i'm using the Add-operation the entry will always be busy if the
operation succeeded.

> bail with a better error code. At least INVALID_ARG, even better 
> would be a NS_ERROR_XSLT_NO_SUCH_KEY or so.

Done

> didn't get this chunk and it's brother. What's the path to add one
> node twice?

When a single node has the same value more then once. There are two ways this
can happen:
1. When the use-expression returns a nodeset, and two of the nodes have the
same
   XPath-value For example
   <xsl:key match="foo" use="@bar | @baz"/>
   and
   <foo bar="a" baz="a"/> in the source document

2. When there is more then one <xsl:key> elements with match-patterns matching
   the same node and having the same use-value. For example
   <xsl:key match="*[@bar]" use="@bar"/>
   <xsl:key match="*[@baz]" use="@baz"/>
   and
   <foo bar="a" baz="a"/> in the source document

(case 1. and 2. can of course be combined)

> peterv said he's dreaming of nsDoubleHashtable, not sure if that's all
> he will have. sr interruptus ;-)

Done


I also moved the remaining key code from the ProcessorState into a class
implemented in txKeyFunctionCall. Partly because it's nice to have all key-code
on one place, partly to reduce the size of the up-comming compilation-patch
Attachment #114186 - Flags: superreview?(peterv)
Attachment #114186 - Flags: review?(axel)

Comment 9

16 years ago
Comment on attachment 114186 [details] [diff] [review]
version 2

Most comments are to adjust comments, some are #include foo.
The only real change is having the expanded name as member 
of txXSLKey. You change pretty all lines that involve that,
so no biggie to fix that issue as well. IMHO.

With the following, r=axel@pike.org

(Looks alot, really, just because I kept quite a bunch of 
context)

Index: source/xslt/ProcessorState.cpp
<...> see below
Index: source/xslt/ProcessorState.h
<...>
@@ -311,7 +312,9 @@
      * Returns the key with the supplied name
      * returns NULL if no such key exists
      */
-    txXSLKey* getKey(txExpandedName& keyName);
+    nsresult getKeyNodes(const txExpandedName& aKeyName, Document* aDocument,
+			  const nsAString& aKeyValue, PRBool aIndexIfNotFound,
+			  NodeSet** aResult);

inline this function?
Adjust the comment.

     /**
      * Adds a decimal format. Returns false if the format already exists
<...>
Index: source/xslt/txXSLTPatterns.cpp
===================================================================
RCS file:
/cvsroot/mozilla/extensions/transformiix/source/xslt/txXSLTPatterns.cpp,v
retrieving revision 1.9
diff -u -r1.9 txXSLTPatterns.cpp
--- source/xslt/txXSLTPatterns.cpp	19 Jan 2003 23:17:09 -0000	1.9
+++ source/xslt/txXSLTPatterns.cpp	12 Feb 2003 06:34:49 -0000
@@ -435,8 +435,8 @@
	 contextDoc = (Document*)aNode;
     else
	 contextDoc = aNode->getOwnerDocument();
-    txXSLKey* key = mProcessorState->getKey(mName);
-    const NodeSet* nodes = key->getNodes(mValue, contextDoc);
+    NodeSet* nodes = 0;
+    mProcessorState->getKeyNodes(mName, contextDoc, mValue, PR_TRUE, &nodes);
     if (!nodes || nodes->isEmpty())
	 return MB_FALSE;
     MBool isTrue = nodes->contains(aNode);

check the return value. (yes, we ain't using it yet, but let's
indicate that we should)
Not sure if it's worth to cache results here. Probably not anymore.
Darn multiple documents.

Index: source/xslt/functions/XSLTFunctions.h
<...>
@@ -34,10 +34,12 @@
 #include "Expr.h"
 #include "Map.h"
 #include "NodeSet.h"
+#include "XMLUtils.h"

 class NamedMap;
 class ProcessorState;
 class txPattern;
+class txKeyValueHashKey;

you just delete stuff below and add stuff here?
Can't see the reason.

 /**
  * The definition for the XSLT document() function
<...>
Index: source/xslt/functions/txKey.h
<...>
+
+#ifndef TRANSFRMX_TXKEY_H
+#define TRANSFRMX_TXKEY_H

TX, not TRANSFRMX.

+
+#include "pldhash.h"

nsDoubleHashTable is what you use. Include that.

+#include "dom.h"
not needed, included by XMLUtils
+#include "XMLUtils.h"
+
+class txKeyValueHashKey
+{
+public:
+    txKeyValueHashKey(const txExpandedName& aKeyName,
+		       Document* aDocument,
+		       const nsAString& aKeyValue)
+	 : mKeyName(aKeyName),
+	   mDocument(aDocument),
+	   mKeyValue(aKeyValue)
+    {
+    }
+
+    txExpandedName mKeyName;
+    Document* mDocument;
+    nsString mKeyValue;
+};
swap nsString and Document*, nsString is larger.

+
+struct txKeyValueHashEntry : public PLDHashEntryHdr
+{
+    txKeyValueHashEntry(const void* aKey)
+	 : mKey(*(const txKeyValueHashKey*)aKey)

NS_STATIC_CAST?

+    {
+    }
+
+    // @see nsDoubleHashtable.h
+    const void* GetKey();
+    PRBool MatchEntry(const void* aKey) const;
+    static PLDHashNumber HashKey(const void* aKey);
+    
+    txKeyValueHashKey mKey;
+    NodeSet mNodeSet;
+};

I don't think that we rely on the pointer to the nodeset being constant.
We hand out pointers to the nodeset, but do we use them past their bedtime.
pldhash shuffles entries around, so please add a comment to getKeyNodes()
to use the returned nodeset only up to the next call.

+
+DECL_DHASH_WRAPPER(txKeyValueHash, txKeyValueHashEntry, txKeyValueHashKey&);
+
+class txIndexedKeyHashKey
+{
+public:
+    txIndexedKeyHashKey(txExpandedName aKeyName,
+			 Document* aDocument)
+	 : mKeyName(aKeyName),
+	   mDocument(aDocument)
+    {
+    }
+
+    txExpandedName mKeyName;
+    Document* mDocument;
+};
+
+struct txIndexedKeyHashEntry : public PLDHashEntryHdr
+{
+    txIndexedKeyHashEntry(const void* aKey)
+	 : mKey(*(const txIndexedKeyHashKey*)aKey),

NS_STATIC_CAST?

+	   mIndexed(PR_FALSE)
+    {
+    }
+
+    // @see nsDoubleHashtable.h
+    const void* GetKey();
+    PRBool MatchEntry(const void* aKey) const;
+    static PLDHashNumber HashKey(const void* aKey);
+
+    txIndexedKeyHashKey mKey;
+    PRBool mIndexed;
+};
+
+DECL_DHASH_WRAPPER(txIndexedKeyHash, txIndexedKeyHashEntry,
+		    txIndexedKeyHashKey&);
+
+/*
+ * Class representing an <xsl:key>. Or in the case where several <xsl:key>s
+ * have the same name one object represents all <xsl:key>s with that name
+ */

Better comment. Don't have two cases what this class would store.
"Class holding all <xsl:key>s of a particular expanded name in the
stylesheet and all its includes and imports."
or so.

+class txXSLKey : public TxObject {
+    
+public:
+    ~txXSLKey();
+    
+    /**
+     * Adds a match/use pair. Returns MB_FALSE if matchString or useString
+     * can't be parsed.
+     * @param aMatch  match-pattern
+     * @param aUse    use-expression
+     * @return MB_FALSE if an error occured, MB_TRUE otherwise
+     */
+    MBool addKey(txPattern* aMatch, Expr* aUse);

errh. String or expressions? Comment again.

+
+    /**
+     * Indexes a document and adds it to the hash of key values
+     * @param aDocument     Document to index and add
+     * @param aKeyName      Name of this key
+     * @param aKeyValueHash Hash to add values to
+     * @param aPs	     ProcessorState to use for XPath evaluation
+     */
+    nsresult indexDocument(Document* aDocument,
+			    const txExpandedName& aKeyName,
+			    txKeyValueHash& aKeyValueHash, ProcessorState*
aPs);

The keyName should be a member of txXSLKey, and not an argument.
It's not like this txXSLKey should index or hash any other keyname than
the right one.

+
+private:
+    /**
+     * Recursively searches a node, its attributes and its subtree for
+     * nodes matching any of the keys match-patterns.
+     * @param aNode	     Node to search
+     * @param aKey	     Key to use when adding into the hash
+     * @param aKeyValueHash Hash to add values to
+     * @param aPs	     ProcessorState to use for XPath evaluation
+     */
+    nsresult indexTree(Node* aNode, txKeyValueHashKey& aKey,
+			txKeyValueHash& aKeyValueHash, ProcessorState* aPs);
+
+    /**
+     * Tests one node if it matches any of the keys match-patterns. If
+     * the node matches its values are added to the index.
+     * @param aNode	     Node to test
+     * @param aKey	     Key to use when adding into the hash
+     * @param aKeyValueHash Hash to add values to
+     * @param aPs	     ProcessorState to use for XPath evaluation
+     */
+    nsresult testNode(Node* aNode, txKeyValueHashKey& aKey,
+		       txKeyValueHash& aKeyValueHash, ProcessorState* aPs);
+
+    /**
+     * represents one match/use pair
+     */
+    struct Key {
+	 txPattern* matchPattern;
+	 Expr* useExpr;
+    };
+
+    /**
+     * List of all match/use pairs. The items as |Key|s
+     */
+    List mKeys;
+};
+
+
+class txKeyHash
+{
+public:
+    txKeyHash(txExpandedNameMap& aKeys)
+	 : mKeys(aKeys)
+    {
+    }
+    
+    nsresult init();
+
+    nsresult getKeyNodes(const txExpandedName& aKeyName,
+			  Document* aDocument,
+			  const nsAString& aKeyValue,
+			  PRBool aIndexIfNotFound,
+			  ProcessorState* aPs,
+			  NodeSet** aResult);
+
+private:
+    // Hash of all indexed key-values
+    txKeyValueHash mKeyValues;
+
+    // Hash showing which keys+documents has been indexed
+    txIndexedKeyHash mIndexedKeys;
+    
+    // Map of txXSLKeys
+    txExpandedNameMap& mKeys;
+};
+
+
+#endif //TRANSFRMX_TXKEY_H
<...>
Attachment #114186 - Flags: review?(axel) → review+
Comment on attachment 114186 [details] [diff] [review]
version 2

>Index: source/xslt/txXSLTPatterns.cpp
>===================================================================

>@@ -435,8 +435,8 @@
>         contextDoc = (Document*)aNode;
>     else
>         contextDoc = aNode->getOwnerDocument();
>-    txXSLKey* key = mProcessorState->getKey(mName);
>-    const NodeSet* nodes = key->getNodes(mValue, contextDoc);
>+    NodeSet* nodes = 0;

I'd be happier if this were const.

>+    mProcessorState->getKeyNodes(mName, contextDoc, mValue, PR_TRUE, &nodes);

>Index: source/xslt/functions/txKey.h
>===================================================================

>+#ifndef TRANSFRMX_TXKEY_H
>+#define TRANSFRMX_TXKEY_H

I know this is how Transformiix include guards are now, but I would eventually
move to txFoo_h__. If you do want to keep this, at least use TX_.

>+struct txKeyValueHashEntry : public PLDHashEntryHdr
>+{
>+    txKeyValueHashEntry(const void* aKey)
>+        : mKey(*(const txKeyValueHashKey*)aKey)

NS_STATIC_CAST(const txKeyValueHashKey*, aKey)

>+struct txIndexedKeyHashEntry : public PLDHashEntryHdr
>+{
>+    txIndexedKeyHashEntry(const void* aKey)
>+        : mKey(*(const txIndexedKeyHashKey*)aKey),

NS_STATIC_CAST(const txIndexedKeyHashKey*, aKey)

>+DECL_DHASH_WRAPPER(txIndexedKeyHash, txIndexedKeyHashEntry,
>+                   txIndexedKeyHashKey&);
>+
>+/*

Use /**

>+ * Class representing an <xsl:key>. Or in the case where several <xsl:key>s
>+ * have the same name one object represents all <xsl:key>s with that name

Period at the end of the sentence.

>+ */
>+class txXSLKey : public TxObject {

>+    MBool addKey(txPattern* aMatch, Expr* aUse);

Please use PRBool/PR_TRUE/PR_FALSE now.

>+    nsresult indexDocument(Document* aDocument,
>+                           const txExpandedName& aKeyName,
>+                           txKeyValueHash& aKeyValueHash, ProcessorState* aPs);

Long line. Want to run it through the jst sr simulacrum and fix anything it
whines about?

>Index: source/xslt/functions/txKeyFunctionCall.cpp
>===================================================================

>+nsresult
>+txKeyHash::getKeyNodes(const txExpandedName& aKeyName,

>+    // we didn't find a value. This could either mean that that key has no

Capitalize the first w

>+    // nodes with that value or that the key hasn't been indexed using this
>+    // document.
>+
>+    if (!aIndexIfNotFound) {
>+        // if aIndexIfNotFound is set then the caller knows this key is
>+        // indexed, so don't bother investigating

Capitalize first letter, period at the end. There's other places like this.

>+    // Now that the key is indexed we can get it's value

its

>+nsresult txXSLKey::testNode(Node* aNode, txKeyValueHashKey& aKey,
>+                            txKeyValueHash& aKeyValueHash, ProcessorState* aPs)
> {

>+                    aKey.mKeyValue.Assign(val);
>+                    txKeyValueHashEntry* entry = aKeyValueHash.AddEntry(aKey);
>+                    NS_ENSURE_TRUE(entry, NS_ERROR_OUT_OF_MEMORY);
>+
>+                    if (entry->mNodeSet.isEmpty() ||
>+                        entry->mNodeSet.get(entry->mNodeSet.size()-1) !=

Spaces around -

>+                                            aNode) {

Why the check for aNode?

>+
>+                aKey.mKeyValue.Assign(val);
>+                txKeyValueHashEntry* entry = aKeyValueHash.AddEntry(aKey);
>+                NS_ENSURE_TRUE(entry, NS_ERROR_OUT_OF_MEMORY);
>+
>+                if (entry->mNodeSet.isEmpty() ||
>+                    entry->mNodeSet.get(entry->mNodeSet.size()-1) != aNode) {

Spaces around -
Attachment #114186 - Flags: superreview?(peterv) → superreview+
Created attachment 115823 [details] [diff] [review]
fix comments from Axel and Peter
Attachment #114186 - Attachment is obsolete: true
> Why the check for aNode?

see comment 8

Status: NEW → ASSIGNED
checked in
Status: ASSIGNED → RESOLVED
Last Resolved: 16 years ago
Resolution: --- → FIXED

Comment 14

16 years ago
mass verifying
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.