Closed Bug 187610 Opened 22 years ago Closed 22 years ago

use PLDHash for keys

Categories

(Core :: XSLT, defect)

defect
Not set
normal

Tracking

()

VERIFIED FIXED

People

(Reporter: sicking, Assigned: sicking)

References

Details

Attachments

(1 file, 2 obsolete files)

Right now we have three levels of hashes for keys: a |txExpandedNameMap| of |Map|s of |NamedMap|s. This means that for each key-lookup we do three hash-lookups. It'd be better to merge this into a single hash keyed on the keys from the current hashes. I have a patch in the works (compiles but not tested) that implements this. It will make us do a single hash-lookup for the common case. The patch also paves the way for keys in the compiled-stylesheets-world where we can't store the indexed values at the same place as the keys themself.
Attachment #110761 - Flags: superreview?(peterv)
Attachment #110761 - Flags: review?(axel)
I need more info on this to review this anytime soon. I totally forgot what keys did, and I don't really have the time to find out what they do now, either. Sorry.
Key-results are keyed on three things: 1. Key name (an expanded name) 2. Context document 3. Key value (a string) The context document is decided by the contextnode when the key() function is called, The key-name and key-value is given as arguments to the function. We used to store this as a txExpandedNameMap of txXSLKeys, every txXSLKey contained a Map (keyed on context-document) of NamedMaps. The value stored in the NamedMap was the resulting nodeset. I've change this so that we only have a single hash containing all key-result-nodesets. There is no change in how or when keyvalues are calculated and there is no change in how we decide on what hash-keys to use to store/retrieve the nodesets. The only change is in how the values are stored/retrived. Basically the functions ProcessorState::getKey and txXSLKey::getNodes are merged into ProcessorState::getKeyValue I did also fix a bug where we could add the same node more then once to a resulting nodeset, however that is just a very small part of the patch: + if (entry->mNodeSet.isEmpty() || + entry->mNodeSet.get(entry->mNodeSet.size()-1) != aNode) { + entry->mNodeSet.append(aNode); } - nodeSet->append(aNode); There is however no hurry with this patch since it doesn't fix anything major. The only reason i did this patch was that the current way things are dont can't be carried over to the compiled-stylesheets world, and I wanted a separate patch for this.
->me
Assignee: peterv → bugmail
Comment on attachment 110761 [details] [diff] [review] patch to fix >Index: source/xslt/ProcessorState.cpp <...> >@@ -821,11 +846,76 @@ > /** > * Adds the supplied xsl:key to the set of keys > * returns NULL if no such key exists >-**/ >-txXSLKey* ProcessorState::getKey(txExpandedName& keyName) >-{ >- return (txXSLKey*)mXslKeys.get(keyName); >+ */ >+nsresult ProcessorState::getKeyValue(const txExpandedName& aKeyName, >+ Document* aDocument, >+ const nsAString& aKeyValue, >+ PRBool aIndexIfNotFound, >+ NodeSet** aResult) >+{ >+ *aResult = nsnull; >+ txKeyValueHashKey valueKey(aKeyName, aDocument, aKeyValue); >+ txKeyValueHashEntry* valueEntry = >+ NS_STATIC_CAST(txKeyValueHashEntry *, >+ PL_DHashTableOperate(&mKeyValues, >+ &valueKey, >+ PL_DHASH_LOOKUP)); >+ if (PL_DHASH_ENTRY_IS_BUSY(valueEntry)) { >+ *aResult = &valueEntry->mNodeSet; >+ return NS_OK; >+ } >+ >+ // we didn't find a value. This could either mean that that key has no >+ // nodes with that value or that the key hasn't been indexed using this >+ // document. >+ >+ if (!aIndexIfNotFound) { >+ // if aIndexIfNotFound is set then the caller knows this key is >+ // indexed, so don't bother investigating >+ return NS_OK; >+ } >+ >+ txIndexedKeyHashKey indexKey(aKeyName, aDocument); >+ txIndexedKeyHashEntry* indexEntry = >+ NS_STATIC_CAST(txIndexedKeyHashEntry *, >+ PL_DHashTableOperate(&mIndexedKeys, >+ &indexKey, >+ PL_DHASH_ADD)); >+ NS_ENSURE_TRUE(indexEntry, NS_ERROR_OUT_OF_MEMORY); >+ >+ if (indexEntry->mIndexed) { why not PL_DHASH_ENTRY_IS_BUSY(indexEntry) and get rid of mIndexed? >+ // The key was indexed and apparently didn't contain this value so >+ // return null >+ >+ return NS_OK; >+ } >+ >+ // The key needs to be indexed >+ txXSLKey* xslKey = (txXSLKey*)mXslKeys.get(aKeyName); >+ if (!xslKey) { >+ // The key didn't exist, so bail. bail with a better error code. At least INVALID_ARG, even better would be a NS_ERROR_XSLT_NO_SUCH_KEY or so. <...> >Index: source/xslt/functions/txKeyFunctionCall.cpp <...> >+ if (entry->mNodeSet.isEmpty() || >+ entry->mNodeSet.get(entry->mNodeSet.size()-1) != aNode) { >+ entry->mNodeSet.append(aNode); > } >- nodeSet->append(aNode); > } didn't get this chunk and it's brother. What's the path to add one node twice? peterv said he's dreaming of nsDoubleHashtable, not sure if that's all he will have. sr interruptus ;-)
NEw patch coming for this one?
Comment on attachment 110761 [details] [diff] [review] patch to fix yeah, i'll make one using nsDoubleHashtable.h
Attachment #110761 - Attachment is obsolete: true
Attachment #110761 - Flags: superreview?(peterv)
Attachment #110761 - Flags: review?(axel)
Attached patch version 2 (obsolete) — Splinter Review
> why not > PL_DHASH_ENTRY_IS_BUSY(indexEntry) > and get rid of mIndexed? Since i'm using the Add-operation the entry will always be busy if the operation succeeded. > bail with a better error code. At least INVALID_ARG, even better > would be a NS_ERROR_XSLT_NO_SUCH_KEY or so. Done > didn't get this chunk and it's brother. What's the path to add one > node twice? When a single node has the same value more then once. There are two ways this can happen: 1. When the use-expression returns a nodeset, and two of the nodes have the same XPath-value For example <xsl:key match="foo" use="@bar | @baz"/> and <foo bar="a" baz="a"/> in the source document 2. When there is more then one <xsl:key> elements with match-patterns matching the same node and having the same use-value. For example <xsl:key match="*[@bar]" use="@bar"/> <xsl:key match="*[@baz]" use="@baz"/> and <foo bar="a" baz="a"/> in the source document (case 1. and 2. can of course be combined) > peterv said he's dreaming of nsDoubleHashtable, not sure if that's all > he will have. sr interruptus ;-) Done I also moved the remaining key code from the ProcessorState into a class implemented in txKeyFunctionCall. Partly because it's nice to have all key-code on one place, partly to reduce the size of the up-comming compilation-patch
Attachment #114186 - Flags: superreview?(peterv)
Attachment #114186 - Flags: review?(axel)
Comment on attachment 114186 [details] [diff] [review] version 2 Most comments are to adjust comments, some are #include foo. The only real change is having the expanded name as member of txXSLKey. You change pretty all lines that involve that, so no biggie to fix that issue as well. IMHO. With the following, r=axel@pike.org (Looks alot, really, just because I kept quite a bunch of context) Index: source/xslt/ProcessorState.cpp <...> see below Index: source/xslt/ProcessorState.h <...> @@ -311,7 +312,9 @@ * Returns the key with the supplied name * returns NULL if no such key exists */ - txXSLKey* getKey(txExpandedName& keyName); + nsresult getKeyNodes(const txExpandedName& aKeyName, Document* aDocument, + const nsAString& aKeyValue, PRBool aIndexIfNotFound, + NodeSet** aResult); inline this function? Adjust the comment. /** * Adds a decimal format. Returns false if the format already exists <...> Index: source/xslt/txXSLTPatterns.cpp =================================================================== RCS file: /cvsroot/mozilla/extensions/transformiix/source/xslt/txXSLTPatterns.cpp,v retrieving revision 1.9 diff -u -r1.9 txXSLTPatterns.cpp --- source/xslt/txXSLTPatterns.cpp 19 Jan 2003 23:17:09 -0000 1.9 +++ source/xslt/txXSLTPatterns.cpp 12 Feb 2003 06:34:49 -0000 @@ -435,8 +435,8 @@ contextDoc = (Document*)aNode; else contextDoc = aNode->getOwnerDocument(); - txXSLKey* key = mProcessorState->getKey(mName); - const NodeSet* nodes = key->getNodes(mValue, contextDoc); + NodeSet* nodes = 0; + mProcessorState->getKeyNodes(mName, contextDoc, mValue, PR_TRUE, &nodes); if (!nodes || nodes->isEmpty()) return MB_FALSE; MBool isTrue = nodes->contains(aNode); check the return value. (yes, we ain't using it yet, but let's indicate that we should) Not sure if it's worth to cache results here. Probably not anymore. Darn multiple documents. Index: source/xslt/functions/XSLTFunctions.h <...> @@ -34,10 +34,12 @@ #include "Expr.h" #include "Map.h" #include "NodeSet.h" +#include "XMLUtils.h" class NamedMap; class ProcessorState; class txPattern; +class txKeyValueHashKey; you just delete stuff below and add stuff here? Can't see the reason. /** * The definition for the XSLT document() function <...> Index: source/xslt/functions/txKey.h <...> + +#ifndef TRANSFRMX_TXKEY_H +#define TRANSFRMX_TXKEY_H TX, not TRANSFRMX. + +#include "pldhash.h" nsDoubleHashTable is what you use. Include that. +#include "dom.h" not needed, included by XMLUtils +#include "XMLUtils.h" + +class txKeyValueHashKey +{ +public: + txKeyValueHashKey(const txExpandedName& aKeyName, + Document* aDocument, + const nsAString& aKeyValue) + : mKeyName(aKeyName), + mDocument(aDocument), + mKeyValue(aKeyValue) + { + } + + txExpandedName mKeyName; + Document* mDocument; + nsString mKeyValue; +}; swap nsString and Document*, nsString is larger. + +struct txKeyValueHashEntry : public PLDHashEntryHdr +{ + txKeyValueHashEntry(const void* aKey) + : mKey(*(const txKeyValueHashKey*)aKey) NS_STATIC_CAST? + { + } + + // @see nsDoubleHashtable.h + const void* GetKey(); + PRBool MatchEntry(const void* aKey) const; + static PLDHashNumber HashKey(const void* aKey); + + txKeyValueHashKey mKey; + NodeSet mNodeSet; +}; I don't think that we rely on the pointer to the nodeset being constant. We hand out pointers to the nodeset, but do we use them past their bedtime. pldhash shuffles entries around, so please add a comment to getKeyNodes() to use the returned nodeset only up to the next call. + +DECL_DHASH_WRAPPER(txKeyValueHash, txKeyValueHashEntry, txKeyValueHashKey&); + +class txIndexedKeyHashKey +{ +public: + txIndexedKeyHashKey(txExpandedName aKeyName, + Document* aDocument) + : mKeyName(aKeyName), + mDocument(aDocument) + { + } + + txExpandedName mKeyName; + Document* mDocument; +}; + +struct txIndexedKeyHashEntry : public PLDHashEntryHdr +{ + txIndexedKeyHashEntry(const void* aKey) + : mKey(*(const txIndexedKeyHashKey*)aKey), NS_STATIC_CAST? + mIndexed(PR_FALSE) + { + } + + // @see nsDoubleHashtable.h + const void* GetKey(); + PRBool MatchEntry(const void* aKey) const; + static PLDHashNumber HashKey(const void* aKey); + + txIndexedKeyHashKey mKey; + PRBool mIndexed; +}; + +DECL_DHASH_WRAPPER(txIndexedKeyHash, txIndexedKeyHashEntry, + txIndexedKeyHashKey&); + +/* + * Class representing an <xsl:key>. Or in the case where several <xsl:key>s + * have the same name one object represents all <xsl:key>s with that name + */ Better comment. Don't have two cases what this class would store. "Class holding all <xsl:key>s of a particular expanded name in the stylesheet and all its includes and imports." or so. +class txXSLKey : public TxObject { + +public: + ~txXSLKey(); + + /** + * Adds a match/use pair. Returns MB_FALSE if matchString or useString + * can't be parsed. + * @param aMatch match-pattern + * @param aUse use-expression + * @return MB_FALSE if an error occured, MB_TRUE otherwise + */ + MBool addKey(txPattern* aMatch, Expr* aUse); errh. String or expressions? Comment again. + + /** + * Indexes a document and adds it to the hash of key values + * @param aDocument Document to index and add + * @param aKeyName Name of this key + * @param aKeyValueHash Hash to add values to + * @param aPs ProcessorState to use for XPath evaluation + */ + nsresult indexDocument(Document* aDocument, + const txExpandedName& aKeyName, + txKeyValueHash& aKeyValueHash, ProcessorState* aPs); The keyName should be a member of txXSLKey, and not an argument. It's not like this txXSLKey should index or hash any other keyname than the right one. + +private: + /** + * Recursively searches a node, its attributes and its subtree for + * nodes matching any of the keys match-patterns. + * @param aNode Node to search + * @param aKey Key to use when adding into the hash + * @param aKeyValueHash Hash to add values to + * @param aPs ProcessorState to use for XPath evaluation + */ + nsresult indexTree(Node* aNode, txKeyValueHashKey& aKey, + txKeyValueHash& aKeyValueHash, ProcessorState* aPs); + + /** + * Tests one node if it matches any of the keys match-patterns. If + * the node matches its values are added to the index. + * @param aNode Node to test + * @param aKey Key to use when adding into the hash + * @param aKeyValueHash Hash to add values to + * @param aPs ProcessorState to use for XPath evaluation + */ + nsresult testNode(Node* aNode, txKeyValueHashKey& aKey, + txKeyValueHash& aKeyValueHash, ProcessorState* aPs); + + /** + * represents one match/use pair + */ + struct Key { + txPattern* matchPattern; + Expr* useExpr; + }; + + /** + * List of all match/use pairs. The items as |Key|s + */ + List mKeys; +}; + + +class txKeyHash +{ +public: + txKeyHash(txExpandedNameMap& aKeys) + : mKeys(aKeys) + { + } + + nsresult init(); + + nsresult getKeyNodes(const txExpandedName& aKeyName, + Document* aDocument, + const nsAString& aKeyValue, + PRBool aIndexIfNotFound, + ProcessorState* aPs, + NodeSet** aResult); + +private: + // Hash of all indexed key-values + txKeyValueHash mKeyValues; + + // Hash showing which keys+documents has been indexed + txIndexedKeyHash mIndexedKeys; + + // Map of txXSLKeys + txExpandedNameMap& mKeys; +}; + + +#endif //TRANSFRMX_TXKEY_H <...>
Attachment #114186 - Flags: review?(axel) → review+
Comment on attachment 114186 [details] [diff] [review] version 2 >Index: source/xslt/txXSLTPatterns.cpp >=================================================================== >@@ -435,8 +435,8 @@ > contextDoc = (Document*)aNode; > else > contextDoc = aNode->getOwnerDocument(); >- txXSLKey* key = mProcessorState->getKey(mName); >- const NodeSet* nodes = key->getNodes(mValue, contextDoc); >+ NodeSet* nodes = 0; I'd be happier if this were const. >+ mProcessorState->getKeyNodes(mName, contextDoc, mValue, PR_TRUE, &nodes); >Index: source/xslt/functions/txKey.h >=================================================================== >+#ifndef TRANSFRMX_TXKEY_H >+#define TRANSFRMX_TXKEY_H I know this is how Transformiix include guards are now, but I would eventually move to txFoo_h__. If you do want to keep this, at least use TX_. >+struct txKeyValueHashEntry : public PLDHashEntryHdr >+{ >+ txKeyValueHashEntry(const void* aKey) >+ : mKey(*(const txKeyValueHashKey*)aKey) NS_STATIC_CAST(const txKeyValueHashKey*, aKey) >+struct txIndexedKeyHashEntry : public PLDHashEntryHdr >+{ >+ txIndexedKeyHashEntry(const void* aKey) >+ : mKey(*(const txIndexedKeyHashKey*)aKey), NS_STATIC_CAST(const txIndexedKeyHashKey*, aKey) >+DECL_DHASH_WRAPPER(txIndexedKeyHash, txIndexedKeyHashEntry, >+ txIndexedKeyHashKey&); >+ >+/* Use /** >+ * Class representing an <xsl:key>. Or in the case where several <xsl:key>s >+ * have the same name one object represents all <xsl:key>s with that name Period at the end of the sentence. >+ */ >+class txXSLKey : public TxObject { >+ MBool addKey(txPattern* aMatch, Expr* aUse); Please use PRBool/PR_TRUE/PR_FALSE now. >+ nsresult indexDocument(Document* aDocument, >+ const txExpandedName& aKeyName, >+ txKeyValueHash& aKeyValueHash, ProcessorState* aPs); Long line. Want to run it through the jst sr simulacrum and fix anything it whines about? >Index: source/xslt/functions/txKeyFunctionCall.cpp >=================================================================== >+nsresult >+txKeyHash::getKeyNodes(const txExpandedName& aKeyName, >+ // we didn't find a value. This could either mean that that key has no Capitalize the first w >+ // nodes with that value or that the key hasn't been indexed using this >+ // document. >+ >+ if (!aIndexIfNotFound) { >+ // if aIndexIfNotFound is set then the caller knows this key is >+ // indexed, so don't bother investigating Capitalize first letter, period at the end. There's other places like this. >+ // Now that the key is indexed we can get it's value its >+nsresult txXSLKey::testNode(Node* aNode, txKeyValueHashKey& aKey, >+ txKeyValueHash& aKeyValueHash, ProcessorState* aPs) > { >+ aKey.mKeyValue.Assign(val); >+ txKeyValueHashEntry* entry = aKeyValueHash.AddEntry(aKey); >+ NS_ENSURE_TRUE(entry, NS_ERROR_OUT_OF_MEMORY); >+ >+ if (entry->mNodeSet.isEmpty() || >+ entry->mNodeSet.get(entry->mNodeSet.size()-1) != Spaces around - >+ aNode) { Why the check for aNode? >+ >+ aKey.mKeyValue.Assign(val); >+ txKeyValueHashEntry* entry = aKeyValueHash.AddEntry(aKey); >+ NS_ENSURE_TRUE(entry, NS_ERROR_OUT_OF_MEMORY); >+ >+ if (entry->mNodeSet.isEmpty() || >+ entry->mNodeSet.get(entry->mNodeSet.size()-1) != aNode) { Spaces around -
Attachment #114186 - Flags: superreview?(peterv) → superreview+
> Why the check for aNode? see comment 8
Status: NEW → ASSIGNED
checked in
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
mass verifying
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: