Closed Bug 65986 Opened 24 years ago Closed 23 years ago

Need to implement the key() XSLT function

Categories

(Core :: XSLT, defect)

defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla0.9.2

People

(Reporter: peterv, Assigned: sicking)

References

()

Details

Attachments

(18 files)

18.76 KB, patch
Details | Diff | Splinter Review
7.45 KB, text/plain
Details
18.09 KB, patch
Details | Diff | Splinter Review
20.86 KB, patch
Details | Diff | Splinter Review
7.46 KB, text/plain
Details
21.60 KB, patch
Details | Diff | Splinter Review
7.16 KB, text/plain
Details
7.20 KB, text/plain
Details
20.93 KB, patch
Details | Diff | Splinter Review
7.34 KB, text/plain
Details
20.59 KB, patch
Details | Diff | Splinter Review
7.36 KB, text/plain
Details
21.97 KB, patch
Details | Diff | Splinter Review
7.53 KB, text/plain
Details
22.24 KB, patch
Details | Diff | Splinter Review
7.73 KB, text/plain
Details
22.26 KB, patch
Details | Diff | Splinter Review
7.74 KB, text/plain
Details
We need to implement the key() XSLT extension function
(http://www.w3.org/TR/xslt.html#function-key) in Transformiix.
Blocks: 63906
Target Milestone: --- → mozilla0.8
Hi, Max, Keith, Dan,

to implement xsl:key (a prerequisite) we have had a little discussion in
#transformiix.

The spec in 12.2 is very vague on what exactly is regarded "this node matches
the |match| attribute", is there some WorkingGroup internal understanding of 
this?
We guessed, that we have to iterate over the complete doc, and see if |match|
matches. Is that including text, attribute, comment or even namespace nodes? Or
is there a principal node type for this search?

Thanx

Axel
Axel: the spec just says "nodes," and I believe part of the design
considerations for the |key| functionality is to remedy the problem that in XML
only elements may have unique identifiers. I think what you can do is something
like: first create a simple table of <xsl:key>s indicated in the stylesheet,
then make a pass over the document and create a table of nodes with keys based
on the rules in the spec. When you encounter key() functions later, you have a
table to work with... Hopefully there's some intelligent way to handle the
initial pass -- I don't know what transformiix does in general here...

To directly answer the question, though, when they say "this node matches the
|match| attribute," that attribute is a path expression like any other. If
you're in the node set resulting from evaluation of that expression, you match
the attribute.
I don't think an initial pass is done at all. After reading the spec, I 
think that when the key() function is encountered the xsl:key actions are 
evaluated using the current context node and the given arguments to the key() 
function. The context node is important for evaluating the match pattern.

I am going to try and implement something this weekend for this. Hopefully we 
can use the Xalan tests for keys to make sure we are at least conforming to 
them. I don't know if they even have a test for this. 

--Keith
Hmmm...ok after reading it a couple more times I think Dan is right about an 
initial pass. I think the match attribute needs to be evaluated against all 
nodes in the document. 

Seems kind of nasty. I used to have to do this with the Id() function when it 
contained some of the functionality which Keys were introduced for. I suppose 
what we could do is once we enounter the first use of the key() function we 
process the entire document and create the table for the keys?

Any better suggestions?

--Keith
One possible way to ameliorate this slightly might be to do the pass lazily.
That is, I think we understand that to use key() you have to have created keys
for the whole source document, so you either have to do a whole pass of the
source document matching against the keys described, or a whole pass of the keys
described (which, if you have some sort of super-clever XPath implementation,
might be more efficient, but...). But you can avoid doing either of those if you
never use the key() function. So your implementation could lazily construct the
table of keys.

Not super-clever, but whatever. It'll help. Peter, are you taking this bug?
The keys defenately has to be build lazily since it's not possible to know what 
document to step through until the key() function is called. One <xsl:key> 
could even require searching through multiple documents.

All this is because of the syntax for referring keys to external documents. 
IMHO the syntax for making a key refer to a external document should be:

<xsl:key match="document('books.xml')/Books/Book" value="Name">

I would like to start working on this ASAP, is it ok if I take this bug?
just to clarify:
I would like to start working on *this bug* ASAP, is it ok if I take it?
ok, freaks, let's get serious on this.
I'm going to remove Max, he's getting enough spam.

The dumb me didn't read the spec. Sorry for that. I added the relevant URL to
the header.

Patterns:
Axes are only child or attribute (HEY, NO NAMESPACENODES)
no functions! (no document() for that case)
Context nodes are the matching node or one of it's ancestors.

More contraints in the link, but these are the ones we have been 
misunderstanding the most.

sicking, I'll check out the format-number stuff first, I'm afraid there are 
still some things there ;-)

Axel
Been looking at this again, we can have calls to the functions id() and key()
at the start of a pattern.
The structure looks so simplified, does this call for a new set of objects?
With hard wired pattern logic? (we can use a Expression, but we won't do 
any error handling then, plus waste some optimisations perhaps)

Axel
We need at least another version of the FunctionCall class that inherits 
PatternExpr instead of Expr.
Why?
OK, I still need to read further in the specs, and the code, but:

I see we have a ExprParser::createPattern, but to me this looked relaxing the
constraints on patterns quite a bit.
It looks like we could optimize the matching there, too.
At least at second glance (the first is way over, and as all can see, miserably
failed)

Keith?
Ok we don't need another FunctionCall class, but we need to make FunctionCall 
inherit from PatternExpr rather then Expr since functions can be part of a 
pattern as well as an expression. We could probobly make FunctionCall::matches 
always return MB_FALSE and then override it for the classes needed (id and key)
I fail to see when you'd ever call "matches" on a function call. I don't think 
it's necessary to inherit from PatterExpr. A FunctionCall is not a pattern. It's 
evaluated as an expression. Please provide more information on why you want to 
make this change, because I fail to see why you need to.

--Keith
as Axel found (see bug url) the following are valid patterns:

id('some-id')//Person
key('mozProjects','xslt')/Bugs/Bug[@num=65986]
It'll work already, you don't need to make it extend pattern. We already use 
document() in patterns. So it should work properly as is. When you create a 
pattern expression it's basically returns a UnionExpr, which is one or more 
PathExprs, etc.  

--Keith
There is also the problem of variables and key() calls not allowed in 
match="..." and use="...". I suggest we extend the ContextState class with the 
following functions:

ContextState::disableVaiables(MBool disable)
ContextState::disableFunction(String functionName, MBool disable)

and temporarily disables key() and all variables when the match="..." and 
use="..." are evaluated
Moving forward.
Target Milestone: mozilla0.8 → mozilla0.9
I've begun working on this...
First version of implementation. I had to change the Map class to only delete 
it's contained values and not it's keys. But that should cause any problems 
since it seems the class isn't used anywhere in the code. Ofcource I could 
extend the Map class so you can choose whether it should delete  it's 
keys+values/it's values/nothing. 

On request from Pike I haven't done anything about key() functioncalls and 
variable references in match="..." and use="..."
Keywords: review
I wrote the map for use in the event based API I have locally. Please do not 
change it. Perhaps we could add a flag to prevent deleting of keys, but don't 
change the default behavior.

three things are new in ver 2:
* Map::setObjectDeletion take a short instead of MBool
* keynames are validated to be QNames
* moved some logic from KeyFunctionCall::evaluate to XSLKey::getNodes
I just noticed a small bug in the Map-code patch. The lines that say

if ( deleteObjects == itemsOnly ) {
   tItem->item = 0;
   delete tItem->key;
}

should of course be:

if ( deleteObjects == itemsOnly ) {
   delete tItem->item;
   tItem->key = 0;
}

DOH!
Sicking has the patch, so handing to him. He needs an r (and eventually sr).
Assignee: peterv → sicking
I haven't looked at the patch yet, but comments on KeyFunctionCall.cpp so far:

const NodeSet EMPTY_NODESET;

Dangerous, IMHO. This is a global constant, what happens if any expr appends a 
node to that? Just return a newly created empty nodeset. (Do would expect the
caller to free the returned nodeset?)

Please get rid of the cout. Not only are there serious spelling errors in your
output, they are there for optim builds as well. Not good.

Fix the comments, ::evaluate has the standard comment, which is pretty useless.
XSLKey::XSLKey has parts of the comment for ::getNodes :-(.

More to come, as soon as I'm awake enough to really understand what we try to
do here ;-)

Axel
The EMPTY_NODESET is never returned from an ::evaluate. XSLKey::getNodes 
returns it when no matching key is found, KeyFunctionCall::evaluate always 
makes a new NodeSet and copies nodes into that. Also, XSLKey::getNodes returns 
a |const NodeSet*| so no function should ever be able to append a node to that.

cout removed (it was never supposed to be in there, sorry)

Comments fixed
Status: NEW → ASSIGNED
have to push this :(
will hopefully go in soon after 0.9.1 opening
Target Milestone: mozilla0.9 → mozilla0.9.1
Ver 3.1 only changes KeyFunctionCall::evaluate to do
|NodeSet* res = new NodeSet;| at the beginning and then |return res;| instead 
of |return new NodeSet;|
Pike: any chance of a review of this before 0.9.1? The patch has been around 
for almost three months now...
pushing...
Target Milestone: mozilla0.9.1 → mozilla0.9.2
reassigning to peterv for review
Assignee: sicking → peterv
Status: ASSIGNED → NEW
back to me
Assignee: peterv → sicking
Mostly nits.

+    xslKeys.setObjectDeletion(MB_TRUE);
keysAndItems instead of MB_TRUE?

+     * Adds the supplied xsl:key to the set of keys
+     * returns NULL if no such key exists
+    **/
+    XSLKey* getKey(String& keyName);
Er. I think that comment is wrong?

     DocumentFunctionCall(ProcessorState* ps, Document* xslDocument);
 
     /**
-     * Evaluates this Expr based on the given context node and processor state
+     * Evaluates a key() xslt-functioncall. First argument is name of key
+     * to use, second argument is value to look up.
I think you changed the wrong comment.

+     * Adds an match/use pair. Returns MB_FALSE if matchString or useString
... a match/use pair

+     * Indexes an document and adds it to the set of indexed documents
... a document

+     * Recursivly searches a node, it's attributes and it's subtree for
Recursively searches a node, its attributes and its subtree for

+     * the node matches it's values are added to the index.
its

+     * ProcessorState used to get parse the match-patterns and
... to parse ...

In KeyFunctionCall.cpp:

    if(!res) {
...
    if ( requireParams(2, 2, cs) ) {
...
        if (!key) {

Please be consistent.

    if (!map)
        map = addDocument(doc);
...
    if(!nodes) return &EMPTY_NODESET;

Same. I prefer
if (test)
    statement
but it's up to you.

 * Adds an match/use pair. Returns MB_FALSE if matchString or useString
... a match/use

 * Indexes an document and adds it to the set of indexed documents
Indexes a document

 * Recursivly searches a node, it's attributes and it's subtree for
Recursively

    //-- check if nodes attributes matches
check if node's attributes match

etc.

The code is good, please fix the spelling errors and the inconsistencies in
coding style. It would be nice to prefix the class names (as we were planning to
do) and I prefer prefix m for member vars and a for arguments, but seeing that
we nearly never use those in Transformiix code, it's not a showstopper for me.
grmbl, that last should be "txKeyFunctionCall.cpp ver 4"...

I've addressed all comments except these:

> +    xslKeys.setObjectDeletion(MB_TRUE);
> keysAndItems instead of MB_TRUE?

Nope, xslKeys is a NamedMap not a Map.

>     //-- check if nodes attributes matches
> check if node's attributes match

genetive s shouldn't be apostrophed I think. But I see now that there should be 
an 'the' in there. Added it in my tree.
Status: NEW → ASSIGNED
Had a quick look, last bits:

+     * The set of all avalible keys
available

+ *    -- added XSLKey class
txXSLKey. There's more comments where you haven't added the tx to XSLKey or
XSLKeyFunctionCall. Not critical to have but it would be nice :).

@@ -48,7 +53,6 @@
     DocumentFunctionCall(ProcessorState* ps, Document* xslDocument);
 
     /**
-     * Evaluates this Expr based on the given context node and processor state
Don't remove the comment for DocumentFunctionCall (though I agree it doesn't say
a lot)

There's some tabs left in txKeyFunctionCall.cpp.

r=peterv if you fix those.
all comments addressed. Will go hunt for sr and a
ouch, I have comments:
you made a enum out of objectDeletion, which is nice, but use the enum, and
not short as type. And the use of defines as values for the enum scares me.
we don't really need the MBool compat, do we?
mapObjectDeletion? well, I don't come up up with something better. The values
should be prefixed though, none -> eOwnsNone, eOwnsKeysAndItems, 
eOwnsItemsOnly. (I had a name conflict with my own writings in there already)

The {} for methods in txKeyFunctionCall.cpp start a new line in one method, but
don't in the others, I guess we should go for
foo:bla
{
}

Axel
I now have
    enum txMapOwnership
    {
        eOwnsNone,
        eOwnsKeysAndItems,
        eOwnsItemsOnly
    };
and 
    void  setObjectDeletion(txMapOwnership aOwnership)

Peter just reminded me of Keith's comment, Keith, do you *really* need the
interface being compatible to MBool? Or could you live with this? 
It looks like a good trade, strong type checking and code readability vs. a 
(few ?) modifications in a patch

Axel
The names in the enum below really should be changed to something that makes it
more ovious from looking at the names that they're enums, like
eMapObjectDeletion_None, ...

+    enum mapObjectDeletion {
+        none = MB_FALSE,
+        keysAndItems = MB_TRUE,
+        itemsOnly

In the new .cpp file:

        ListIterator* iter = params.iterator();

        String keyName;
        evaluateToString((Expr*)iter->next(), aContext, aCs, keyName);
        Expr* param = (Expr*) iter->next();
        delete iter;

could be written like this to avoid the possibility of incorrectly using the
deleted ListeIterator 'iter' further down in the file:

        String keyName;
        Expr* param;

        {
            ListIterator* iter = params.iterator();

            evaluateToString((Expr*)iter->next(), aContext, aCs, keyName);
            param = (Expr*) iter->next();
            delete iter;
        }

In txXSLKey::addKey() and ::addDocument():

    Key* key = new Key;
    key->matchPattern = mProcessorState->getPatternExpr(aMatchString);
 
No out-of-memory check here?

There's also a bunch of other new's that are missing the checks for out of memory.

Other than that, sr=jst
All comments adressed (both jsts and pikes)

We sloved the backwards complience problem and nameing problem by creating a 
new function, Map::setOwnership, and forwarding calls to Map::setObjectDeletion 
to that. I also removed Map::clear(doObjectDeletion) since it's a rather ugly 
function anyway.
The following things has changed since ver 5 (the one jst looked at)
1. All |new| calls has null checks
2. Comment changes
3. Whitespace fixes. Do { after class::function at new line.
4. The function Map::setObjectDeletion is depreptiated in favour of
   Map::setOwnership. Calls to setObjectDeletion is forwarded to setOwnership
5. The function Map::clear(MBool deleteObjects) is removed.
6. Removed static constructor by moving EMPTY_NODESET into txXSLKey class
7. Interators are allocated on stack so they are no longer |delete|ed
8. Renamed enum constants
9. Added code to set correct current-node/nodeset. The code is added in
   txXSLKey::testNode
r=me. Now that we have yet another feature, we have yet another xalan test 
depending on landing 75102 (xpath parser).

Axel
Keywords: review
addressed irc-comments from jst:

1. comment change in description of txXSLKey class
2. making Map::txMapOwnership enum values more flag-ish
sr=jst
r=peterv
checked in. Leaving this open to try to get on the 0.9.2 branch
After some pain this is actually checked in...
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
bitching buttons, verfication spam
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: