Closed Bug 206445 Opened 21 years ago Closed 21 years ago

ExprLexer, ExprParser should return nsresult on errors

Categories

(Core :: XSLT, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: axel, Assigned: axel)

References

(Blocks 1 open bug)

Details

(Keywords: arch)

Attachments

(1 file, 1 obsolete file)

'nuff said. Almost.
How granular should the nsresults be? Is NS_ERROR_XPATH_PARSE_FAILED detailed
enough or should we provide more messages, starting off from the lexer error 
codes in
http://lxr.mozilla.org/seamonkey/source/extensions/transformiix/source/xpath/ExprLexer.h#176

As this patch is going to whack both ExprParser and ExprLexer hard, should I do
the "just go for txExprLexer/Parser" dance? We can decide that after a first 
patch, if that comforts anybody.
I have a tree for this, taking.
Status: NEW → ASSIGNED
If you're including my fix to not create avt-objects when we only have one
sub-object, you might as change |new AttributeValueTemplate| to |new
StringFunctionCall(StringFunctionCall::CONCAT)|. That way we can get rid of the
AVT class entierly.
This patch does not change avts and patterns yet. If you want me to, I'll add
those. But they could safely go to another patch.
Maybe apart from the error numbers, as I'm tempted to remove 
NS_ERROR_XPATH_PARSE_FAILURE once I'm thru with avts and patterns.

But I'd like to get some comments on this patch, too, it's rather big already.
note, I removed the errorCode stuff and the enum from ExprLexer locally.
note, as bug 210528 showed, passing nsAutoPtr by value doesn't cut a slice of
bread, so I changed those to references. I changed the
nsAutoPtr<class> foo = new Class();
pattern to
nsAutoPtr<class> foo(new Class());
too, as the first doesn't compile on gcc3.3.
I fixed quite a few comments that I got on the various versions of that tree
that
linger around.

One is to move the nsAutoPtr to nsAutoPtr& in constructors. Another one is to
step back from using nsAutoPtr in signatures and using .forget() in the caller
to indicate passing of ownership.
This is mostly for the addFoo methods. I left the nsAutoPtr& signature for
ExprParser::createBinaryExpr as that is
1) private
2) hands ownership directly over to constructors.

I kept from over-prettyfying the code, to keep patchsize down. One thing in
this
direction is that I didn't do lexer -> aLexer. That'd more or less would take
out the diff completely. Same for a bunch of op -> mOp and friends in the 
Expr classes.
(There is one occasion in ExprParser::createPathExpr, where I didn't compact
the line as much as I could. I prefer
    if (tok->mType != Token::PARENT_OP &&
	tok->mType != Token::ANCESTOR_OP) {
over
    if (tok->mType != Token::PARENT_OP && tok->mType != Token::ANCESTOR_OP) {
as you can tell the two compares more easily. IMHO.)

I'm a little inconsistent about inlining constructors for Expr objects. Vote on

me.
Attachment #125841 - Attachment is obsolete: true
Attachment #131089 - Flags: superreview?(peterv)
Attachment #131089 - Flags: review?(bugmail)
Further comments:
I could prefix Lexer, ExprParser and Token with tx, if we want to.
I could remove the end-of-function comments (// -- doFoo), too.

Re the goto, yes goto is bad. But I kinda liked putting the null check and
the addToken stuff in one place. So there is no loop or switch to break; out of.
I need to stay in the loop to handle the following tokens. I'd like to keep
those gotos then, so that newToken == mLastItem is really an error indicator.
Comment on attachment 131089 [details] [diff] [review]
new attempt on this stuff

>Index: source/base/txError.h
>===================================================================

>+#define NS_ERROR_XPATH_BAD_BANG                        \
>+    NS_ERROR_GENERATE_FAILURE(NS_ERROR_MODULE_XSLT, 24)
>+
>+#define NS_ERROR_XPATH_ILLEGAL_CHAR          \

Line up the backslash.

>Index: source/xpath/ExprLexer.cpp
>===================================================================

>+PRBool
>+ExprLexer::nextIsOperatorToken(Token* aToken)

>+  if (aToken->mType >= Token::COMMA && 
>+      aToken->mType <= Token::UNION_OP) {
>+    return PR_FALSE;
>+  }
>+  return PR_TRUE;

return aToken->mType < Token::COMMA ||
       aToken->mType > Token::UNION_OP;

>+        NS_ConvertUCS2toUTF8 opUTF8(Substring(start, mPosition));

Use NS_ConvertUTF16toUTF8, also add a comment that we should be able to drop
this conversion once we have a non-copying EqualsUTF8 (haven't found a bug on
that yet :-().

>       case PERIOD:
>         // period can be .., .(DIGITS)+ or ., check next
>+        if (++mPosition == end) {
>+          newToken = new Token(mPosition - 1, Token::SELF_NODE);
>+          NS_ENSURE_TRUE(newToken, NS_ERROR_OUT_OF_MEMORY);
>+          addToken(newToken);
>+          return NS_OK;
>+        }
>+        if (isXPathDigit(*mPosition)) {
>+          start = mPosition-1;

Spaces around -

>       case COLON: // QNames are dealt above, must be axis ident
>-        if (++iter < size && pattern.CharAt(iter) == COLON &&
>-            prevToken->type == Token::CNAME) {
>-          prevToken->type = Token::AXIS_IDENTIFIER;
>-          ++iter;
>+        if (++mPosition < end && *mPosition == COLON &&
>+            prevToken->mType == Token::CNAME) {
>+          prevToken->mType = Token::AXIS_IDENTIFIER;
>+          ++mPosition;
>+          goto noToken;
>         }
>         else {

No else after goto.

>       case L_PAREN:
>+        if (prevToken->mType == Token::CNAME) {
>+          NS_ConvertUCS2toUTF8 utf8Value(prevToken->Value());

See above.

>Index: source/xpath/ExprParser.cpp
>===================================================================

>+nsresult
>+ExprParser::createFunctionCall(ExprLexer& lexer, txIParseContext* aContext,
>+                               Expr** aResult)

>         if (NS_FAILED(rv)) {
>-            return 0;
>+            if (namespaceID == kNameSpaceID_None) {
>+                // check fcp, bug XXX
>+                NS_ASSERTION(0, "unknown function call");

NS_ERROR

>+nsresult
>+ExprParser::createUnionExpr(ExprLexer& lexer, txIParseContext* aContext,
>+                            Expr** aResult)

>+    nsAutoPtr<UnionExpr> unionExpr(new UnionExpr());
>+    NS_ENSURE_TRUE(unionExpr, NS_ERROR_OUT_OF_MEMORY);

Add a newline here.

>+    rv = unionExpr->addExpr(expr.forget());
>+    NS_ENSURE_SUCCESS(rv, rv);
>+
>+    while (lexer.peek()->mType == Token::UNION_OP) {
>         lexer.nextToken(); //-- eat token
> 
>-        expr = createPathExpr(lexer, aContext);
>-        if (!expr) {
>-            delete unionExpr;
>-            return 0;
>-        }
>-        unionExpr->addExpr(expr);
>+        rv = createPathExpr(lexer, aContext, getter_Transfers(expr));
>+        NS_ENSURE_SUCCESS(rv, rv);

Add a newline here.

>+        rv = unionExpr->addExpr(expr.forget());
>+        NS_ENSURE_SUCCESS(rv, rv);
>+nsresult
>+ExprParser::parsePredicates(PredicateList* aPredicateList,
>+                            ExprLexer& lexer, txIParseContext* aContext)

>+        rv = createExpr(lexer, aContext, getter_Transfers(expr));
>+        NS_ENSURE_SUCCESS(rv, rv);

Add a newline here.

>+        rv = aPredicateList->add(expr.forget());
>+        NS_ENSURE_SUCCESS(rv, rv);

There's more cleanup to be done but we can do that later.
Attachment #131089 - Flags: superreview?(peterv) → superreview+
> Use NS_ConvertUTF16toUTF8, also add a comment that we should be able to drop
> this conversion once we have a non-copying EqualsUTF8 (haven't found a bug on
> that yet :-().

I'm not convinced that that would buy us anything. Comparing strings with unknown
length might cost more than the conversion into the nsAutoString.
(Equals() checks for the length first, which should be the easy way out most of 
the time, which doesn't work for a non-copying EqualsUTF8.)

Fixed the other comments locally.
Comment on attachment 131089 [details] [diff] [review]
new attempt on this stuff

>+     * @return nsresult indicating out of memory

I'm not sure I like that description, it sounds like any errorvalue indicates
OOM. Not a reviewrequirement though.

There are still indentation-issues in Expr.h (Same as in bug 210528?)

>Index: source/xpath/ExprLexer.cpp

>+ExprLexer::ExprLexer()
>+  : mCurrentItem(nsnull),
>+    mFirstItem(nsnull),
>+    mLastItem(nsnull),
>+    mTokenCount(0),
>+    mEndToken('\0', Token::END)

which token ctor does this use for mEndToken?

> ExprLexer::~ExprLexer()
> {
>   //-- delete tokens
>-  currentItem = firstItem;
>-  while (currentItem) {
>-    TokenListItem* temp = currentItem->next;
>-    delete currentItem->token;
>-    delete currentItem;
>-    currentItem = temp;
>+  mCurrentItem = mFirstItem;
>+  while (mCurrentItem) {
>+    Token* temp = mCurrentItem->mNext;
>+    delete mCurrentItem;
>+    mCurrentItem = temp;

Nit: it'll probably be faster if you iterate using a local variable since that
would probably be compiled into a register whereas this code will have to set
the member all the time.

>+    else if (isXPathDigit(*mPosition)) {
>+      start = mPosition;
>+      while (++mPosition < end && isXPathDigit(*mPosition)) {
>+        /* just go */;

Don't end comments with a ';'

>+      }
>+      if (mPosition < end && *mPosition == '.') {
>+        while (++mPosition < end && isXPathDigit(*mPosition)) {
>+          /* just go */;

Same here


>       case SPACE:
>       case TX_TAB:
>       case TX_CR:
>       case TX_LF:
>-        ++iter;
>+        ++mPosition;
>+        goto noToken;
>         break;

Eek!

>+        if (mPosition == end) {
>+          return NS_ERROR_XPATH_UNCLOSED_LITERAL;
>         }

You might want to set mPosition to start before returning to give a better
errorposition.


>+        newToken = new Token(start + 1, mPosition, Token::LITERAL);
>+        ++mPosition;

>       case PERIOD:
>         // period can be .., .(DIGITS)+ or ., check next
>-        if (++iter < size) {
>-          ch=pattern.CharAt(iter);
>-          if (isXPathDigit(ch)) {
>-            start=iter-1;
>-            while (++iter < size && 
>-                   isXPathDigit(pattern.CharAt(iter))) /* just go */;
>-            addToken(new Token(Substring(pattern, start, iter - start),
>-                               Token::NUMBER));
>-          }
>-          else if (ch==PERIOD) {
>-            addToken(new Token(Substring(pattern, ++iter - 2, 2),
>-                               Token::PARENT_NODE));
>+        if (++mPosition == end) {
>+          newToken = new Token(mPosition - 1, Token::SELF_NODE);
>+          NS_ENSURE_TRUE(newToken, NS_ERROR_OUT_OF_MEMORY);
>+          addToken(newToken);
>+          return NS_OK;
>+        }
>+        if (isXPathDigit(*mPosition)) {

Couldn't you drop the last three lines in the first |if| and change the second
|if| to |else if|? That would make the code more homogen.

>+        if (++mPosition < end && *mPosition == EQUAL) {
>+          ++mPosition;
>+          newToken = new Token(mPosition - 2, mPosition, Token::NOT_EQUAL_OP);
>+          break;
>         }
>-        break;
>+        // Error ! is not not()
>+        return NS_ERROR_XPATH_BAD_BANG;

Please invert the if-condition and make it an early-return rather then put the
break inside the |if| to follow how you do this elsewhere.

>       case R_ANGLE:
>-        if (++iter < size && pattern.CharAt(iter) == EQUAL) {
>-          addToken(new Token(Substring(pattern, ++iter - 2, 2),
>-                             Token::GREATER_OR_EQUAL_OP));
>+        if (++mPosition == end) {
>+          return NS_ERROR_XPATH_UNEXPECTED_END;
>+        }
>+        if (*mPosition == EQUAL) {
>+          ++mPosition;
>+          newToken = new Token(mPosition - 2, mPosition,
>+                               Token::GREATER_OR_EQUAL_OP);
>+        }

There's not really any need to return an error there. You can just change the
|if| to
|if (++mPosition < end && *mPosition == EQUAL)|.

Same for L_ANGLE.

>+    NS_ENSURE_TRUE(newToken, NS_ERROR_OUT_OF_MEMORY);
>+    NS_ENSURE_TRUE(newToken != mLastItem, NS_ERROR_FAILURE);

When can newToken == mLastItem?

>+    prevToken = newToken;
>+    addToken(newToken);
>+  noToken: ;

I don't like the goto at all. Just use a flag to only conditionally test
newToken for nullness and call addToken.

That's it for the lexer, i'll continue with the patch tomorrow.
> which token ctor does this use for mEndToken?
it's the char* one, so I prolly should use nsnull. As we never get the value
of the end token, those pointers may be whatever they like.

> There's not really any need to return an error there.
It doesn't hurt to error once we know there's an error. I need to check what 
"3 > " gives, though. That might cause trouble in the parser. From looking at
the code, createPathExpr on an empty lexer doesn't fail gracefully. Let alone
return the right error in the case of createBinaryExpr.

> When can newToken == mLastItem?
When some of the logic fails. I could use this as an indicator for noToken,
though. But that removes a redundancy from that code, which is good for catching
errors. I don't like the flag-stuff to get rid of the goto, btw.
> > There's not really any need to return an error there.
> It doesn't hurt to error once we know there's an error. I need to check what 
> "3 > " gives, though. That might cause trouble in the parser. From looking at
> the code, createPathExpr on an empty lexer doesn't fail gracefully. Let alone
> return the right error in the case of createBinaryExpr.

The parser does the right thing for both tokens from "3 >" and an empty lexer so
relying on the parser should work just fine.

> > When can newToken == mLastItem?
> When some of the logic fails. I could use this as an indicator for noToken,
> though. But that removes a redundancy from that code, which is good for
> catching errors.

Sounds like you should do an assertion rather then or in addition to an ENSURE?

> I don't like the flag-stuff to get rid of the goto, btw.
why? gotos are evil and it's easy to avoid it here.
No biggie on relying on the parser btw, its just a few lines of code
> The parser does the right thing for both tokens from "3 >" and an empty lexer so
> relying on the parser should work just fine.

Sadly enough, it doesn't. If my checks on the trunk do what I expect, I'll end up
with NS_ERROR_XPATH_NO_NODE_TYPE_TEST for "3 >".
I'll need to handle END_TOKEN right at a few places.
Comment on attachment 131089 [details] [diff] [review]
new attempt on this stuff

Is ExprLexer::mTokenCount used for anything?

>Index: source/xpath/ExprParser.cpp
>@@ -121,9 +121,11 @@
>             case '}':
>                 if (inExpr) {
>                     inExpr = MB_FALSE;
>-                    ExprLexer lexer(buffer);
>-                    Expr* expr = createExpr(lexer, aContext);
>-                    if (!expr) {
>+                    ExprLexer lexer;
>+                    lexer.parse(buffer);
>+                    Expr* expr;
>+                    nsresult rv = createExpr(lexer, aContext, &expr);
>+                    if (NS_FAILED(rv)) {

Shouldn't you check the returnvalue from the lexer?

>-Expr* ExprParser::createExpr(const nsAFlatString& aExpression,
>-                             txIParseContext* aContext)
>+nsresult
>+ExprParser::createExpr(const nsAFlatString& aExpression,
>+                       txIParseContext* aContext, Expr** aExpr)
> {
>-    ExprLexer lexer(aExpression);
>-    Expr* expr = createExpr(lexer, aContext);
>-    return expr;
>+    NS_ENSURE_ARG_POINTER(aExpr);

just assert (if anything) on aExpr

>+    if (!expr) {
>+        return NS_ERROR_OUT_OF_MEMORY;
>+    }
>+    *aResult = expr;
>+    return NS_OK;
> } //-- createBinaryExpr

please use NS_ENSURE_TRUE

>-        expr = createUnionExpr(lexer, aContext);
>-        if (!expr)
>-            break;
>+        rv = createUnionExpr(lexer, aContext, getter_Transfers(expr));
>+        NS_ENSURE_SUCCESS(rv, rv);

you need to break rather then return on failure

>+                nsAutoPtr<Expr> binary;
>+                while (!exprs.isEmpty() && precedence(tok) 
>+                       <= precedence(NS_STATIC_CAST(Token*, ops.peek()))) {
>+                    // can't use expr as result due to order of evaluation
>+                    nsAutoPtr<Expr> left(NS_STATIC_CAST(Expr*, exprs.pop()));
>+                    rv = createBinaryExpr(left, expr,
>+                                          NS_STATIC_CAST(Token*, ops.pop()),
>+                                          getter_Transfers(binary));
>+                    if (NS_FAILED(rv)) {
>+                        break;
>+                    }
>+                    expr = binary;
>                 }

nit: can be better written as:

while (...) {
    nsAutoPtr<Expr> left(NS_STATIC_CAST(Expr*, exprs.pop()));
    nsAutoPtr<Expr> right(expr);
    rv = createBinaryExpr(left, right,
			  NS_STATIC_CAST(Token*, ops.pop()),
			  getter_Transfers(expr));
}


>+    while (NS_SUCCEEDED(rv) && !exprs.isEmpty()) {
>+        nsAutoPtr<Expr> left(NS_STATIC_CAST(Expr*, exprs.pop()));
>+        nsAutoPtr<Expr> binary;
>+        rv = createBinaryExpr(left, expr, NS_STATIC_CAST(Token*, ops.pop()),
>+                              getter_Transfers(binary));
>+        expr = binary;
>     }

Same here.

>+ExprParser::createFunctionCall(ExprLexer& lexer, txIParseContext* aContext,
>+                               Expr** aResult)
> {
>-    FunctionCall* fnCall = 0;
>+    *aResult = nsnull;
>+
>+    nsAutoPtr<FunctionCall> fnCall;
> 
>     Token* tok = lexer.nextToken();
>-    if (tok->type != Token::FUNCTION_NAME) {
>-        //XXX ErrorReport: error in parser, please report on bugzilla.mozilla.org
>-        return 0;
>-    }
>+    NS_ENSURE_TRUE(tok->mType == Token::FUNCTION_NAME, NS_ERROR_UNEXPECTED);

an assertion should be enough.

>+    if (namespaceID == kNameSpaceID_None) {
>+        // XXX how to handle out-of-mem here? gperf'ed local hash?

One way is to create a |PRBool checkExtensionFunction| that defaults to false
and is set to true in an else after all the function-tests and an else after
the namespace test. That way you can check for OOM if it's false and do the
extension-thing if it's true. This is optional though since we didn't do that
right before.

>         if (NS_FAILED(rv)) {
>-            return 0;
>+            if (namespaceID == kNameSpaceID_None) {
>+                // check fcp, bug XXX
>+                NS_ASSERTION(0, "unknown function call");
>+                return rv;
>+            }
>+            // Don't throw parse error, just error on evaluation
>+            fnCall = new txErrorFunctionCall(lName, namespaceID);
>         }

You only want to do this for rv = NS_ERROR_XPATH_UNKNOWN_FUNCTION, for other
NS_FAILED(rv)s you should just propagate rv. Create the errorfunction for
unprefixed functions too, that is allowed by the spec and makes us do the right
thing in fcp.

>     // check that internal functions got created properly
>-    if (!fnCall) {
>-        // XXX ErrorReport: out of memory
>-        return 0;
>-    }
>+    NS_ENSURE_TRUE(fnCall, NS_ERROR_OUT_OF_MEMORY);

This check only tests the txErrorFunctionCall allocation afaict so just move it
up there.

>-                nodeTest = createNodeTypeTest(lexer);
>-                if (!nodeTest) {
>-                    return 0;
>-                }
>+                nsAutoPtr<txNodeTypeTest> typeTest;
>+                rv = createNodeTypeTest(lexer, getter_Transfers(typeTest));
>+                NS_ENSURE_SUCCESS(rv, rv);
>+                nodeTest = typeTest.forget();

optional: you could change the signature of createNodeTypeTest to 'return' a
txNodeTest so that you can assign into |nodeTest| directly. (That's more on par
with what we do elsewhere, just return the baseclass)

>@@ -713,94 +727,92 @@
>             break;
>         default:
>             lexer.pushBack();
>-            // XXX ErrorReport: unexpected token
>-            return 0;
>-    }
>-    if (!nodeTest) {
>-        //XXX out of memory
>-        return 0;
>+            NS_NOTREACHED("NodeTypeTest expected");
>+            return NS_ERROR_XPATH_NO_NODE_TYPE_TEST;

It's not an error on our end if we end up here so don't call NS_NOTREACHED

>Index: source/xpath/FilterExpr.cpp
>-FilterExpr::FilterExpr(Expr* expr) : PredicateList() {
>-    this->expr = expr;
>+FilterExpr::FilterExpr(nsAutoPtr<Expr>& aExpr)
>+  : PredicateList(),
>+    expr(aExpr)
>+{

You can remove the PrediacteList() thing entierly.

Any reason why this ctor is kept in the cpp file while you moved most others to
Expr.h? (I prefer to keep them in the .cpp file but I don't really care either
way, though consistency is nice). Same with LocationStep and RelationalExpr.

>Index: source/xpath/FunctionCall.cpp
>+nsresult
>+txErrorFunctionCall::getNameAtom(nsIAtom** aAtom)
>+{
>+    *aAtom = mLName;
>+    NS_IF_ADDREF(*aAtom);

is NS_ADDREF enough?

>Index: source/xpath/nsXPathEvaluator.cpp
>@@ -89,9 +89,10 @@
> 
>     nsCOMPtr<nsIDocument> doc = do_QueryReferent(mDocument);
>     ParseContextImpl pContext(aResolver, !doc || doc->IsCaseSensitive());
>-    Expr* expression = ExprParser::createExpr(PromiseFlatString(aExpression),
>-                                              &pContext);
>-    if (!expression)
>+    Expr* expression;
>+    nsresult rv = ExprParser::createExpr(PromiseFlatString(aExpression),
>+                                         &pContext, &expression);
>+    if (NS_FAILED(rv) || !expression)

Do you need the nullcheck for expression?

>Index: source/xslt/txStylesheetCompileHandlers.cpp
>-    aExpr = ExprParser::createExpr(attr->mValue, &aState);
>-    if (!aExpr && (aRequired || !aState.fcp())) {
>-        // XXX ErrorReport: XPath parse failure
>-        return NS_ERROR_XPATH_PARSE_FAILURE;
>+    rv = ExprParser::createExpr(attr->mValue, &aState,
>+                                getter_Transfers(aExpr));
>+    if (NS_SUCCEEDED(rv) && !aExpr && (aRequired || !aState.fcp())) {
>+        // XXX ErrorReport: empty required attr
>+        return NS_ERROR_XSLT_PARSE_FAILURE;

|NS_SUCCEEDED(rv) && !aExpr| seems wrong, shouldn't that be |NS_FAILED(rv)|?
The failure might not be because of an empty attribute but any other parser
error too.


I also think that we shouldn't throw specific errors for different unexpected
tokens that we run into (sorry, my fault for adding those comments in the first
place). The reason is that we can't get the error right anyway so it's better
to just say 'unexpected token'. For example, is there really a missing
endparenthesis in "(foo 1)*2" or a missing operator? So i think we should
collapse NS_ERROR_XPATH_PAREN_EXPECTED, NS_ERROR_XPATH_NO_NODE_TYPE_TEST and
NS_ERROR_XPATH_BRACKET_EXPECTED into NS_ERROR_XPATH_UNEXPECTED_TOKEN. That will
also take care of proper errorreporting for "3 >" (which now is treated
differently from "3 > " and "3 >=").

r=sicking with the above fixed (including the goto ;-))
Attachment #131089 - Flags: review?(bugmail) → review+
checked in, with bustage fix from Neil and whitespace fix from me.
I even added a standalone test app, txTestExpr, which generates bad expressions so
one can test error pos and message (well, rv)
Status: ASSIGNED → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: