Closed Bug 334304 Opened 18 years ago Closed 18 years ago

SAX parser needs better async interface

Categories

(Core :: XML, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: sayrer, Assigned: sayrer)

References

Details

Attachments

(2 files, 8 obsolete files)

The XMLReader should implement nsIStreamListener and provide a parseAsync method. Also, the current synchronous parsing implementation is broken, in that it never calls DidBuildModel.
Need to wait until a bustage fix in 315826 is in CVS.
Blocks: 325080
Attached patch change interface, change uuid (obsolete) — Splinter Review
Attachment #218685 - Attachment is obsolete: true
Attached patch Patch with parseAsync() (obsolete) — Splinter Review
I haven't found any docs on the right way to do this, but I think this is what beng wants. Also, I'm not clear on whether I need to check/update the channel status in parseFromStream (see comment in patch).
Attachment #218686 - Attachment is obsolete: true
Attachment #218715 - Flags: review?(bzbarsky)
Comment on attachment 218715 [details] [diff] [review]
Patch with parseAsync()

parseAsync needs more comments about how it's meant to be used. method comments should not be limited to param descriptions.

why nullcheck the listener in OnStartRequest but not in ODA/OnStopRequest?
Attachment #218715 - Attachment is obsolete: true
Attachment #218733 - Flags: superreview?(bzbarsky)
Attachment #218733 - Flags: review?
Attachment #218733 - Flags: approval-l10n?
Attachment #218715 - Flags: review?(bzbarsky)
Attachment #218733 - Flags: approval-l10n?
Attachment #218733 - Flags: review? → review?(cbiesinger)
(In reply to comment #4)
> (From update of attachment 218715 [details] [diff] [review] [edit])
> parseAsync needs more comments about how it's meant to be used. method comments
> should not be limited to param descriptions.

Docs added.

> 
> why nullcheck the listener in OnStartRequest but not in ODA/OnStopRequest?

To make sure parseAsync was called first. But, I guess it wouldn't hurt to check everywhere.

Blocks: 315826
Comment on attachment 218733 [details] [diff] [review]
make nsISAXXMLReader inherit nsIStreamListener, docs

>Index: xml/public/nsISAXXMLReader.idl

>+  /**
>+   * Begin an ansynchronous parse. This method initializes the
>+   * parser. It is then the caller's duty to call the
>+   * nsISAXXMLReader's nsIStreamListener methods to drive the parser.

This should probably document that once this is called the caller MUST call at least OnStartRequest and OnStopRequest, right?  At least as things stand.  Otherwise you leak.

Another option would be to init the parser in OnStartRequest (after saving the listener in a member in asyncParse()).  Since the nsIRequestObserver API requires OnStopRequest any time OnStartRequest is called, that should make sure you don't leak.

>Index: xml/src/nsSAXXMLReader.cpp
>@@ -433,50 +418,119 @@ nsSAXXMLReader::ParseFromStream(nsIInput

>+  rv = InitParser(nsnull);

OK.  So if that succeeds, we now have a reference cycle between |this| and |mStreamListener|, right?

>+  NS_ENSURE_STATE(parserChannel);

So this leaks.

>+  // XXX brettw had a method called SyncChannelStatus in here. 
>+  // I'm not sure it's needed in this case.

It's needed; see nsIRequestObserver and nsIStreamListener's language about exceptions thrown from OnStartRequest and OnDataAvailable canceling the channel with that status.

Now on trunk, nsInputStreamChannel is smart and if it's already canceled will ignore Cancel() calls.  You could make a similar change on 1.8 branch, I suppose, so you don't have to copy this SyncChannelStatus thing.  Or you could just Cancel() if rv is a failure and not worry about clobbering an existing error status if there is one.  But you do need to do something, both trunk and branch.  Please file a followup bug blocking bug 315826 on this.

>+  rv = mListener->OnStartRequest(parserChannel, nsnull);

You need to call GetStatus() here before going into the loop in case OnStartRequest returned NS_OK but cancelled the channel.

>+    rv = mListener->OnDataAvailable(parserChannel, nsnull,
>+                                    aStream, 0, available);

Again, if this returns failure you need to cancel the channel and in any case you need to check the channel's status.

See what nsDOMParser::ParseFromStream does, more or less.

>+    if (NS_FAILED(rv))
>+      break;

Why bother?  You can just go back to the loop condition here and it'll be false...

>+nsSAXXMLReader::ParseAsync(nsIRequestObserver *aObserver)

We probably need to set some members here.... For example, if someone calls ParseAsync and then calls ParseFromStream we probably want to throw.  That sort of thing.  If your interface doesn't say it's not allowed, it's fair game.  Even if it does say it's not allowed you may want to defend against it (but the call sequence I just described is allowed by your interface).

>+nsSAXXMLReader::InitParser(nsIRequestObserver *aObserver)

>+  mParser = parser;

So I have to ask.... Other than so that we can create a refcount loop, why do we need mParser?  I don't see us using it for anything; I think we should just remove the member (and make SetParser() a no-op).  Note that the only caller of nsIContentSink::SetParser is nsParser::SetContentSink and that it's perfectly fine to ignore the call -- lots of sinks do.

Not setting mParser would also fix the various leaks scattered through this method.

>+  mListener = do_QueryInterface(mParser, &rv);
>+  NS_ENSURE_SUCCESS(rv, rv);

Why bother with the NS_ENSURE_SUCCESS, when the next statement is a |return rv|?
(In reply to comment #6)
> To make sure parseAsync was called first. But, I guess it wouldn't hurt to
> check everywhere.

I don't understand why that'd help. If onStartRequest is called, onStopRequest will be called as well.

Attached patch bz's comments (obsolete) — Splinter Review
Thanks for the comments. I think I understand this better now. 

This patch clobbers the channel status, but I will file a follow-on bug as well.
Attachment #218733 - Attachment is obsolete: true
Attachment #218733 - Flags: superreview?(bzbarsky)
Attachment #218733 - Flags: review?(cbiesinger)
Attachment #218820 - Attachment is obsolete: true
Attachment #218822 - Flags: superreview?(bzbarsky)
Attachment #218822 - Flags: review?(cbiesinger)
Comment on attachment 218822 [details] [diff] [review]
set mParserObserver to nsnull after InitParser

>Index: xml/src/nsSAXXMLReader.cpp
>+    rv = mListener->OnDataAvailable(parserChannel, nsnull,
>+                                    aStream, 0, available);

Fwiw, passing 0 there is probably wrong after the first time through the loop; I'd check with biesi or darin... It's not a big deal either way for now, I suspect, since iirc the parser doesn't really use that arg.

>+nsSAXXMLReader::OnStopRequest(nsIRequest *aRequest, nsISupports *aContext,
>+  nsresult rv = mListener->OnStopRequest(aRequest, aContext, status);

You need to null-check mListener here -- if the alloaction in OnStartRequest failed, it'll be null.

>Index: xml/src/nsSAXXMLReader.h
>+  NS_IMETHOD
>+  SetParser(nsIParser* aParser)

I'd rather keep this in the C++.  It's not like it's getting inlined anyway.

sr=bzbarsky with the nits picked.  Looks excellent.
Attachment #218822 - Flags: superreview?(bzbarsky) → superreview+
Attachment #218822 - Attachment is obsolete: true
Attachment #218822 - Flags: review?(cbiesinger)
Attachment #218898 - Flags: review?(cbiesinger)
Comment on attachment 218898 [details] [diff] [review]
nits picked, OnDataAvailable offset

Should probably document that parseAsync must be called before any stream listener method is called.

+  rv = NS_NewInputStreamChannel(getter_AddRefs(parserChannel), mBaseURI,
+                                aStream, nsDependentCString(aContentType),
+                                nsnull);

please remove the |, nsnull| here.

you also don't need to check both rv and nullness.

why do some error returns set mListener to null while others don't?

+  nsresult status;
+  rv = mListener->OnStartRequest(parserChannel, nsnull);
+  if (NS_FAILED(rv))
+    parserChannel->Cancel(rv);
+  parserChannel->GetStatus(&status);

why not move the status declaration on the line before the GetStatus call? that makes it easier to see that the OnStartRequest call does not in fact change status.

+  mListener->OnStopRequest(parserChannel, nsnull, rv);

this should pass status, not rv, right?

+  NS_ENSURE_TRUE(mIsAsyncParse, NS_ERROR_FAILURE);

I'd use NS_ENSURE_STATE for those, but it doesn't really matter.




you should probably fix the W warning from http://www.beaufour.dk/jst-review/?patch=218898

r=biesi, but note that I almost only reviewed the necko interaction of this patch, not the parser interaction. I trust bz reviewed that.
Attachment #218898 - Flags: review?(cbiesinger) → review+
Attached patch address biesi's commments (obsolete) — Splinter Review
Attachment #218898 - Attachment is obsolete: true
Attachment #219172 - Attachment is obsolete: true
Checked in on trunk.
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Attached patch branch patchSplinter Review
Attachment #220775 - Flags: approval-branch-1.8.1?(peterv)
Attachment #220775 - Flags: approval-branch-1.8.1?(peterv) → approval-branch-1.8.1+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: