Closed Bug 323299 Opened 14 years ago Closed 14 years ago

Simplify nsExpatDriver

Categories

(Core :: XML, defect, P3)

defect

Tracking

()

RESOLVED FIXED
mozilla1.9alpha1

People

(Reporter: peterv, Assigned: peterv)

References

Details

Attachments

(1 file, 2 obsolete files)

I've found a way to simplify nsExpatDriver a bit. The key change is to make the scanner always point to the position that Expat will start parsing when we call it. I've also found a way to avoid the loop to look for the beginning of the last line (so we can store it for error reporting) by using XML_GetCurrentColumnNumber.

I've run this through a couple of tests (XSLT testsuite, testcase from bug 275564, ...) and everything seems to work fine. Of course, there are tricky timing issues that might influence this code, but I think I got everything working correctly.

The patch includes one change to Expat code that I'll try integrating upstream.
Attached patch v1 (obsolete) — Splinter Review
The logging code will not be very useful until bug 244478 is fixed.
Attachment #208375 - Flags: review?(bzbarsky)
Comment on attachment 208375 [details] [diff] [review]
v1

>Index: parser/htmlparser/src/nsExpatDriver.cpp
>+#ifdef PR_LOGGING
>+static PRLogModuleInfo *gExpatDriverLog = PR_NewLogModule("expatdriver");
>+#endif

Where is prlog.h being included?  This will never be defined otherwise, no?

>+nsExpatDriver::ParseBuffer(const PRUnichar *aBuffer,

This should assert our preconditions about aBuffer being null when we're blocked, I would think.  Something like:

NS_PRECONDITION(mInternalState != NS_ERROR_HTMLPARSER_BLOCK || !aBuffer,
                "Non-null aBuffer when resuming");

Can we also assert:

NS_PRECONDITION(mInternalState != NS_OK || aBuffer, "Must have buffer to
 		resume");

?  Or no?

>+    PRInt32 parserBytesBefore = XML_GetCurrentByteIndex(mExpatParser);
>+    NS_ASSERTION(parserBytesBefore >= 0, "Unexpected value.");

Remove the '.' at the end of the string?

>+nsExpatDriver::ConsumeToken(nsScanner& aScanner, PRBool& aFlushTokens)
>+        NS_ASSERTION(mInternalState == NS_ERROR_HTMLPARSER_STOPPARSING,
>+                     "Unexpected error.");

No need for the '.' here.

>+    if (mEOF) {
>+      // Make sure we set the position of the scanner to the end (even if
>+      // there's still data in Expat's buffer), since there won't be any more
>+      // data coming in.
>+      currentExpatPosition = end;
>+      mExpatBuffered = 0;

I'm not sure I follow this.  So if mEOF is true why are we not setting currentExpatPosition to the actual position?  Won't this screw us up later if there's an error somewhere in the buffered data and we need to show the erroneous line?

>Index: parser/htmlparser/src/nsExpatDriver.h
>+   * @param aLength the length of the buffer to pass to Expat (in number of
>+   *                PRUnichar's). May be 0.

"Must be 0 if aBuffer is null."

>+   * @param aConsumed the number of PRUnichar's that Expat consumed

Remove the apostrophe, please.  And make it clear this is an out param?  Eg add "[out]" after the name?  Also, does this include whatever text Expat buffered?  I'm assuming "no", but that should be documented.  We should also document that expat can buffer up text without consuming it?

>@@ -106,13 +117,13 @@ private:

>+  PRPackedBool     mEOF;

Document what that means?

>+  // The length of the data in Expat's buffer (in number of PRUnichar's).
>+  PRUint32         mExpatBuffered;

Remove apostrophe after "PRUnichar".

The rest looks great.  The mEOF thing is the only thing that's keeping me from putting down r=bzbarsky.
(In reply to comment #2)
> Where is prlog.h being included?  This will never be defined otherwise, no?

nsExpatDriver.cpp includes nsExpatDriver.h which includes expat_config.h which includes nspr.h which includes prlog.h.

> Can we also assert:
> 
> NS_PRECONDITION(mInternalState != NS_OK || aBuffer, "Must have buffer to
>                 resume");

No, one can pass in a null buffer even when not resuming. Expat will then try to parse the rest of its internal buffer if isFinal is true and will simply return when isFinal is false.

We could decide to enforce (mInternalState != NS_OK || aIsFinal || aBuffer). Want me to do that?

> >+    if (mEOF) {
> >+      // Make sure we set the position of the scanner to the end (even if
> >+      // there's still data in Expat's buffer), since there won't be any more
> >+      // data coming in.
> >+      currentExpatPosition = end;
> >+      mExpatBuffered = 0;
> 
> I'm not sure I follow this.  So if mEOF is true why are we not setting
> currentExpatPosition to the actual position?  Won't this screw us up later if
> there's an error somewhere in the buffered data and we need to show the
> erroneous line?

We end up in this case when we know that we don't have any buffers left (and we've told Expat by passing it a null buffer and PR_TRUE for isFinal), Expat has parsed its remaining buffer and it hasn't blocked and it hasn't run into an error. I'm not sure it's actually possible to have any more data in Expat's buffer in that case since that would mean it either blocked, which we handle earlier (mInternalState == NS_ERROR_HTMLPARSER_BLOCK), or the data contains only the first part of a node, which would be an error and we handle earlier too (NS_FAILED(mInternalState)). If there's data left in this case, there's no way to get it out of there with more calls to Expat AFAIK.

So I could make this code be:

    if (mEOF) {
      NS_ASSERTION(mExpatBuffered == 0 && currentExpatPosition == end,
                   "Unreachable data left in Expat's buffer");
      break;
    }

It's a little scarier but I guess the assertion will let us know that something went wrong :-).

> Also, does this include whatever text Expat buffered?

No.

> I'm assuming "no", but that should be documented.  We should also document that
> expat can buffer up text without consuming it?

Sure.
Status: NEW → ASSIGNED
Priority: -- → P3
Comment on attachment 208375 [details] [diff] [review]
v1

> We could decide to enforce (mInternalState != NS_OK || aIsFinal || aBuffer).
> Want me to do that?

If it's really easy, sure.  Otherwise, don't bother.  Probably not worth it.

> So I could make this code be:

Let's do that.

r=bzbarsky
Attachment #208375 - Flags: review?(bzbarsky) → review+
Attached patch v1.1 (obsolete) — Splinter Review
Attachment #209084 - Flags: superreview?(jst)
Attachment #209084 - Flags: review+
Comment on attachment 209084 [details] [diff] [review]
v1.1

sr=jst
Attachment #209084 - Flags: superreview?(jst) → superreview+
I checked this in. Btek's Tp has gone up a lot, which has me baffled because afaik Tp doesn't include XML documents. I'll see if it goes down again after a while.
Setting a breakpoint in nsExpatDriver::ParseBuffer and running Tp shows that we hit it from RDFXMLDataSourceImpl::BlockingParse (called by UpdateInternetSearchResults() in the Seamonkey chrome) when the test starts.  But after that we're indeed not hitting it...

And the Tp went up more or less across the board (some of the pages are not affected).  So I really don't know what's going on either.  :(  The Tp jump has certainly not gone back down, though...



Could this checkin have caused bug 324641?
Depends on: 324641
Backed this out again, Btek is back to normal too so that might have been because of bogus XML errors in chrome.
Attached patch v1.1Splinter Review
Needed to add !aScanner.IsIncremental() check in addition of checking for kEOF.
Attachment #208375 - Attachment is obsolete: true
Attachment #209084 - Attachment is obsolete: true
Attachment #209839 - Flags: superreview+
Attachment #209839 - Flags: review+
Checked in attachment 209839 [details] [diff] [review]. This time the Cairo tinderboxes stayed green and Btek's Tp didn't go up. Let's hope it sticks.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
FYI: Linux/x86_64 blackdeath Clbr went orange after your patch went in.
It hangs at profile creation:
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey-Ports/1138374780.16640.gz
WARNING: BOGUS code reached!, file /builds/buildhack/mozilla/widget/src/gtk2/nsWindow.cpp, line 578

Backing out
mozilla/parser/htmlparser/src/nsExpatDriver.cpp 3.93 and
mozilla/parser/htmlparser/src/nsExpatDriver.h   3.27

fixes the assertion
(Assertion was during ordinary Firefox startup)
Sorry, I pasted the wrong assertion:
###!!! ASSERTION: Infinite loop: can't advance a reading iterator beyond the end of a string: 'one_hop>0', file ../../../dist/include/htmlparser/nsScannerString.h, line 449

This code is being entered through the RDFXML parser.
What's the stack?
Also, what file is it parsing when this happens?
My initial suspect is extensions.rdf but I'm not sure... I'll re-land that checkin into my tree to find out more.
#9  0x00002aaaab47f63d in nsScannerIterator::advance (this=0x7fffffe94100,
    n=4294966057) at ../../../dist/include/htmlparser/nsScannerString.h:449
#10 0x00002aaaab48a56b in nsExpatDriver::ConsumeToken (this=0x5dd840,
    aScanner=@0x8cde80, aFlushTokens=@0x7fffffe94434)
    at /builds/buildhack/mozilla/parser/htmlparser/src/nsExpatDriver.cpp:964
#11 0x00002aaaab49c01a in nsParser::Tokenize (this=0x5fcdb0, aIsFinalChunk=0)
    at /builds/buildhack/mozilla/parser/htmlparser/src/nsParser.cpp:2413
#12 0x00002aaaab49f613 in nsParser::ResumeParse (this=0x5fcdb0,
    allowIteration=1, aIsFinalChunk=0, aCanInterrupt=1)
    at /builds/buildhack/mozilla/parser/htmlparser/src/nsParser.cpp:1639
#13 0x00002aaaab49cfd3 in nsParser::OnDataAvailable (this=0x5fcdb0,
    request=0x942820, aContext=0x0, pIStream=0x5d8448, sourceOffset=0,
    aLength=5295)
    at /builds/buildhack/mozilla/parser/htmlparser/src/nsParser.cpp:2277
#14 0x00002aaaab46b410 in RDFXMLDataSourceImpl::OnDataAvailable (
    this=0x5ffce0, request=0x942820, ctxt=0x0, inStr=0x5d8448, sourceOffset=0,
    count=5295)
    at /builds/buildhack/mozilla/rdf/base/src/nsRDFXMLDataSource.cpp:1101
#15 0x00002aaaab46d258 in RDFXMLDataSourceImpl::BlockingParse (this=0x5ffce0,
    aURL=0x848bc0, aConsumer=0x5ffd00)

"aURL" is localstore.rdf
Benjamin, could you apply this patch, turn on NSPR logging for expatdriver:5 and attach the resulting log? Sorry to bother you with this, but I can't reproduce this and I don't understand how we end up trying to advance by 4294966057.

It does look like a type mismatch somehow. We're passing a negated PRUint32 (set to the result of a function returning unsigned long) to a function taking prtdiff_t. It isn't exactly right (and I was planning to fix this at some point), but it should work I think so hopefully the log helps me understand my mistake :-).


Index: parser/htmlparser/src/nsExpatDriver.cpp
===================================================================
RCS file: /cvsroot/mozilla/parser/htmlparser/src/nsExpatDriver.cpp,v
retrieving revision 3.93
diff -u -p -r3.93 nsExpatDriver.cpp
--- parser/htmlparser/src/nsExpatDriver.cpp	27 Jan 2006 14:47:23 -0000	3.93
+++ parser/htmlparser/src/nsExpatDriver.cpp	27 Jan 2006 23:28:39 -0000
@@ -955,6 +955,12 @@ nsExpatDriver::ConsumeToken(nsScanner& a
 
       // The length of the last line that Expat has parsed.
       PRUint32 lastLineLength = XML_GetCurrentColumnNumber(mExpatParser);
+      PR_LOG(gExpatDriverLog, PR_LOG_DEBUG,
+             ("consumed: %u, length: %u, lastLineLength: %u, advance: %i "
+              "(should be lastLineLength: %lu, advance: %ld).",
+              consumed, length, lastLineLength, -lastLineLength,
+              XML_GetCurrentColumnNumber(mExpatParser),
+              -XML_GetCurrentColumnNumber(mExpatParser)));
 
       if (lastLineLength <= consumed) {
         // The length of the last line was less than what expat consumed, so
I can try that tomorrow. FWIW I am doing this on a 64-bit OS: is it possible that we're incorrectly aliasing a 32-bit and a 64-bit type?
(In reply to comment #23)
> I can try that tomorrow. FWIW I am doing this on a 64-bit OS: is it possible
> that we're incorrectly aliasing a 32-bit and a 64-bit type?

It is, see bug 324667.
Depends on: 324667
(In reply to comment #22)
> prtdiff_t. It isn't exactly right (and I was planning to fix this at some
> point), but it should work I think so hopefully the log helps me understand my
> mistake :-).

The log won't help here, since XML_GetCurrentColumnNumber returns a signed int. The problem is that the compiler is (apparently) generating code to negate the 32-bit value before generating the code to promote it to 64 bits, making it a large positive integer on the receiving end.
(In reply to comment #25)
 The log won't help here, since XML_GetCurrentColumnNumber returns a signed int.
> The problem is that the compiler is (apparently) generating code to negate the
> 32-bit value before generating the code to promote it to 64 bits, making it a
> large positive integer on the receiving end.

Ah, thanks for catching that! I kept looking at columnNumber (which is unsigned long) and missed that the API returns a signed value. That mismatch was fixed in Expat 2.0, which I should be landing soon.
Could this have caused bug 326741 ?  I don't see any other likely candidates in the regression range, but I don't know this code at all..
I think this caused a pretty serious regression. We no longer fail to parse documents that end too soon. For example, loading the url

data:text/xml,<foo>

"works" fine.
Sicking, I think that's basically the same as bug 326741.  But might be worth filing a separate bug to track it, just in case.
No, it's bug 325733.
Depends on: 325733
You need to log in before you can comment on or make changes to this bug.