Closed Bug 291827 Opened 19 years ago Closed 19 years ago

Expat's byte position is bogus if a chunk boundary falls before we block

Categories

(Core :: XML, defect)

x86
Linux
defect
Not set
critical

Tracking

()

RESOLVED FIXED

People

(Reporter: bzbarsky, Assigned: peterv)

References

Details

Attachments

(1 file)

Consider the following simple Perl CGI:

print STDOUT "<html xmlns='http://www.w3.org/1999/xhtml'>\n";
print STDOUT "<script type='text/javascript' src='data:text/j";
sleep(2);
print STDOUT "avascript,'/></bar>\n";
print STDOUT "</html>";

When we run this through expat (via nsExpatDriver::ParseBuffer), the return
value of XML_GetCurrentByteIndex when we block for the script points to after
the last byte.  In other words, at that point expat claims that all the data has
been consumed.

This is a problem, because in every single other case XML_GetCurrentByteIndex
points to the first byte of data that hasn't been notified on yet (via
callbacks).  In this case that should be the '<' of "</bar>".

The problem is that while XML_Parse adjusts parseEndByteIndex when we block,
XML_ParseBuffer does NOT do so.  In this case, since a data chunk boundary falls
inside a tag, we end up buffering up some bytes from the first chunk, then
appending the second chunk and calling XML_ParseBuffer.  This blocks, so we
reset parseEndPtr, but not parseEndByteIndex.  This causes
XML_GetCurrentByteIndex to mis-compute the number of bytes consumed thus far.

The simple fix would be to update parseEndByteIndex, and then also update
bufferEnd to point to where we stopped parsing.  That would make this case
behave like the XML_Parse case (make nsExpatDriver pass in the data again next
time it's called).  This does lead to extra copying of data around, though. 
Perhaps a better way to do this would be to remove the "pass in the data again"
code in nsExpatDriver and make expat save the after-we-blocked data in the
XML_Parse case.  Then we only need to update the parseEndByteIndex when we block
in XML_ParseBuffer and don't need to change bufferEnd.

Thoughts?  I'd really like to get this sorted out, since this blocks fixing bug
275564 (because XML_GetCurrentByteIndex returns garbage in some cases when
trying to fix that bug).
I wonder if Expat 1.95.8 resolves this (see the patch in bug 274777).
Maybe.  Want me to test that patch?
Boris, I combined your patch in bug 275564 with the patch in bug 274777 (Expat
1.95.8) and for this testcase, I get

consumed: 208, mBytesParsed: 0
aLength: 236, startOffset: 44
blocked
Byte pos: 104, '<script type='text/javascript' src='data:text/javascript,'/>'
return state: '<script type='text/javascript' src='data:text/javascript,'/>'
consumed: 212, mBytesParsed: 208
aLength: 28, startOffset: 0
error
return state: '<script type='text/javascript' src='data:text/javascript,'/></bar>'

The error page shows the correct position:

XML Parsing Error: mismatched tag. Expected: </html>.
Location: http://people.mozilla-europe.org/peterv/parser.pl
Line Number 2, Column 63:


<script type='text/javascript' src='data:text/javascript,'/></bar>
--------------------------------------------------------------^
That looks just dandy to me.
Peter, if you can possibly drive this and a fix for bug 275564 into 1.8, that
would be wonderful..
*** Bug 298258 has been marked as a duplicate of this bug. ***
Boris: I haven't been able to reproduce this. I put up the CGI at
http://people.mozilla-europe.org/peterv/parser.pl and added a breakpoint in
XML_ParseBuffer and I never hit the breakpoint.
Hmm.  I also don't hit the breakpoint there, but I do hit it on
http://landfill.mozilla.org/ryl/testXML2.cgi -- the source to that is:

print STDOUT "<html xmlns='http://www.w3.org/1999/xhtml'>\n";
print STDOUT "abc<script type='text/javascript' src='data:text/j";
sleep(2);
print STDOUT "avascript,'/>def</bar>\n";
print STDOUT "</html>";
Attached patch v1Splinter Review
Assignee: xml → peterv
Status: NEW → ASSIGNED
Comment on attachment 189465 [details] [diff] [review]
v1

I think this should do it for now. I'd like to avoid the
"pass-in-the-data-again", but it's a more complicated change that I'll do after
Expat 1.95.8 lands.
Attachment #189465 - Flags: superreview?(bzbarsky)
Attachment #189465 - Flags: review?(bzbarsky)
Attachment #189465 - Flags: superreview?(bzbarsky)
Attachment #189465 - Flags: superreview+
Attachment #189465 - Flags: review?(bzbarsky)
Attachment #189465 - Flags: review+
Attachment #189465 - Flags: approval1.8b4?
Attachment #189465 - Flags: approval1.8b4? → approval1.8b4+
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: