Closed
Bug 291827
Opened 19 years ago
Closed 19 years ago
Expat's byte position is bogus if a chunk boundary falls before we block
Categories
(Core :: XML, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: bzbarsky, Assigned: peterv)
References
Details
Attachments
(1 file)
813 bytes,
patch
|
bzbarsky
:
review+
bzbarsky
:
superreview+
benjamin
:
approval1.8b4+
|
Details | Diff | Splinter Review |
Consider the following simple Perl CGI: print STDOUT "<html xmlns='http://www.w3.org/1999/xhtml'>\n"; print STDOUT "<script type='text/javascript' src='data:text/j"; sleep(2); print STDOUT "avascript,'/></bar>\n"; print STDOUT "</html>"; When we run this through expat (via nsExpatDriver::ParseBuffer), the return value of XML_GetCurrentByteIndex when we block for the script points to after the last byte. In other words, at that point expat claims that all the data has been consumed. This is a problem, because in every single other case XML_GetCurrentByteIndex points to the first byte of data that hasn't been notified on yet (via callbacks). In this case that should be the '<' of "</bar>". The problem is that while XML_Parse adjusts parseEndByteIndex when we block, XML_ParseBuffer does NOT do so. In this case, since a data chunk boundary falls inside a tag, we end up buffering up some bytes from the first chunk, then appending the second chunk and calling XML_ParseBuffer. This blocks, so we reset parseEndPtr, but not parseEndByteIndex. This causes XML_GetCurrentByteIndex to mis-compute the number of bytes consumed thus far. The simple fix would be to update parseEndByteIndex, and then also update bufferEnd to point to where we stopped parsing. That would make this case behave like the XML_Parse case (make nsExpatDriver pass in the data again next time it's called). This does lead to extra copying of data around, though. Perhaps a better way to do this would be to remove the "pass in the data again" code in nsExpatDriver and make expat save the after-we-blocked data in the XML_Parse case. Then we only need to update the parseEndByteIndex when we block in XML_ParseBuffer and don't need to change bufferEnd. Thoughts? I'd really like to get this sorted out, since this blocks fixing bug 275564 (because XML_GetCurrentByteIndex returns garbage in some cases when trying to fix that bug).
Assignee | ||
Comment 1•19 years ago
|
||
I wonder if Expat 1.95.8 resolves this (see the patch in bug 274777).
Reporter | ||
Comment 2•19 years ago
|
||
Maybe. Want me to test that patch?
Assignee | ||
Comment 3•19 years ago
|
||
Boris, I combined your patch in bug 275564 with the patch in bug 274777 (Expat 1.95.8) and for this testcase, I get consumed: 208, mBytesParsed: 0 aLength: 236, startOffset: 44 blocked Byte pos: 104, '<script type='text/javascript' src='data:text/javascript,'/>' return state: '<script type='text/javascript' src='data:text/javascript,'/>' consumed: 212, mBytesParsed: 208 aLength: 28, startOffset: 0 error return state: '<script type='text/javascript' src='data:text/javascript,'/></bar>' The error page shows the correct position: XML Parsing Error: mismatched tag. Expected: </html>. Location: http://people.mozilla-europe.org/peterv/parser.pl Line Number 2, Column 63: <script type='text/javascript' src='data:text/javascript,'/></bar> --------------------------------------------------------------^
Comment 4•19 years ago
|
||
That looks just dandy to me.
Reporter | ||
Comment 5•19 years ago
|
||
Peter, if you can possibly drive this and a fix for bug 275564 into 1.8, that would be wonderful..
Assignee | ||
Comment 6•19 years ago
|
||
*** Bug 298258 has been marked as a duplicate of this bug. ***
Assignee | ||
Comment 7•19 years ago
|
||
Boris: I haven't been able to reproduce this. I put up the CGI at http://people.mozilla-europe.org/peterv/parser.pl and added a breakpoint in XML_ParseBuffer and I never hit the breakpoint.
Reporter | ||
Comment 8•19 years ago
|
||
Hmm. I also don't hit the breakpoint there, but I do hit it on http://landfill.mozilla.org/ryl/testXML2.cgi -- the source to that is: print STDOUT "<html xmlns='http://www.w3.org/1999/xhtml'>\n"; print STDOUT "abc<script type='text/javascript' src='data:text/j"; sleep(2); print STDOUT "avascript,'/>def</bar>\n"; print STDOUT "</html>";
Assignee | ||
Comment 9•19 years ago
|
||
Assignee: xml → peterv
Status: NEW → ASSIGNED
Assignee | ||
Comment 10•19 years ago
|
||
Comment on attachment 189465 [details] [diff] [review] v1 I think this should do it for now. I'd like to avoid the "pass-in-the-data-again", but it's a more complicated change that I'll do after Expat 1.95.8 lands.
Attachment #189465 -
Flags: superreview?(bzbarsky)
Attachment #189465 -
Flags: review?(bzbarsky)
Reporter | ||
Updated•19 years ago
|
Attachment #189465 -
Flags: superreview?(bzbarsky)
Attachment #189465 -
Flags: superreview+
Attachment #189465 -
Flags: review?(bzbarsky)
Attachment #189465 -
Flags: review+
Assignee | ||
Updated•19 years ago
|
Attachment #189465 -
Flags: approval1.8b4?
Updated•19 years ago
|
Attachment #189465 -
Flags: approval1.8b4? → approval1.8b4+
Assignee | ||
Updated•19 years ago
|
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•