Closed Bug 287409 Opened 19 years ago Closed 19 years ago

loading xml documents from file input stream using nsIDOMParser::parseFromStream fails

Categories

(Core :: XML, defect, P2)

defect

Tracking

()

VERIFIED FIXED
mozilla1.8beta2

People

(Reporter: asqueella, Assigned: bzbarsky)

References

Details

(Keywords: testcase)

Attachments

(3 files)

Loading XML documents using nsIDOMParser from file input stream fails. The error
message is as follows:

XML Parsing Error: no element found
Location: [...]/test.xul
Line Number: 1, Column: 1

To reproduce, create a simple XML file named test.xml in your profile and open
the attached testcase. The testcase is a XUL file with this script:
----
// get profile directory
var file = Components.classes["@mozilla.org/file/directory_service;1"].
  getService(Components.interfaces.nsIProperties).
  get("ProfD", Components.interfaces.nsIFile);
file.append("test.xml");
   
var fileInputStream =
Components.classes["@mozilla.org/network/file-input-stream;1"].
  createInstance(Components.interfaces.nsIFileInputStream);

fileInputStream.init(file, 1, 0, 0);

var domParser = new DOMParser();
var doc = domParser.parseFromStream(fileInputStream, "UTF-8",
fileInputStream.available(), "text/xml");
if(doc.firstChild.firstChild)
  alert(doc.firstChild.firstChild.nodeValue);
----

As far as I can see, this happens because nsDOMParser::ParseFromStream() relies
on nsParser to parse the stream passed to it, and nsParser makes use of stream's
ReadSegments() method, that is not implemented for filestreams.

The strange message is because in nsDOMParser::ParseFromStream() the return
value of nsParser::OnDataAvailable() (which is the one that fails) is ignored.
Therefore, nsDOMParser thinks everything was parsed correctly, but the tree is
empty.

The workaround is to read the file into a string and call parseFromString.
Attached file testcase
oops. You first need to save the testcase to your disk.
Shaver, this sounds similar to what you ran into, no?
Attached file bz's last try
bz attempted to help me out a few weeks ago with the attached code.

{As I recall}
Seems to fail on windows, and work on linux.
Is this a regression? We switched to ReadSegments in bug 197114, more than a
year ago.
(In reply to comment #6)
> Is this a regression? We switched to ReadSegments in bug 197114, more than a
> year ago.

Isn't the issue here in
http://lxr.mozilla.org/seamonkey/source/parser/htmlparser/src/nsParser.cpp#2605,
unrelated to the external entity references from that bug?
OK.  So the reason that loading data from file:// currently works at all is that
nsFileChannel::AsyncOpen wraps the raw file stream up in async stream (using an
input transport), and that implements ReadSegments (it's just an nsPipe).

The DOMParser should probably wrap a buffered stream around the stream it's given.

That said, is it worth making our file streams buffered (eg via mmap() or
something)?
bz:  is this what was busting that buildconfig code?

If so, having this fixed somewhat soon would be very welcome, as I'd like to be
capturing buildconfig when reporter gets rolled out.
> That said, is it worth making our file streams buffered (eg via mmap() or
> something)?

A nsIInputStream implementation is not required to implement readSegments. 
Fixing nsFileInputStream as you suggest is not a solution to the real problem. 
The DOM parser should deal with lack of a readSegments implementation.  It is
fairly easy to write code that checks to see if readSegments is supported (just
implement a writer callback that consumes no data and have it set a boolean flag
so you can tell that it was tickled).
BTW, I regret that nsIInputStream was frozen without an isBuffered method to
avoid messy readSegments checking code such as what I suggested above.
> is this what was busting that buildconfig code?

Almost certainly.

> A nsIInputStream implementation is not required to implement readSegments. 

I realize that.  The real question is whether there would be any sort of
efficiency win...  I guess there wouldn't be if we usually wrap them up anyway...

> The DOM parser should deal with lack of a readSegments implementation.

One question is whether the DOMParser should deal or the nsParser should
deal....  The latter is what's really assuming all streams passed to
OnDataAvailable have a ReadSegments that works.  That's not part of the
nsIStreamListener interface definition as far as I can tell, so perhaps it
shouldn't be assuming that?
Attached patch FixSplinter Review
Fix DOMParser to use the new API darin added in bug 287750.

I decided it wasn't worth passing in knowledge about the string and byte
stream; just testing whether the stream is buffered is fast enough.
Attachment #178650 - Flags: superreview?(jst)
Attachment #178650 - Flags: review?(darin)
Comment on attachment 178650 [details] [diff] [review]
Fix

sr=jst
Attachment #178650 - Flags: superreview?(jst) → superreview+
Comment on attachment 178650 [details] [diff] [review]
Fix

nit: s/COMPtr/nsCOMPtr/ in the comment

r=darin
Attachment #178650 - Flags: review?(darin) → review+
Assignee: xml → bzbarsky
OS: Windows XP → All
Priority: -- → P2
Hardware: PC → All
Target Milestone: --- → mozilla1.8beta2
Fixed.
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
verified using 20050401 win Firefox build, thanks!
Status: RESOLVED → VERIFIED
*** Bug 296918 has been marked as a duplicate of this bug. ***
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: