Open Bug 658497 Opened 13 years ago Updated 2 years ago

XMLHttpRequest throws 'syntax error' when loading entity-files

Categories

(Core :: DOM: Core & HTML, defect, P5)

x86
Windows 7
defect

UNCONFIRMED

People

(Reporter: terje.rosenlund, Unassigned)

Details

User-Agent:       Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1
Build Identifier: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1

XMLHttpRequest throws 'syntax error' when loading entity-files like HTMLlat1.ent, HTMLsymbol.ent and HTMLspecial.ent

syntax error
<!ENTITY ...
--^

Headers fetched by getAllResponseHeaders():

	Accept-Ranges: bytes
	Content-Length: 11956
	Content-Type: text/xml-external-parsed-entity

Quickfix is to use overrideMimeType('text/plain');

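	trx.Parser.prototype.load = function(url) {
		if (!url) {
			url = document.location.href;
		}
		var _httpRequest = new XMLHttpRequest();
		try {
			// Synchronous GET (third argument false)
			_httpRequest.open("GET", url, false);
			// Treat the response as plain text so no XML parse is attempted
			_httpRequest.overrideMimeType('text/plain');
			_httpRequest.send(null);
			return _httpRequest.responseText;
		}
		catch (e) {
			// Errors are deliberately ignored; fall through to the empty string
		}
		return '';
	};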

Reproducible: Always
The issue is that the type contains "xml" in it (so we try to parse it with an XML parser), but the actual data is not XML data, so you get a parse error.  Then we report the parse error.

Other than the 'type contains "xml"' check maybe being too loose, this seems fine to me.
Component: HTML: Parser → DOM
QA Contact: parser → general
The XHR2 draft proposes that parsing only happen if the type is "text/html, text/xml, application/xml, or ends in +xml".
IIRC, that matches WebKit
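
A minimal sketch of the two checks discussed above, assuming the current loose check is a plain substring test on the type. The names essenceOf, containsXml and draftWouldParse are hypothetical, not Gecko or spec identifiers:

```javascript
// Strip parameters such as "; charset=utf-8" and normalise case.
function essenceOf(contentType) {
  return String(contentType).split(";")[0].trim().toLowerCase();
}

// Assumed form of the loose check: any type containing "xml" is parsed.
function containsXml(contentType) {
  return essenceOf(contentType).indexOf("xml") !== -1;
}

// The list quoted from the XHR2 draft: text/html, text/xml,
// application/xml, or a type ending in "+xml".
function draftWouldParse(contentType) {
  var type = essenceOf(contentType);
  return type === "text/html" ||
         type === "text/xml" ||
         type === "application/xml" ||
         /\+xml$/.test(type);
}
```

For the header in the report, text/xml-external-parsed-entity, the substring test says "parse" while the draft list would skip parsing.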
W3C claims that the xhtml1 versions of the ent-files are basically the same as the SGML versions made xml-compatible (e.g. no dtd-comments)
And: 'XHTML documents are XML conforming. As such, they are readily viewed, edited, and validated with standard XML tools' (http://www.w3.org/TR/xhtml1/#docconf)

- Why does XMLHttpRequest throw the same error when loading xhtml1-strict.dtd then?

Also: The spec for XML 1.0 on http://www.w3.org/TR/REC-xml/#NT-NameStartChar lists [#x10000-#xEFFFF] as valid start-chars, 0x10021 = !
	
	- <!ENTITY ...> is a valid tagname according to xml-spec then?
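
For reference, '!' is U+0021, while U+10021 is a different, supplementary-plane code point. A small sketch that checks a code point against the NameStartChar production of XML 1.0 (ranges transcribed from the spec linked above; isNameStartChar is a hypothetical helper name):

```javascript
// Inclusive [low, high] code point ranges of NameStartChar in XML 1.0.
var NAME_START_RANGES = [
  [0x3A, 0x3A],     // ":"
  [0x41, 0x5A],     // A-Z
  [0x5F, 0x5F],     // "_"
  [0x61, 0x7A],     // a-z
  [0xC0, 0xD6], [0xD8, 0xF6], [0xF8, 0x2FF],
  [0x370, 0x37D], [0x37F, 0x1FFF], [0x200C, 0x200D],
  [0x2070, 0x218F], [0x2C00, 0x2FEF], [0x3001, 0xD7FF],
  [0xF900, 0xFDCF], [0xFDF0, 0xFFFD], [0x10000, 0xEFFFF]
];

// True if the numeric code point may start an XML name.
function isNameStartChar(codePoint) {
  for (var i = 0; i < NAME_START_RANGES.length; i++) {
    var range = NAME_START_RANGES[i];
    if (codePoint >= range[0] && codePoint <= range[1]) {
      return true;
    }
  }
  return false;
}

var bang = "!".charCodeAt(0); // 0x21, outside every range above
```

Here isNameStartChar(bang) is false and isNameStartChar(0x10021) is true, so "<!ENTITY" is not read as a tag name. "<!" opens a markup declaration, which XML only allows inside a DTD.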

The Apache server returns the mime-type based on the file extension, so both the SGML and the xhtml1 versions of the entity-files are served as mimetype text/xml-external-parsed-entity
Only the xhtml1 versions are (supposed to be) well-formed xml, and the SGML versions are guaranteed to choke any pure xml-parser (e.g. on -- dtd comments --)

	- Doesn't that imply that XMLHttpRequest MUST skip parsing these files?

There is (at least) one thing I don't understand regarding w3c, dtds and xml:
All dtd-tags are self-closing (empty) tags but are terminated without a forward slash before the closing > (i.e. not closed)

- How can it be 'XML conforming' and valid to terminate an empty xml-tag without / ?
(http://www.w3.org/TR/REC-xml/#dt-eetag)
> - Why does XMLHttpRequest throw the same error when loading xhtml1-strict.dtd
> then?

Because it includes e.g. xhtml-lat1.ent
(In reply to myself in comment #4)
> W3C claims that the xhtml1 versions of the ent-files are basically the same as the
> SGML versions made xml-compatible
This does not mean that they are xml-files (dtd and ent files are not xml-files at all). This is what happens when the parser used by XMLHttpRequest (wrongly) parses external entity files as xml:

<!ENTITY nbsp "&#160;">		Syntax error, due to ! (really valid xml?)
<ENTITY nbsp "&#160;">		Not well formed, due to missing =
<ENTITY nbsp="&#160;">		No elements found, due to missing /
<ENTITY nbsp="&#160;"/>		No error, ok as xml but not as dtd-entity!
<!ENTITY nbsp="&#160;"/>	Still syntax error, due to ! (really valid xml?)

No dtd's or external entity files can be parsed as xml
https://bugzilla.mozilla.org/show_bug.cgi?id=1472046

Move all DOM bugs that haven’t been updated in more than 3 years and have no one currently assigned to P5.

If you have questions, please contact :mdaly.
Priority: -- → P5
Component: DOM → DOM: Core & HTML
Severity: normal → S3