XML Parsing Error if <xml> tag is not at very beginning of xml file (doesn't allow whitespace before tag)

VERIFIED INVALID

Status

Core
XML
16 years ago
15 years ago

People

(Reporter: Will Chiong, Assigned: Heikki Toivonen (remove -bugzilla when emailing directly))

Tracking

Trunk
x86
Windows XP

Details

(Reporter)

Description

16 years ago
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.2.1) Gecko/20021130
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.2.1) Gecko/20021130

Any XML file with whitespace before the opening <?xml ... ?> declaration causes
the following parsing error in Mozilla, and the file is not displayed at all.

XML Parsing Error: xml processing instruction not at start of external entity

This is not the standard behavior for other XML viewers, and many of the XML
files used by my company happen to have a space or a carriage return at the
beginning of the files.  Consequently, I have to use Internet Explorer to view
all of these XML files, when I would rather use your program.

Reproducible: Always

Steps to Reproduce:
1. Take a normal xml file that works in mozilla.
2. Insert a space or carriage return at the very beginning of the file.
3. Try to open that file again in mozilla.

Actual Results:  
The XML does not display, and I get this error message:

XML Parsing Error: xml processing instruction not at start of external entity

Expected Results:  
Mozilla should ignore the whitespace and display the XML file properly.

n/a
The XML grammar does not allow whitespace before the '<?xml' declaration.  See
http://www.w3.org/TR/REC-xml#NT-document and
http://www.w3.org/TR/REC-xml#NT-prolog for the production definitions.  Note the
absence of a way to have the 'S' production before the '<?xml' production.

Any XML document that has whitespace before the '<?xml' is thus not well-formed
and should trigger a well-formedness error.  Any parser that does not trigger
such an error is buggy, imo.
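
The grammar point above can be checked against any conforming parser. As a minimal sketch (not Mozilla's code), the snippet below uses Python's standard xml.etree.ElementTree, which is backed by expat; the helper name is_well_formed and the sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the expat-based parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


DECL = '<?xml version="1.0"?>'

print(is_well_formed(DECL + "<doc/>"))        # declaration is the very first thing
print(is_well_formed(" " + DECL + "<doc/>"))  # one space before the declaration
print(is_well_formed("\n<doc/>"))             # no declaration, leading newline
```

This should print True, False, True. The last case is accepted because the prolog allows whitespace (via 'Misc') once the optional declaration is omitted; only whitespace in front of an actual '<?xml' declaration is an error.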
Status: UNCONFIRMED → RESOLVED
Last Resolved: 16 years ago
Resolution: --- → INVALID
Boris is correct; verified invalid.
Status: RESOLVED → VERIFIED

Comment 3

16 years ago
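And if you want to know why it is important not to have whitespace before the
'<?xml' declaration, look at http://www.w3.org/TR/REC-xml#sec-guessing-no-ext-info :
the second table (auto-detecting without a byte order mark) relies on the first
bytes of the file being the characters '<?xm' in some encoding.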
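
To make the auto-detection argument concrete, here is a rough sketch of that second table as a lookup on the first four bytes. It is an illustration only: the function name guess_encoding_family and the returned labels are invented here, and the unusual UCS-4 octet orders listed in the spec are left out for brevity.

```python
# First four bytes of a document with no byte order mark, mapped to the
# encoding family they imply (simplified from the spec's table).
SIGNATURES = {
    b"\x00\x00\x00\x3c": "UCS-4, big-endian",
    b"\x3c\x00\x00\x00": "UCS-4, little-endian",
    b"\x00\x3c\x00\x3f": "UTF-16, big-endian",
    b"\x3c\x00\x3f\x00": "UTF-16, little-endian",
    b"\x3c\x3f\x78\x6d": "ASCII-compatible (UTF-8, ISO 8859, ...)",
    b"\x4c\x6f\xa7\x94": "EBCDIC",
}


def guess_encoding_family(data: bytes) -> str:
    """Guess the encoding family from the leading bytes, assuming no BOM."""
    return SIGNATURES.get(data[:4], "no match (assume UTF-8, no declaration)")


decl = '<?xml version="1.0" encoding="UTF-16"?><doc/>'
print(guess_encoding_family(decl.encode("utf-16-le")))
print(guess_encoding_family((" " + decl).encode("utf-16-le")))
```

With the declaration at the very start, the UTF-16 little-endian signature is found; with a single leading space the first four bytes become 20 00 3C 00, nothing matches, and the parser has no reliable way to read the encoding declaration.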
(Reporter)

Comment 4

16 years ago
Thanks for the correction.  I've raised the issue with my company, and hopefully
they'll make changes to the legacy files to become more standards compliant.  
*** Bug 214569 has been marked as a duplicate of this bug. ***