Closed Bug 321564 Opened 16 years ago Closed 16 years ago

E4X syntax to handle file with <?xml, <!DOCTYPE, <!ATTLIST ... instructions

Component: Core :: JavaScript Engine
Type: defect
Platform: x86, Windows XP
Priority: Not set
Severity: normal
Status: VERIFIED INVALID
Reporter: BijuMailList
Assignee: Unassigned
Attachments: 2 files, 1 obsolete file

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a1) Gecko/20051219 Firefox/1.6a1
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a1) Gecko/20051219 Firefox/1.6a1

Firefox fails on the following two cases even though both are valid XML.

There should be a way in E4X syntax to handle a valid XML file with <?xml, <!DOCTYPE, <!ATTLIST ... instructions.

==== case 1
doc =
<?xml version="1.0"?>
<!DOCTYPE doc [
<!ATTLIST d id ID #IMPLIED>
]>
<doc>
  <d id="id3">Three</d>
</doc>;

==== case 2

str='\
<?xml version="1.0"?>\n\
<!DOCTYPE doc [\n\
<!ATTLIST d id ID #IMPLIED>\n\
]>\n\
<doc>\n\
  <d id="id3">Three</d>\n\
</doc>';

doc = new XML(str);

Reproducible: Always

Steps to Reproduce:
Open the e4x_parse_pi.html attachment.

Actual Results:
The user gets a syntax error.

Expected Results:
An E4X XML object should be created.
As far as I can tell from ECMA-357, the E4X spec doesn't allow <!DOCTYPE>. If you want E4X to change, you should take it up with the E4X working group at ECMA.
Status: UNCONFIRMED → RESOLVED
Closed: 16 years ago
Resolution: --- → INVALID
In that case, why does the following not throw an error?

doc = <doc>

<?xxxxx version="1.0"?>
<b>jhh</b>

</doc>
(In reply to comment #3)
> In that case, why does the following not throw an error?
> 
> doc = <doc>
> 
> <?xxxxx version="1.0"?>
> <b>jhh</b>
> 
> </doc>

Where's the error?

A processing instruction is not an error.  Only <?xml ...?> is reserved and required to come first.  Your example here shows a PI named xxxxx with one parameter.  That's not an error.

/be
Status: RESOLVED → VERIFIED
Then the following should be valid:

doc =<?xml version="1.0"?>
<doc>
  <d id="id3">Three</d>
</doc>;

But gives: SyntaxError: unterminated regular expression literal

and 

str='<?xml version="1.0"?>\n\
<doc>\n\
  <d id="id3">Three</d>\n\
</doc>'

doc = new XML(str);

But gives: SyntaxError: xml is a reserved identifier
(In reply to comment #5)
> Then the following should be valid:
> 
> doc =<?xml version="1.0"?>
> <doc>
>   <d id="id3">Three</d>
> </doc>;
> 
> But gives: SyntaxError: unterminated regular expression literal

Please attach a testcase that reproduces this error.  In the js shell, I see:

js> doc =<?xml version="1.0"?>

js> <doc>
  <d id="id3">Three</d>
</doc>;
<doc>
  <d id="id3">Three</d>
</doc>
js>
js> doc.toXMLString()

js> XML.ignoreProcessingInstructions = false
false
js> doc =<?xml version="1.0"?>
<?xml version="1.0"?>
js> doc.toXMLString()
<?xml version="1.0"?>

Per ECMA-357 11.1.4 and 8.3, and per ECMA-262 7.9.1, the first line is an assignment statement that ends in an <?xml?> processing instruction, after which on the next line is a separate XML literal.  This causes automatic semicolon insertion to kick in, so the first line assigns only the XML declaration to doc.

And, per ECMA-357 13.4.3.3 and 10.3.2.1 step 4, XML.ignoreProcessingInstructions defaults to true, so the XML declaration (syntactically, a PI) is ignored.

> and 
> 
> str='<?xml version="1.0"?>\n\
> <doc>\n\
>   <d id="id3">Three</d>\n\
> </doc>'
> 
> doc = new XML(str);
> 
> But gives: SyntaxError: xml is a reserved identifier

Welcome to the wonders of E4X.  This is exactly what ECMA-357 specifies, since it says in 10.3.1 to wrap the string to be converted to XML, before parsing it, with "<parent xmlns='%s'>" and "</parent>", with %s expanded to the default namespace. But of course, the XML declaration must come first per the XML specs, so there is no way to construct using new XML() with an XML declaration at the front of the string argument.
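The wrapping step described above can be illustrated in plain JavaScript, without any E4X syntax. This is only a rough sketch, not the engine's implementation: `wrapForParsing` is a hypothetical helper that mimics the wrapping from 10.3.1, and the empty default namespace is an assumption. It shows that, once wrapped, the XML declaration is no longer the first thing in the text, which is why parsing fails.

```javascript
// Hypothetical helper mimicking the wrapping that ECMA-357 10.3.1 describes.
// Illustration only; this is not SpiderMonkey's actual code.
function wrapForParsing(str, defaultNamespace) {
  return "<parent xmlns='" + defaultNamespace + "'>" + str + "</parent>";
}

var str = '<?xml version="1.0"?>\n<doc>\n  <d id="id3">Three</d>\n</doc>';
var wrapped = wrapForParsing(str, ""); // empty default namespace assumed

// After wrapping, the declaration is preceded by the <parent> start tag,
// so it is no longer at position 0 as the XML specs require.
var declIndex = wrapped.indexOf("<?xml");
console.log(declIndex > 0);
```

The same reasoning applies to any string argument that begins with an XML declaration, whatever follows it.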

This is a bug in the spec, but I didn't rejoin ECMA TG1 in time to fix it, and it is still in the ISO version of the spec.  It's probably the most-dup'ed bug that we've tracked against E4X, although separate reports were not marked duplicates until just now.  See bug 290525; see also bug 277683 comment 8 et seq.

I will make an effort to get ECMA TG1 to fix this in the next major revision of the E4X spec.
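Until the spec changes, one possible workaround is to strip the prolog from the string before handing it to new XML(). The following is a minimal sketch in plain JavaScript, not a general XML prolog parser: `stripProlog` is a hypothetical helper, and its regular expressions assume a simple prolog (no `?` inside the declaration, no `]` inside the DOCTYPE internal subset). Dropping the DOCTYPE also drops the ATTLIST declaration, so the ID typing of the attribute is lost.

```javascript
// Hypothetical helper: removes a leading XML declaration and DOCTYPE
// so that the remaining text starts at the root element.
function stripProlog(str) {
  return str
    // XML declaration, e.g. <?xml version="1.0"?>
    .replace(/^\s*<\?xml\s[^?]*\?>\s*/, "")
    // DOCTYPE with an optional [ ... ] internal subset
    .replace(/^\s*<!DOCTYPE[^\[>]*(\[[^\]]*\])?\s*>\s*/, "");
}

var source =
  '<?xml version="1.0"?>\n' +
  '<!DOCTYPE doc [\n' +
  '<!ATTLIST d id ID #IMPLIED>\n' +
  ']>\n' +
  '<doc>\n' +
  '  <d id="id3">Three</d>\n' +
  '</doc>';

var body = stripProlog(source);
console.log(body);

// In an E4X-capable engine the result could then be passed on:
// var doc = new XML(body);
```

A string without a prolog passes through unchanged, so the helper can be applied unconditionally.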

/be
Depends on: 290525
Attached file e4x_parse_pi_2.html (obsolete)
Please see e4x_parse_pi_2.html.

Also, I am not insisting that this one be fixed.

I am looking for the following case, or other major issues, to be fixed.
If the spec doesn't say it, make the spec say it,
because until now we were told that this is valid XML.

==== case 2

str='\
<?xml version="1.0"?>\n\
<!DOCTYPE doc [\n\
<!ATTLIST d id ID #IMPLIED>\n\
]>\n\
<doc>\n\
Attached file e4x_parse_pi_2.html
Attachment #206914 - Attachment is obsolete: true
You are wasting time here -- ecma-international.org is not mozilla.org.  Filing bugs here demanding changes to an ECMA, and now ISO, standard specification is wrong.  I'm not happy with E4X either, and I said I'll see about getting the TG1 group to fix the next version of the spec, but that will take a while.  In the meantime, I believe Aaron Boodman is going to file a separate bug on just the new XML("<?xml...?> ...") issue.

If you could file a separate bug on the <!DOCTYPE and <!ATTLIST issues, filing it as a request for enhancement, that would help.  Mixing all these up here as if the current spec were not implemented correctly does not help.

/be
(In reply to comment #5)
> Then the following should be valid:
> 
> doc =<?xml version="1.0"?>
> <doc>
>   <d id="id3">Three</d>
> </doc>;
> 
> But gives: SyntaxError: unterminated regular expression literal

Thanks for attaching the right testcase to reproduce this exception, which is not what you showed above, but what is in attachment 206915:

doc =<?xml version="1.0"?><doc><d id="id3">Three</d></doc>;

This error is correct according to the specifications, because again (see comment 6 for the chapter and verse citations), an XMLInitialiser is either one XMLMarkup (which is either a comment, CDATA section, or PI) or one XMLElement -- not a PI followed by an element as you have written here.

Now if you run together two XMLInitialisers, the first is parsed as you expect, but the second is not, because < in an operator context is the less-than operator, not the XML STAGO delimiter.  Only in operand context is < an STAGO.  To simplify the example:

  x = <x/><y/>;

is a syntax error, because <x/> is parsed as the left operand of the < less-than operator, which leaves y/> as the right operand and trailing right context.

So as usual, the y identifier is taken as the right operand, and the /> trailing context looks like the beginning of a regular expression.  But since this would-be regular expression literal starting with /> is not closed by another / before the end of line, the expected "unterminated regular expression literal" SyntaxError exception is thrown.

/be
(In reply to comment #9)
> If you could file a separate bug on the <!DOCTYPE and <!ATTLIST issues, filing
> it as a request for enhancement, that would help.

Created bug 321685.

Thanks a lot for taking the time to explain.