Last Comment Bug 246441 - (e4x) Implement E4X in SpiderMonkey
(e4x)
: Implement E4X in SpiderMonkey
Status: RESOLVED WONTFIX
: js1.5
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: Trunk
: All All
: P1 enhancement with 11 votes (vote)
: mozilla1.8beta2
Assigned To: general
:
:
Mentors:
http://www.ecma-international.org/pub...
Depends on: 40757 270553 289630 292863 297025 312354 315048 315657 317294 321685 322005 325750 330572 333678 334310 334452 334628 335051 335052 335424 336355 336551 336921 347270 348332 350532 352799 355233 355257 355258 355422 355674 366483 373594 373994 379525 379846 380946 381197 382339 384435 386388 389123 410192 410263 410366 410740 452729 490758
Blocks: 265341 277887 290393 435398
  Show dependency treegraph
 
Reported: 2004-06-11 17:55 PDT by Brendan Eich [:brendan]
Modified: 2014-04-26 03:11 PDT (History)
47 users (show)
bob: in‑testsuite-
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
work in progress, parses xml but doesn't yet codegen (421.17 KB, patch)
2004-08-20 17:13 PDT, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
checkpoint (450.24 KB, patch)
2004-08-23 19:14 PDT, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
compressed checkpoint patch, nearly ready to branch (109.72 KB, application/x-gzip)
2004-09-30 22:38 PDT, Brendan Eich [:brendan]
no flags Details
Errata so far (3.79 KB, text/plain)
2004-09-30 22:41 PDT, Brendan Eich [:brendan]
no flags Details
compressed checkpoint patch, landing today (115.42 KB, application/x-gzip)
2004-10-04 15:14 PDT, Brendan Eich [:brendan]
no flags Details
latest errata file (7.43 KB, text/plain)
2004-10-05 10:20 PDT, Brendan Eich [:brendan]
no flags Details
checkpoint, passing some e4x tests now (with more errata found) (387.57 KB, patch)
2004-10-26 18:18 PDT, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
checkpoint, more tests passed, closing in (460.93 KB, patch)
2004-11-03 19:16 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
checkpoint, closer still, only a few not-yet-implemented errors left (411.17 KB, patch)
2004-11-11 19:45 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
checkpoint, just about done (424.34 KB, patch)
2004-11-12 16:21 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
version for 1.8a5, ifdef'd off still (116.03 KB, application/x-gzip)
2004-11-16 16:47 PST, Brendan Eich [:brendan]
no flags Details
fix a spec bug that I faithfully transcribed into a code bug (1.16 KB, patch)
2004-11-18 19:22 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
latest errata file (12.11 KB, text/plain)
2004-11-18 19:25 PST, Brendan Eich [:brendan]
no flags Details
better approach (2.22 KB, patch)
2004-11-19 18:29 PST, Brendan Eich [:brendan]
shaver: superreview+
brendan: approval1.8a5+
Details | Diff | Splinter Review
js/tests/e4x patch to fix test bugs and glitches (8.52 KB, patch)
2004-11-22 19:13 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
latest errata file (12.59 KB, text/plain)
2004-11-22 19:20 PST, Brendan Eich [:brendan]
no flags Details
updated e4x testsuite patch (16.85 KB, patch)
2004-11-24 17:30 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
patch I just checked in (58.28 KB, patch)
2004-11-24 19:10 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
Sample showing <> error lateness (65 bytes, application/x-javascript)
2004-11-27 19:05 PST, Nigel McFarlane
no flags Details
Sample showing <> error lateness (65 bytes, application/x-javascript)
2004-11-27 19:07 PST, Nigel McFarlane
no flags Details
latest e4x testsuite patch (19.60 KB, patch)
2004-11-29 12:33 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
scanner tweaks to improve diagnostics slightly; XML/XMLList parent() fixes to match spec and fixed tests (6.56 KB, patch)
2004-11-29 12:38 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
toXMLString() comment serialisation problem (152 bytes, text/plain)
2004-12-04 23:19 PST, Nigel McFarlane
no flags Details
Entity reference serialization problems (142 bytes, text/plain)
2004-12-04 23:43 PST, Nigel McFarlane
no flags Details
latest errata file (14.24 KB, text/plain)
2004-12-10 11:08 PST, Brendan Eich [:brendan]
no flags Details
latest errata file (14.79 KB, text/plain)
2004-12-22 16:27 PST, Brendan Eich [:brendan]
no flags Details
add JS_GetMethod API, use it in nsXPCWrappedJSClass::CallMethod (3.61 KB, patch)
2004-12-28 16:20 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
further fixes for dmose's JS component re-registration case (2.17 KB, patch)
2004-12-29 19:11 PST, Brendan Eich [:brendan]
shaver: review+
Details | Diff | Splinter Review
patch for bug pointed out in comment 71 (1.46 KB, patch)
2004-12-29 19:42 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review

Description Brendan Eich [:brendan] 2004-06-11 17:55:58 PDT
We should support ECMAScript for XML (E4X), which is a draft ECMA standard.

/be
Comment 1 Mike Shaver (:shaver -- probably not reading bugmail closely) 2004-06-13 07:50:19 PDT
Where's the spec?
Comment 2 Brendan Eich [:brendan] 2004-06-13 13:05:18 PDT
I'll try to pry it loose tomorrow at the ECMA TG1 meeting.

/be
Comment 3 Mike Shaver (:shaver -- probably not reading bugmail closely) 2004-06-13 19:56:55 PDT
http://lambda.weblogs.com/discuss/msgReader$6380 has some good feedback on the
E4X spec, much of which I agree with.  Maybe we can remedy some of them in the
JS24X stuff?
Comment 4 Brendan Eich [:brendan] 2004-08-20 17:13:12 PDT
Created attachment 156645 [details] [diff] [review]
work in progress, parses xml but doesn't yet codegen

But @a::b, *, etc. work.

I'm compiling a big E4X errata sheet; I'll mail that to the right folks next
week.

/be
Comment 5 Brendan Eich [:brendan] 2004-08-23 19:14:33 PDT
Created attachment 156860 [details] [diff] [review]
checkpoint
Comment 6 Brendan Eich [:brendan] 2004-09-30 22:38:05 PDT
Created attachment 160708 [details]
compressed checkpoint patch, nearly ready to branch
Comment 7 Brendan Eich [:brendan] 2004-09-30 22:41:03 PDT
Created attachment 160709 [details]
Errata so far

Possibly some of these claimed errata are erroneous -- there sure are a lot. 
The spec seems like a somewhat buggy translation of slightly buggy Java code,
which is unfortunate, but possibly better than a spec based on no
implementation!

/be
Comment 8 Igor Bukanov 2004-10-01 00:56:49 PDT
Another erata:

In E4X specs the section 11.5.1 The Abstract Equality Comparison Algorithm states:

   ...
   3. If Type(x) is the same as Type(y)
   ...
        c. If Type(x) is Object and x.[[Class]] == "Namespace", return the
results of the comparison x.uri == y.uri
    
That should be changed to 
        c. If Type(x) is Object and x.[[Class]] == "Namespace" and y.[[Class]]
== "Namespace", return the results of the comparison x.uri == y.uri

A similar treatment should be applied to QName part.


Note also that the algorithm can be simplified if [[Equals]] would be defined to
return triple true|false|null where null indicates that the object does not know
how to do the equality. Then the method can be defined for Namespace and QName
to return null if the argument is not Namespace/QName or do the proper job and
similarly for XML.

Then 1-2-3-4 in the algorithm can be replaced by:

1. If x implements [[Equals]],
  a. Call x.[[Equals]](y)
  b. If result of a is not null, return it
2. If y implements [[Equals]],
  a. Call y.[[Equals]](x)
  b. If result of a is not null, return it

Comment 9 Brendan Eich [:brendan] 2004-10-01 12:30:50 PDT
Sorry for sounding harsher than I intended in comment 7.  Consider that JS was
more than slightly buggy in the old days, and ECMA standardization of it
involved at least two other implementations.  I think E4X will benefit from
other attempts to implement the spec, and I hope the errata here are helpful.

/be
Comment 10 Brendan Eich [:brendan] 2004-10-02 14:47:30 PDT
A note on a pretty minor glitch in spec, that I'd like to share here (I'll roll
it into the complete errata file, which needs to be HTML-ized and checked for
errors):

11.1.4 The sub-grammar here allows tag content of the form <T A={e1}{e2}>,
       but ExcapeAttributeValue is called only on the result of evaluating e1,
       not e2, and unless e2 evaluates to the empty string, the resulting
       string concatenation will not be well-formed XML because only the e1
       result is escaped and quoted.                                           
            
                         
       The spec clearly intends that one can have exactly one {expr} in lieu
       of a properly quoted and escaped attribute value, but the cover grammar
       does not require this, and it's up to the XML parser to try to catch
       such errors.
                                                                                
       This simplifies the E4X spec, but it deserves at least an informative
       comment, if not a tighter cover grammar.  Implementations where the XML
       parsing is built into the JS parser can be simplified based on the
       restriction, and we don't want to court interop bugs where an {e2}
       that evaluates to the empty string is allowed in one implementation but
       not in another.

I remember bugging John about the cover grammar issue just before ECMA-357 was
finalized, but I didn't have time to find this quirk, or ambiguity.  I should
have seen it then; better late than never.

/be
Comment 11 Brendan Eich [:brendan] 2004-10-04 15:14:59 PDT
Created attachment 161086 [details]
compressed checkpoint patch, landing today

Need to fix a few more bugs, test again, and land.  Current plan is to avoid a
branch by landing on the trunk, configured off.

/be
Comment 12 Brendan Eich [:brendan] 2004-10-05 03:23:01 PDT
Checked into cvs.mozilla.org trunk, configured off (via jsconfig.h).  See the
TODO comment near the top of jsxml.c.

/be
Comment 13 Brendan Eich [:brendan] 2004-10-05 10:20:09 PDT
Created attachment 161180 [details]
latest errata file

I included the erratum Igor gave in comment 8.	Feel free to add more to this
bug, and I'll roll them up.

/be
Comment 14 Igor Bukanov 2004-10-07 11:05:33 PDT
Here is another errata:

ECMA 357, 11.1 Primary Expressions
...
Semantics
The production PrimaryExpression : PropertyIdentifier is evaluated as follows:

1. Let n be the result of evaluating PropertyIdentifier
2. Let name = ToXMLName(n)
3. While (true)
  a. If there are no more objects on the scope chain,
    i. Return undefined
  ^^^^^^^^^^^^^^^^^^^^^^^^
  b. Let o be the next object on the scope chain.
     NOTE: on the first iteration, o will be the first object on the scope chain
  c. If Type(o)  {XML, XMLList}
    i. Let hasProp be the result of calling the [[HasProperty]] method of o,
passing name as the property
    ii. If hasProp == true
      1. Return a value of type Reference whose base object is o and whose
property name is name

The algorithm should not return undefined for unbound name. Rather in this case
it should return a value of type Reference whose base object is the first XML or
XMLList object found on the scope chain or null otherwise and whose property
name is <i>name</i> so the algorithm should read:

1. Let n be the result of evaluating PropertyIdentifier
2. Let name = ToXMLName(n)
3. Let firstXML = null
4. While (true)
  a. If there are no more objects on the scope chain,
    i. Return a value of type Reference whose base object is firstXML and whose
property name is name. 
      NOTE: firstXML will be null if no XML or XMLList value was found on the
scope chain.
  b. Let o be the next object on the scope chain.
     NOTE: on the first iteration, o will be the first object on the scope chain
  c. If Type(o)  {XML, XMLList}
    i. Let hasProp be the result of calling the [[HasProperty]] method of o,
passing name as the property
    ii. If hasProp == true
      1. Return a value of type Reference whose base object is o and whose
property name is name
    iii. If firstXML == null
      1. Set firstXML to o 

Without the change while script consisting of the single "x" generates
ReferenceError, its E4X modifications, *::x or @x results in undefined.

In addition the following would not work if xmlList contains an element without
child subtag:

xmlList.(child.@name == 'something') 

as child would evaluates to undefined resulting to ReferenceError on child.@name.
Comment 15 Brendan Eich [:brendan] 2004-10-08 14:33:58 PDT
Igor, thanks for writing that up -- it was in the errata file at attachment
161180 [details] already at the top, in cryptic form: "??? @a::b => undefined if not found
in scope chain !?!?"  I noted this while implementing but forgot to go back and
explain that note.

/be
Comment 16 Igor Bukanov 2004-10-13 03:23:28 PDT
Erratum in 13.5.4.18 XMLList.prototype.propertyIsEnumerable ( P ):

The first step in the semantic algorithms should be changed from:

1. if ToNumber(P) is greater than or equal to 0 and ToNumber(P) is less than
x.[[Length]], return true

to

1. Let i = ToUint32(P)
2. If ToString(i) == P
  a. If i is greater than or equal to 0 and i is less than x.[[Length]], return true
  b. Otherwise return false
  
  
Without the change x.propertyIsEnumerable(0.1) or even
x.propertyIsEnumerable(-0.0) (depending on the meaning of "or equal to 0") can
return true. 
Comment 17 Igor Bukanov 2004-10-13 03:50:48 PDT
Possible errata in XMLList.prototype.child

13.5.4.5 defines XMLList.prototype.child(propertyName) to return empty list if
the list itself is empty as well without any checks for valid propertyName. 

That means that x.child(null) or x.child(undefined) returns empty list if x is
empty and throw TypeError when x is not empty during the eventual call to
toXMLName(null|undefined).

Compare that to 13.4.4.4 XML.prototype.attribute(attributeName) which always
throws TypeError when attributeName is null or undefined.

Should XMLList.prototype.child always require valid name even when the list is
empty?
Comment 18 Brendan Eich [:brendan] 2004-10-26 18:18:09 PDT
Created attachment 163500 [details] [diff] [review]
checkpoint, passing some e4x tests now (with more errata found)

It seems > does not need to be escaped in element values (text), and the test
in e4x/Expressions/11.1.4.js wants it not to be escaped -- this is a
ToXMLString erratum.  Also, an element's attribute and namespace uri values are
not escaped, but should be.

More when I get back from the ECMA TG1 meeting.

/be
Comment 19 Brendan Eich [:brendan] 2004-11-03 19:16:19 PST
Created attachment 164502 [details] [diff] [review]
checkpoint, more tests passed, closing in
Comment 20 Brendan Eich [:brendan] 2004-11-11 19:45:27 PST
Created attachment 165651 [details] [diff] [review]
checkpoint, closer still, only a few not-yet-implemented errors left

Plus some lurking bugs, and some bogus tests in the suite.  Testsuite errata
doc soon, along with updated spec errata.

Reduced context from -pu8 to fit within bugzilla's 500K patch size limit.

/be
Comment 21 Brendan Eich [:brendan] 2004-11-12 16:21:25 PST
Created attachment 165754 [details] [diff] [review]
checkpoint, just about done

Impurity lurking, plus testsuite bugs (I think) that give false failures in six
tests (could be real bugs in the SpiderMonkey e4x lurking still, causing one or
two of these):

# Retest List, smdebug, generated Fri Nov 12 16:13:00 2004.
# Original test base was: e4x.
# 129 of 129 test(s) were completed, 6 failures reported.
e4x/Expressions/11.6.1.js
e4x/Types/9.1.1.9.js
e4x/XML/13.4.3.js
e4x/XML/13.4.4.38.js
e4x/XML/13.4.4.39.js
e4x/Regress/regress-264369.js

/be
Comment 22 Brendan Eich [:brendan] 2004-11-16 16:47:24 PST
Created attachment 166178 [details]
version for 1.8a5, ifdef'd off still
Comment 23 Brendan Eich [:brendan] 2004-11-16 23:50:34 PST
I checked in something close to attachment 166178 [details], rs=shaver.  It passes the
js/tests suite, excluding e4x, configured without E4X support (the default for now).

With JS_HAS_XML_SUPPORT defined to 1 for JS_VERSION == 150 in jsconfig.h, it
passes all e4x tests but the six noted in comment 21, and seems pure undef
Purify on Windows (although I had some "ghost" leaks that were once real, then
fixed, but persisted even after the fixes, which demonstrably eliminated all
such leaks using the DEBUG_notme code in jsxml.c).

I'm leaving this bug open to work on further testing, correctness, footprint and
performance wins necessary to justify turning on E4X by default.  One idea: use
NSPR's dlopen/dlsym portable veneer to make the E4X support runtime-conditional,
and put the bulk of the code in a separate DLL/DSO.  Before resorting to that,
I'm going to try to recover space throughout SpiderMonkey, by reducing the use
of macros (optimizations done in the '90s on RISC targets, mostly, and possibly
counterproductive on superscalar x86 machines now).

/be 
Comment 24 Brendan Eich [:brendan] 2004-11-16 23:58:41 PST
Oh, and I put off thread safety for "later".

The plan is to avoid penalizing the all single-threaded use-cases and embeddings
by building on the zero-cost locking implemented in bug 54743.  In the case of
E4X, this isn't trivial, because just using a per-XML-object zero-cost lock
doesn't avoid nested locks and AB-BA deadlock hazards inherent in XMLLists,
[[Insert]], [[Append]], and [[Replace]].  Just for starters!

/be
Comment 25 Nigel McFarlane 2004-11-17 23:33:54 PST
Feedback in bug 270552 and bug 270553.

- N.
Comment 26 Brendan Eich [:brendan] 2004-11-18 19:22:45 PST
Created attachment 166406 [details] [diff] [review]
fix a spec bug that I faithfully transcribed into a code bug

Updated errata file next.

/be
Comment 27 Brendan Eich [:brendan] 2004-11-18 19:25:07 PST
Created attachment 166407 [details]
latest errata file

I really should clean this up and HTML-ize it, perhaps tomorrow.

/be
Comment 28 Brendan Eich [:brendan] 2004-11-18 19:28:36 PST
Comment on attachment 166406 [details] [diff] [review]
fix a spec bug that I faithfully transcribed into a code bug

I'd like to fix this for 1.8a5, just for the early adopters, but it's not a big
deal either way -- remember that E4X is configured off by default, and not
hooked up to the <script> tag by a MIME type parameter yet.

John, check my erratum claim here.  It sure seems like the
[[Replace]]-calls-[[Insert]] case, replacing at or beyond the length with a a
list value, will create a hole and an off-by-1 x.[[Length]].

/be
Comment 29 Brendan Eich [:brendan] 2004-11-18 19:31:46 PST
Comment on attachment 166406 [details] [diff] [review]
fix a spec bug that I faithfully transcribed into a code bug

Optimizing, since this is all #if'd off for 1.8a5.

/be
Comment 30 Asa Dotzler [:asa] 2004-11-18 20:17:22 PST
Comment on attachment 166406 [details] [diff] [review]
fix a spec bug that I faithfully transcribed into a code bug

a=asa for 1.8a5 checkin.
Comment 31 Brendan Eich [:brendan] 2004-11-19 16:36:30 PST
Comment on attachment 166406 [details] [diff] [review]
fix a spec bug that I faithfully transcribed into a code bug

Thanks, but there's a better patch that probably does what the spec should do:
avoid incrementing x.[[Length]] until we know the value being inserted at the
end is not an XMLList.

/be
Comment 32 Brendan Eich [:brendan] 2004-11-19 18:29:18 PST
Created attachment 166536 [details] [diff] [review]
better approach

I'll check this into 1.8a5.  Reviews welcome, after the fact! ;-)

/be
Comment 33 Brendan Eich [:brendan] 2004-11-19 18:32:07 PST
Comment on attachment 166536 [details] [diff] [review]
better approach

Restoring Asa's stamp.

/be
Comment 34 Brendan Eich [:brendan] 2004-11-22 19:13:24 PST
Created attachment 166821 [details] [diff] [review]
js/tests/e4x patch to fix test bugs and glitches

Who should review this patch?

/be
Comment 35 Brendan Eich [:brendan] 2004-11-22 19:20:05 PST
Created attachment 166822 [details]
latest errata file

I didn't get the time to HTML-ize this yet, but it grew a bit.	Diff against
the last one:

94a95,106
> 9.2.1.2 step 2(c)(ii) sets _y.[[Parent]] = r_ where _r_ is the result of
>	  [[ResolveValue]] called on _x.[[TargetObject]] in 2(a)(i).  This can
>	  result in text parenting text:
>
>	  var MYXML = new XML();
>	  MYXML.appendChild(new XML("<TEAM>Giants</TEAM>"));
>
>	  (testcase from Werner Sharp <wsharp@macromedia.com>).
>
>	  To match insertChildAfter, insertChildBefore, prependChild, and
>	  setChildren, we should silently do nothing in this case.
>

/be
Comment 36 Brendan Eich [:brendan] 2004-11-24 17:30:42 PST
Created attachment 166969 [details] [diff] [review]
updated e4x testsuite patch

- Vacuously ported e4x/shell.js's NL function to SpiderMonkey.
									       

- Fixed bugs (deviations from the ECMA-357 spec) in e4x/Expressions/11.6.1.js
  sections 3 and 6.
									       

- Fixed backward use of ignoreWhitespace (or backward sense of equality test,
  but I went with minimal change here) in e4x/Types/9.1.1.9.js section 9.
									       

- e4x/XML/13.4.3.js repeated used XML.defaultSettings() where it wanted
  XML.setSettings() -- this suggests that the verb-less "defaultSettings"
  name is confusing, because hackers love to verb nouns, and "default" is
  often used as a verb, so XML.defaultSettings connotes "set defaults", not
  "get defaults".
									       

- e4x/XML/13.4.4.38.js sets XML.prettyPrinting = false but then expects
  XMLLists to use NL() as the list element separator, contrary to 10.2.2
  Step 2(a).  Same goes for 13.4.4.39.js.
									       

- Added missing newline at end of file to a bunch of tests.
									       

- Fixed misspelled "arbitrary" in e4x/Regress/regress-257679.js.

I'd like to check this in soon, but not before someone reviews the substantive
changes and agrees (and files any needed Rhino E4X bugs).  Igor, are you game?

/be
Comment 37 Brendan Eich [:brendan] 2004-11-24 19:10:39 PST
Created attachment 166981 [details] [diff] [review]
patch I just checked in

to finish off several PutProperty bugs, a few other odd bugs, and fixes based
on the recently discovered errata.

/be
Comment 38 Brendan Eich [:brendan] 2004-11-24 19:16:26 PST
Breaking out the E4X <=> DOM hookup to the bug that Nigel kindly filed.  This
bug can track that, and accumulate errata and testsuite fixes, for a while.

When the DOM hookup is done, I'd like to close both this bug and bug 270553, and
let any further problems (which should arise at much lower rate) be tracked by
individual bugs.

For regression testing purposes, it would have been ideal to make separate bugs
for each erratum or SpiderMonkey bug fixed by the last few patches attached
here, but that would have slowed things down to a crawl.  The consolidated
errata file can help tell us where we need to add more tests, instead of a list
of micro-bugs to treat as "regressions".

/be
Comment 39 Igor Bukanov 2004-11-25 02:55:00 PST
(In reply to comment #36)
> Created an attachment (id=166969)
> updated e4x testsuite patch
> 
> I'd like to check this in soon, but not before someone reviews the substantive
> changes and agrees (and files any needed Rhino E4X bugs).  

The changes triggered test failures in Rhino but AFAICS these are real bugs.
Note also that e4x/XMLList/13.5.4.16.js contains:

x = new XML();
TEST(2, undefined, x.parent());

Should it be changed to null?
> /be

Comment 40 Nigel McFarlane 2004-11-25 05:53:37 PST
Brief feedback.

Errata: Section 8 of E4X incorrectly pluralizes "Punctuators"
in the second paragraph.

Bugs: js -x could error more gracefully and earlier when presented
with this test data:

var a = <>;
// put anything lexable here

Seen with Linux/gcc 3.3.3-7, latest trunk.

- N.
Comment 41 Brendan Eich [:brendan] 2004-11-25 10:00:41 PST
Igor wrote:

> x = new XML();
> TEST(2, undefined, x.parent());
> 
> Should it be changed to null?

Yes, to match the spec.  I changed SpiderMonkey's E4X to match the testsuite the
other week, and noted the confusion in the errata file:

13.5.4.16 Steps 1 and 3 specify undefined return, not null as is done for
          XML.prototype.parent().  The testsuite wants null for "no parent"
          from XMLList.prototype.parent().

It's handy to be able to distinguish "not there" from "null" in cases such as
this, so I think the spec has it right.

I'll post an updated testsuite patch with this fix, and a SpiderMonkey E4X patch
to match, within the next day or two.

/be
Comment 42 Nigel McFarlane 2004-11-27 19:05:18 PST
Created attachment 167237 [details]
Sample showing <> error lateness

Now that it's not the wee hours, here's a more coherent report.
Error from this attachment reports Line 5 not 2:

5: SyntaxError: missing } in XML expression:
5:   flag = false;
5: ..............^

- N.
Comment 43 Nigel McFarlane 2004-11-27 19:07:57 PST
Created attachment 167238 [details]
Sample showing <> error lateness

Now that it's not the wee hours, here's a more coherent report.
Error from this attachment reports Line 5 not 2:

5: SyntaxError: missing } in XML expression:
5:   flag = false;
5: ..............^

- N.
Comment 44 Brendan Eich [:brendan] 2004-11-29 12:32:14 PST
Comment on attachment 167238 [details]
Sample showing <> error lateness

The original Netscape HTML parser went to great lengths to cope with unclosed
tags and improperly terminated entities.  Of course here we don't want to
"correct" errors by auto-closing tags, but how much code footprint do we want
to throw at diagnosing these issues?

Food for thought.

/be
Comment 45 Brendan Eich [:brendan] 2004-11-29 12:33:43 PST
Created attachment 167361 [details] [diff] [review]
latest e4x testsuite patch

Fixes tests to-do with XML.prototype.parent() and XMLList.prototype.parent().

SpiderMonkey patch next.

/be
Comment 46 Brendan Eich [:brendan] 2004-11-29 12:38:20 PST
Created attachment 167362 [details] [diff] [review]
scanner tweaks to improve diagnostics slightly; XML/XMLList parent() fixes to match spec and fixed tests

I'm checking this in today, along with the testsuite patch.

/be
Comment 47 Nigel McFarlane 2004-12-04 23:19:33 PST
Created attachment 167900 [details]
toXMLString() comment serialisation problem

Trivial case showing non-comment serialization in toXMLString()

- N.
Comment 48 Nigel McFarlane 2004-12-04 23:43:12 PST
Created attachment 167901 [details]
Entity reference serialization problems

Trivial test case showing entity reference loss with to[XML]String().

- N.
Comment 49 Brendan Eich [:brendan] 2004-12-06 11:56:56 PST
Re: comment 47: comments are ignored by default (XML.ignoreComments defaults to
true -- set it to false to get the desired results).

Re: comment 48: that's a bug, I'll fix it (feel free to report such lesser
followup bugs separately from this bug).  Thanks,

/be
Comment 50 Brendan Eich [:brendan] 2004-12-06 14:13:58 PST
Fixed the bug reported in comment 48.  This leads to results that might be
considered surprising:
 
toString()
---------
&nbsp;
 
toXMLString()
---------
<tag>&amp;nbsp;</tag>

But that's per spec, and applies to any unknown entity (nbsp is not defined by
the XML spec).

/be
Comment 51 Hixie (not reading bugmail) 2004-12-06 15:03:45 PST
Just out of interest, why wouldn't:
   var xml = <tag>&nbsp;</tag>;
...cause a well-formedness error?
Comment 52 Brendan Eich [:brendan] 2004-12-06 15:19:55 PST
Hixie: cuz my integrated XML parser isn't doing that kind of well-formedness
check?  D'oh.  Easily fixed, thanks....

/be
Comment 53 Brendan Eich [:brendan] 2004-12-06 15:26:51 PST
So E4X is written without specifying XML apart from 1.0 (vs. 1.1, e.g. in what
makes an identifier an XMLName).  But it doesn't say anything about entity refs
that refer to entities other than the built-in XML 1.0 ones (amp, lt, gt, apos,
quot).

This means that an XML parser used in conjunction with a JS engine to implement
E4X could process <!ENTITY ...> declarations and expand entity references, but
XML.prototype.toXMLString would not convert these expansions back into
references. That means no round-tripping via toXMLString, which seems like a bug
to me.  Probably ECMA TG1 considered this already.

Anyway, 10.3.2.1 step 8 is quite clear, and SpiderMonkey's E4X should throw a
TypeError for any unexpanded entity ref.  I'll fix that now.

/be
Comment 54 Hixie (not reading bugmail) 2004-12-06 17:00:33 PST
E4X's XML language doesn't support <!ENTITY> at all, right? So to end up in this
position you'd need something like:

   document.createEntityReference('nbsp').toXML().toXMLString();

...right? That would presumably just return a string consisting of U+00A0,
assuming "document" is an HTML document or an XML document with the appropriate
entity defined. Do the specs define this well enough?
Comment 55 Brendan Eich [:brendan] 2004-12-08 10:30:15 PST
Hixie:

>    document.createEntityReference('nbsp').toXML().toXMLString();

There is no toXML in the DOM specs, but if we added such a thing, we'd have to
extend ECMA-357 as well.  10.3.2.1 MapInfoItemToXML step 8 says:

8. If i is an unexpanded entity reference information item
    a. Throw a TypeError exception

/be
Comment 56 Hixie (not reading bugmail) 2004-12-08 12:46:45 PST
> There is no toXML in the DOM specs

My bad, I misunderstood ECMA-357 section 10.3. Take this example instead:

   new XML(document.createEntityReference('nbsp')).toXMLString();


> 10.3.2.1 MapInfoItemToXML step 8 says:
> 
> 8. If i is an unexpanded entity reference information item
>     a. Throw a TypeError exception

By "unexpanded" I assume they mean any entity reference, as opposed to one that
has been flattened. I guess that works. So basically, entities in general aren't
handled by E4X. (That's a good thing. I wish they'd dropped entities other than
the numeric ones and the five predefined ones from XML in the first place.)
Comment 57 Brendan Eich [:brendan] 2004-12-10 11:08:41 PST
Created attachment 168427 [details]
latest errata file

This should include everyone's errata -- if not, please mail me.

/be
Comment 58 Igor Bukanov 2004-12-10 12:48:16 PST
Regarding erratas:

It would be nice if E4X clarifies whether XML.[[Get]] or Object.[[Get]] should
be used during prototype lookup if the head of the prototype chain is Object.
Consider:

function Foo() { }
Foo.prototype = XML.prototype;

var bar = new Foo();

var test = bar.toXMLString

The question is whether test should refer to the toXMLString method defined in
XML.prototype or should it be empty XMLList?

Here is ECMAScript for Object.[[Get]] from ECMA-262 v3, section 8.6.2.1:

--------------------------------------
8.6.2.1 [[Get]] (P)

When the [[Get]] method of O is called with property name P, the following steps
are taken:

1 . If O doesn't have a property with name P, go to step 4.
2 . Get the value of the property.
3 . Return
4 . If the [[Prototype]] of O is null, return undefined.
5 . Call the [[Get]] method of [[Prototype]] with property name P.
6 . Return Result(5).
--------------------------------------

Now E4X specs gives in ECMA-357, section 9.1.1.1, first paragraph:

--------------------------------------
9.1.1.1 [[Get]] (P)
Overview

The XML type overrides the internal [[Get]] method defined by the Object type.
...
--------------------------------------


Given the traditional definition of "override" this tells that the prototype
lookup should use XML.[[Get]] and in the example "test" should be empty XMLList.

On the other hand the 3-rd paragraph of 9.1.1.1 [[Get]] (P) states:

--------------------------------------
NOTE
Unlike the internal Object [[Get]] method, the internal XML [[Get]] method is
never used for retrieving methods associated with XML objects. E4X modifies the
ECMAScript method lookup semantics for XML objects as described in section 11.2.2.
--------------------------------------

which indicates that Object.[[Get]] should always call Object.[[Get]] and "test"
should alias XML.prototype.toXMLString method. Note that 11.2.2 does not
consider the case of prototype chain and simple refers to Object.[[Get]] definition.

So what should be the right thing? I would prefer the second interpretation
since as far as I can see it is the only way to get reference to methods defined
in XML.prototype to use them in continuation with Function.protype.apply, but it
is just my preference.



Comment 59 Brendan Eich [:brendan] 2004-12-16 12:12:26 PST
NCRs are implemented now, and all entity references other than the fab five
defined in XML 1.0 are illegal.

/be
Comment 60 Brendan Eich [:brendan] 2004-12-22 16:27:53 PST
Created attachment 169406 [details]
latest errata file

Diff with previous errata file on left:

113a114,124
> 9.2.1.2 Step 2(e)(i, ii),
> 9.2.1.2 Step 7(e)(i),
> 9.2.1.3 Step 2(b)(ii)(1)(a)
>	  All uses of a.[[Name]] for an attribute a in these sections that pass

>	  that QName object to [[Delete]] must pass an AttributeName cloned
from>	      a.[[Name]].  The [[Name]] internal property is always a QName
instance>	  and never an AttributeName or AnyName instance.  But
[[Delete]] will
>	  not operate on x.[[Attributes]] when given a QName by these sections,

>	  so a child could be wrongly deleted instead of the attribute of the
>	  same name.
>

/be
Comment 61 Brendan Eich [:brendan] 2004-12-22 16:33:16 PST
That was ugly -- I wish bugzilla would wrap long text better, or not at all if
it appears to be "cited" (diff > at start of line looks like mail citation line
mark) -- fudging the indentation so the diff is readable, to foster discussion:

113a114,124
> 9.2.1.2 Step 2(e)(i, ii),
> 9.2.1.2 Step 7(e)(i),
> 9.2.1.3 Step 2(b)(ii)(1)(a)
>     All uses of a.[[Name]] for an attribute a in these sections that pass
>     that QName object to [[Delete]] must pass an AttributeName cloned from
>     a.[[Name]].  The [[Name]] internal property is always a QName instance
>     and never an AttributeName or AnyName instance.  But [[Delete]] will
>     not operate on x.[[Attributes]] when given a QName by these sections,
>     so a child could be wrongly deleted instead of the attribute of the
>     same name.
>

The problem in these steps is that an attribute's [[Name]] is used as if it were
of (internal) type AttributeName, but it's specified in 9.1.1 that [[Name]] must
be null or a QName object.

/be
Comment 62 Brendan Eich [:brendan] 2004-12-22 16:37:56 PST
Oh, and I just enabled E4X at compile time, with shaver prodding me to action. 
The footprint hit is ~80k, but I have some wins coming, and this is a case where
we gain a feature that may be worth the cost, even if we can't save elsewhere to
try to reduce the size of the hit.

/be
Comment 63 Brendan Eich [:brendan] 2004-12-28 16:20:50 PST
Created attachment 169767 [details] [diff] [review]
add JS_GetMethod API, use it in nsXPCWrappedJSClass::CallMethod

dmose pointed out the followup bug patched here.  E4X breaks the invariant,
preserved until now for all JS objects, native or host: for all method m in an
object o, o.m(arguments) is equivalent to var f = o.m; f.apply(o, arguments). 
With E4X, you cannot extract a function-valued method from an XML instance.

In SpiderMonkey's E4X, you can extract a method using the function:: namespace,
but that extension has not made its way back to ECMA, and it doesn't, in any
case, help prevent the broken symmetry from rippling through other layers of C
or C++ code built on the JS API.  In this case, XPConnect.

Still seems like a lose to break the invariant for XML, but I haven't got a way
to unify the conflicting goals:

1.  For E4X, that every get of a property id that is not an index (in the case
of an XMLList; reserved and an error for XML) returns an XMLList, which may be
empty, to return all matching attributes or children.

2.  Preserving the JS method extraction invariant mentioned above, and related
ones (call as well as apply, manual invocation if the |this| parameter binding
to o in the examples above doesn't matter, or can be faked via o.m2 = f;
o.m2(arguments)).

Since 1 does not apply uniformly to all XML properties (get of an index id on
an XML object is illegal, reserved for future use; get on an XMLList returns
the indexed child, not an XMLList containing the indexed child), I would simply
add another special case to 1, preserving 2.  But it's a bit late for that --
dmose is using E4X for CalDAV hacking already.

/be
Comment 64 Brendan Eich [:brendan] 2004-12-28 19:35:02 PST
I checked in a slightly better version of attachment 169767 [details] [diff] [review], so dmose can (I
hope) get past the problem he's seeing.

/be
Comment 65 Igor Bukanov 2004-12-29 04:14:22 PST
(In reply to comment #63)
...
> In SpiderMonkey's E4X, you can extract a method using the function::namespace,
but that extension has not made its way back to ECMA, 

You can extract method in the standard E4X ( well, at least the way I read E4X
specs, see comment 58) via:

function getMethod(object, name)
{
    var Proxy = function() { }
    Proxy.prototype = object;
    var proxy = new Proxy();
    return proxy[name];
}

and then a.function::method can be done via getMethod(a, "method")

> 
> 1.  For E4X, that every get of a property id that is not an index (in the case
> of an XMLList; reserved and an error for XML) returns an XMLList, which may be
> empty, to return all matching attributes or children.
> 
> 2.  Preserving the JS method extraction invariant mentioned above, and related
> ones (call as well as apply, manual invocation if the |this| parameter binding
> to o in the examples above doesn't matter, or can be faked via o.m2 = f;
> o.m2(arguments)).

In Rhino the above is satisfied via allowing to apply "()" to XMLList which
remember sufficient history to be able to support var f = xmlobject.m;
f.apply(xmlobject, arguments) and related operations.


> 
> Since 1 does not apply uniformly to all XML properties (get of an index id on
> an XML object is illegal, reserved for future use; get on an XMLList returns
> the indexed child, not an XMLList containing the indexed child), I would simply
> add another special case to 1, preserving 2.  But it's a bit late for that --
> dmose is using E4X for CalDAV hacking already.
> 
> /be
Comment 66 Brendan Eich [:brendan] 2004-12-29 10:09:23 PST
> and then a.function::method can be done via getMethod(a, "method")

Clever!  But it still doesn't handle name collisions between prototype methods
and (user-defined) XML children.

> In Rhino the above is satisfied via allowing to apply "()" to XMLList which
> remember sufficient history to be able to support var f = xmlobject.m;
> f.apply(xmlobject, arguments) and related operations.

Again the problem is that someone may write XML of the form:

  x = <parent><name>...</name><child id="1">...</child>...</parent>;

and then you cannot call x.child() or x.name().

E4X really wanted to put XML.prototype.* methods delegated from XML instances
into a separate namespace, but instead specified so that they can't be extracted
easily or in general.  It sounded at the last TG1 meeting as though everyone
agreed with the idea of putting the methods in a separate namespace, which would
be automatically selected when calling.

Then at least the following issues arise:

(1) What URI for the namespace?  It would not be useful to use a URI that could
identify an XML schema, since no schema can describe the grammar for native
functions.  It would be better if it were a reserved "anti-URI", which could not
collide with any XML schema URI.

(2) Should there be a shorthand, _a la_ SpiderMonkey's function::child use of
the function reserved identifier, so that users don't have to repeatedly create
a function namespace (var fn = new Namespace("<anti-URI goes here>"); var f =
x.fn::child; etc.)?  Obviously I think so, although "function" isn't that short
(but at least it's reserved, and clear).

/be
Comment 67 Igor Bukanov 2004-12-29 12:48:32 PST
(In reply to comment #66)
> 
> Again the problem is that someone may write XML of the form:
> 
>   x = <parent><name>...</name><child id="1">...</child>...</parent>;
> 
> and then you cannot call x.child() or x.name().

But why? 

var f = x.child; f.apply(x); 
can be made to work since '()' can check for [[TargetProperty]]/[[TargetObject]]
of child to see if it come from xml and then check for Function.prototype as a
source of additional properties since x.child() is defined.

In fact it does not work in Rhino this way due to fixable bugs.


> 
> E4X really wanted to put XML.prototype.* methods delegated from XML instances
> into a separate namespace, but instead specified so that they can't be extracted
> easily or in general.  It sounded at the last TG1 meeting as though everyone
> agreed with the idea of putting the methods in a separate namespace, which would
> be automatically selected when calling.
> 
> Then at least the following issues arise:
> 
> (1) What URI for the namespace?  It would not be useful to use a URI that could
> identify an XML schema, since no schema can describe the grammar for native
> functions.  It would be better if it were a reserved "anti-URI", which could not
> collide with any XML schema URI.
> 
> (2) Should there be a shorthand, _a la_ SpiderMonkey's function::child use of
> the function reserved identifier, so that users don't have to repeatedly create
> a function namespace (var fn = new Namespace("<anti-URI goes here>"); var f =
> x.fn::child; etc.)?  Obviously I think so, although "function" isn't that short
> (but at least it's reserved, and clear).
> 
> /be

Comment 68 Igor Bukanov 2004-12-29 13:10:38 PST
(In reply to comment #66)
> Then at least the following issues arise:
> 
> (1) What URI for the namespace?  It would not be useful to use a URI that could
> identify an XML schema, since no schema can describe the grammar for native
> functions.  It would be better if it were a reserved "anti-URI", which could not
> collide with any XML schema URI.
> 
> (2) Should there be a shorthand, _a la_ SpiderMonkey's function::child use of
> the function reserved identifier, so that users don't have to repeatedly create
> a function namespace (var fn = new Namespace("<anti-URI goes here>"); var f =
> x.fn::child; etc.)?  Obviously I think so, although "function" isn't that short
> (but at least it's reserved, and clear).
> 
> /be

But why not to provide a global function, say methods(object), that would return
the method-only view of the object? Then one can write methods(x).child or even
methods(x)['child'] which is much more clear IMO then x[QName(fn, 'child')].

Comment 69 Brendan Eich [:brendan] 2004-12-29 13:57:06 PST
> can be made to work since '()' can check for [[TargetProperty]]/[[TargetObject]]
> of child to see if it come from xml and then check for Function.prototype as a
> source of additional properties since x.child() is defined.

The E4X spec says that the XMLList returned in that example should contain one
element for reach immediate child of the parent tag, and not include a function
object in the array.  We'd have to say where it goes, and it would make the list
inhomogeneous, breaking operators such as the filtering predicate.  It's yet
more asymmetry.

Second reason not to do this is that it adds random logic to the XML method
calling (really, getting -- GetMethod is factored from CallMethod in
SpiderMonkey, and could be in E4X spec) path, which is already complicated enough.

The Macromedia folks commented on how the magic-ness of E4X objects required
extra random logic to be added to generic and type-specific layers, and that was
true in SpiderMonkey and (IIRC) in Rhino too.  Simpler is better, and a
namespace isolates all functions and preserves homogeneity of XMLList [[Get]]
results.

> But why not to provide a global function, say methods(object), that would
> return the method-only view of the object? Then one can write
> methods(x).child or even methods(x)['child'] which is much more clear IMO
> then x[QName(fn, 'child')].

First, there's no need to write x[QName(fn, 'child')] -- x.fn::child works and
is the natural way to express it in E4X or ECMA Edition 4.

Second, a global function would have to create or find a safely reusable
methods-only view object.  That's ineffecient.

Third, a global function pollutes the global namespace.

Given namespaces in E4X and Edition 4, why would you not use them to handle this
asymmetry imposed by E4X's current design, which was based on the reasonable
goal mentioned in comment 63 (numbered 1 there): to have homogeneous lists of
child or attribute results from [[Get]]?

Using a global method is a "JS 1" type of hack (I should know, I perpetrated JS
1 and its hacks), given namespaces.  We should not spend much more space arguing
in this bug, so if you still feel strongly, write something up and attach it. 
I'll take it to the TG1 group.

/be
Comment 70 Igor Bukanov 2004-12-29 18:45:52 PST
(In reply to comment #69)
> > can be made to work since '()' can check for [[TargetProperty]]/[[TargetObject]]
> > of child to see if it come from xml and then check for Function.prototype as a
> > source of additional properties since x.child() is defined.
> 
> The E4X spec says that the XMLList returned in that example should contain one
> element for reach immediate child of the parent tag, and not include a function
> object in the array.  We'd have to say where it goes, and it would make the list
> inhomogeneous, breaking operators such as the filtering predicate.  It's yet
> more asymmetry.
> 
> Second reason not to do this is that it adds random logic to the XML method
> calling (really, getting -- GetMethod is factored from CallMethod in
> SpiderMonkey, and could be in E4X spec) path, which is already complicated enough.
> 
> The Macromedia folks commented on how the magic-ness of E4X objects required
> extra random logic to be added to generic and type-specific layers, and that was
> true in SpiderMonkey and (IIRC) in Rhino too.  Simpler is better, and a
> namespace isolates all functions and preserves homogeneity of XMLList [[Get]]
> results.
> 
> > But why not to provide a global function, say methods(object), that would
> > return the method-only view of the object? Then one can write
> > methods(x).child or even methods(x)['child'] which is much more clear IMO
> > then x[QName(fn, 'child')].
> 
> First, there's no need to write x[QName(fn, 'child')] -- x.fn::child works and
> is the natural way to express it in E4X or ECMA Edition 4.

I was refering to the case when property is not known at run time like in:

function callit(object, method_name)
{
    return object[method_name]();
}

then to prepare it for E4X I first thought it would have to be modified as:

function callit(object, method_name)
{
  if (typeof object == "xml") {
    return object[new QName(fn, method)]();
  }
  return object[method]();
}

which would be more verbouse then

function callit(object, method_name)
{
  return methods(object)[method]();
}

but then I forgot that :: can be followed by [] so callit would look like:

function callit(object, method_name)
{
  return object.function::[method]();
}

which is nice and I guess now function:: should go to Rhino as well.
Comment 71 Igor Bukanov 2004-12-29 19:09:52 PST
(In reply to comment #63)
> In SpiderMonkey's E4X, you can extract a method using the function:: namespace,

It sems that CVS tip does not treat function:: specially if it starts expression
like in:

with (<xml>...</xml>) {
  function::name();
}

Should FunctionStmt be XML-ised as well?
Comment 72 Brendan Eich [:brendan] 2004-12-29 19:11:58 PST
Created attachment 169878 [details] [diff] [review]
further fixes for dmose's JS component re-registration case

1)  Propagate the TCF_HAS_DEFXMLNS flag from the parser to the constant folder,
so that the latter doesn't fold XML literals whose default namespace can't be
known till runtime.

2)  The xml_mark object-op must call the native mark object-op (js_Mark),
because XML objects are like native objects: they have scopes (almost always, a
share in the XML.prototype object's scope), and they have slots (almost always,
just the first four).

/be
Comment 73 Brendan Eich [:brendan] 2004-12-29 19:42:35 PST
Created attachment 169881 [details] [diff] [review]
patch for bug pointed out in comment 71

This is a jsparse.c patch generated on top of the patch at attachment 169878 [details] [diff] [review].
Thanks, Igor!

/be
Comment 74 Brendan Eich [:brendan] 2004-12-30 09:45:55 PST
Comment on attachment 169878 [details] [diff] [review]
further fixes for dmose's JS component re-registration case

This patch and the next one after it are checked into the cvs trunk now.

/be
Comment 75 Brendan Eich [:brendan] 2004-12-31 00:32:02 PST
Another erratum:

13.3.5.4 [[GetNamespace]]
         10.3.2.1 Step 6(c) NOTE does not say how a [prefix] containing "no
         value" (see http://www.w3.org/TR/xml-infoset 2.2.3) maps to
         x.[[Name]].[[Prefix]], but it is reasonable to assume it maps to the
         *undefined* prefix value mentioned various places in ECMA-357 as a
         distinct value from any string value, to mean "no prefix".
 
         10.3.2.1 Step 6(h)(i)(1) specifies:
  
         1. If the [local name] property of a is "xmlns"
             a. Map ns.prefix to the empty string
 
         Now 13.3.5.4 [[GetNamespace]] Step 3 includes this note:
 
         NOTE: implementations that preserve prefixes in qualified names may
         additionally constrain ns, such that ns.prefix == q.[[Prefix]]
 
         But "" != undefined, or *undefined* represented by null or any other
         non-void value that is distinct from any string when compared with
         == (note: it would be better if the spec used ===, since 0 == "" and
         0 == "0").
 
         Therefore, without further special-casing in [[GetNamespace]],
         10.2.1 "ToXMLString Applied to the XML Type" step 11, given XML
         of the form:
 
         <t xmlns="http://foo.com"/>
 
         will call [[GetNamespace]] on x.[[Name]] = (prefix=undefined,
         localName="t") with argument { (prefix="", uri="http://foo.com") }.
         According to the NOTE in 13.3.5.4 Step 3, the singleton namespace
         won't match the QName because "" != undefined, so [[GetNamespace]]
         will create a new Namespace (prefix=undefined, uri="http://foo.com").
 
         Then 10.2.1 Step 12 will generate an implementation-defined
         namespace prefix distinct from any prefix in { "", undefined }, sets
         namespace.prefix to that generated string, and unions namespace into
         namespaceDeclarations.  The effect on the example ToXMLString input
         is the string:
 
         '<pfx:t xmlns:pfx="http://foo.com" xmlns:pfx2="http://foo.com"/>'
 
         (pfx2 and pfx may be the same generated string, if the prefix auto-
         generation algorithm deterministically computes a variation on a part
         of the namespace's uri.)
 
Need more toXMLString tests!

/be
Comment 76 Brendan Eich [:brendan] 2005-01-01 09:36:10 PST
Here's an even subtler suboptimality in the spec:

10.2.1 "ToXMLString applied to the String type" Step 12
        Erratum: if both namespace.prefix and x.[[Name]].[[Prefix]] are
        undefined, we know that x was named using the default namespace
        (proof: see [[GetNamespace]] and the Namespace constructor called
        with two arguments).  So we ought not generate a new prefix here,
        when we can declare namespace as the default namespace in x.
 
        The fix is to change Step 12 to something like this:
 
        12. If (namespace.prefix == undefined),
            a.  If (x.[[Name]].[[Prefix]] == undefined),
                i.  Let namespace.prefix be the empty string
            b.  Else
                ii. Let namespace.prefix be an arbitrary implementation
                    defined namespace prefix, such that there is no ns2 in
                    (AncestorNamespaces U namespaceDeclarations) with
                    namespace.prefix == ns2.prefix
            c.  [what was 12(b) goes here]
 
        This helps descendants inherit the namespace instead of redundantly
        redeclaring it with generated prefixes in each descendant.
 

If I'm missing a good reason not to serialize default namespaces via default
namespace declarations (xmlns=...) via toXMLString, let me know.  With the code
version of this fix in SpiderMonkey's E4X, the testsuite still passes fine (it
fails to test toXMLString on a deep tree that uses only the default namespace).
 This case, from dmose, is even better:

default xml namespace = "urn:ietf:params:xml:ns:caldav";
var queryXml =
  <calendar-query>
    <calendar-query-result/>
      <filter>
        <icalcomp-filter name="VCALENDAR">
          <icalcomp-filter name="VEVENT">
          </icalcomp-filter>
        </icalcomp-filter>
      </filter>
  </calendar-query>;
 
print(queryXml.toXMLString());

Without the fix, you get:

<caldav:calendar-query xmlns:caldav="urn:ietf:params:xml:ns:caldav">
  <caldav:calendar-query-result xmlns:caldav="urn:ietf:params:xml:ns:caldav"/>
  <caldav:filter xmlns:caldav="urn:ietf:params:xml:ns:caldav">
    <caldav:icalcomp-filter xmlns:caldav="urn:ietf:params:xml:ns:caldav"
name="VCALENDAR">
      <caldav:icalcomp-filter xmlns:caldav="urn:ietf:params:xml:ns:caldav"
name="VEVENT"></caldav:icalcomp-filter>
    </caldav:icalcomp-filter>
  </caldav:filter>
</caldav:calendar-query>

Painful!  With the fix, you get:

<calendar-query xmlns="urn:ietf:params:xml:ns:caldav">
  <calendar-query-result/>
  <filter>
    <icalcomp-filter name="VCALENDAR">
      <icalcomp-filter name="VEVENT"></icalcomp-filter>
    </icalcomp-filter>
  </filter>
</calendar-query>

Nice!

/be
Comment 77 Brendan Eich [:brendan] 2005-01-10 18:36:11 PST
The proposed fix in comment 76 needs fixing:

        12. If (namespace.prefix == undefined),
            a.  If (x.[[Name]].[[Prefix]] == undefined),
                i.  Let namespace.prefix be the empty string
                ii. If there exists a member ns2 of namespaceDeclarations that
                    has the empty string as its prefix
                    1.  Remove ns2 from namespaceDeclarations
            b.  Else
                ii. Let namespace.prefix be an arbitrary implementation
                    defined namespace prefix, such that there is no ns2 in
                    (AncestorNamespaces U namespaceDeclarations) with
                    namespace.prefix == ns2.prefix
            c.  [what was 12(b) goes here]

This came up via bug 277779, which wants XML.prototype.setNamespace (at least;
what about setName as well?) to auto-magically add to the declared namespaces,
but not to duplicate default namespace declarations.

/be
Comment 78 Igor Bukanov 2005-01-13 07:54:44 PST
Errata for E4X 13.1.2.1 isXMLName(value)

The specs states: 

2. If q.localName does not match the production NCName, return false

but it does not define the meaning of "match". This is important since NCName, 
http://w3.org/TR/xml-names11/#NT-NCName, includes among other thinks characters
beyond 64K limit, http://www.w3.org/TR/xml11/#NT-NameStartChar :

NameStartChar ::=  ... [#x10000-#xEFFFF]

Since JS strings are just 2byte chunks, the specs should define the meaning of
"match". IMO a clean way to deal with this is to require to interpret JS strings
as UTF-16 encoding and to define what to do if the JS string is not a valid UTF-16. 

Comment 79 Mike Shaver (:shaver -- probably not reading bugmail closely) 2005-01-19 15:25:48 PST
Comment on attachment 166536 [details] [diff] [review]
better approach

rs=shaver
Comment 80 Brendan Eich [:brendan] 2005-01-22 19:59:55 PST
Latest thinking from ECMA TG1, or the subset working on E4X as it goes to ISO:
the [[InScopeNamespaces]] internal property should be mapped from the XML
Infoset's [in-scope namespaces], excluding the xml: prefix's namespace.  Not
mapped from the [namespace attributes] property, as ECMA-357 10.3.2.1 step
6(g-h) have it.

TG1, or at least the subset consisting of Jeff Dyer of Macromedia and me, also
wants to constrain [[InScopeNamespaces]] to include only *used* namespaces.

This changes a bunch of the code I wrote in SpiderMonkey, and it probably
changes the Rhino E4X implementation.  It means that you can always compute the
declared namespaces (returned by namespaceDeclarations, and serialized in the
output of toXMLString) for an element are always the set difference of that
element's in-scope namespaces minus the union of its ancestors' in-scope
namespaces.  So there's no need for the |declared| flag in JSXMLNamespace.

More in a bit.

/be
Comment 81 Brendan Eich [:brendan] 2005-02-12 21:46:54 PST
Need to finish this off for 1.8:

- Revise namespace inheritance so [[InScopeNamespaces]] matches [in-scope
namespaces] from XML-infoset (this bug).

- Implement XML(document) for two-way DOM <-> E4X data binding (bug 270553).

- Enable E4X by default for XUL contexts (bug not filed yet).

/be
Comment 82 Brendan Eich [:brendan] 2005-04-12 15:54:37 PDT
More ECMA-357 errata:

I claim ECMA-357 has a spec bug in 10.2.1 "ToXMLString Applied to the XML Type":

17. For each an in attrAndNamespaces
 a. Let s be the result of concatenating s and the space <SP> character
 b. If Type(an) is XML and an.[[Class]] == "attribute"
  i. Let ans be a copy of the result of calling [[GetNamspace]] on a.[[Name]]
     with argument AncestorNamespaces

The last line should read

     with argument AncestorNamespaces U namespaceDeclarations

as namespaceDeclarations (the xmlns= and xmlns:prefix= attributes in this
element) are in-scope for this element's attributes.

I also claim 13.4.4.35 XML.prototype.setName and 13.4.4.36
XML.prototype.setNamespace are incomplete.  They say absolutely nothing about
updating [[InScopeNamespaces]].  Other parts of the spec manually update this
set, but these methods assume that it magically keeps itself up-to-date with
respect to x.[[Name]].

Warning: you cannot use AddInScopeNamespace easily from 13.4.4.{35,36} to fix
this erratum.

/be
Comment 83 Bob Clary [:bc:] 2005-04-15 15:10:53 PDT
errata nit: spelling "initializer" in 12.1 example on page 58, while rest of
document uses "initialiser"
Comment 84 Martin Honnen 2005-04-17 06:20:14 PDT
Section 10.1.2 ToString Applied to the XMLList Type of the specification says
the following:

Given an XMLList object l, ToString performs the following steps:
!!!1. If l.hasSimpleContent() == true
  a. Let s be the empty string
  b. For i = 0 to l.[[Length]]-1,
    i. If x[i].[[Class]] &#8713; {"comment", "processing-instruction"}
      1. Let s be the result of concatenating s and ToString(l[i])
  c. Return s
!!!2. Else
  a. Return ToXMLString(x)

Where does the 'x' in 1.b.i and 2.a come from? I guess that should be
  i. If l[i].[[Class]] &#8713; {"comment", "processing-instruction"}
and
  a. Return ToXMLString(l)
Comment 85 Asa Dotzler [:asa] 2005-08-03 15:13:47 PDT
we've got what we're going to get, I believe, so pulling this off of the branch
blocking list.
Comment 86 Biju 2005-12-26 10:27:07 PST
I see bug 321547, bug 321549 in E4X implementation
Comment 87 Jesse Ruderman 2007-04-02 22:35:56 PDT
WHy is this bug still open rather than marked as fixed?
Comment 88 Brendan Eich [:brendan] 2007-04-02 22:53:53 PDT
Not because of bug 349263, that can wait. It must be due to bug 270553.

/be
Comment 89 Jeff Walden [:Waldo] (remove +bmo to email) 2007-04-06 18:40:28 PDT
Since we've tracked errata here so far...

The specification currently says that, during parsing of an XML literal or a string passed to XML(), if XML.ignoreWhitespace is true whitespace-only text nodes are removed from the resultant XML object; it does not say that leading and trailing whitespace in XML text nodes should be trimmed.  Per discussion in bug 369394, this is a mistake in the specification -- when XML.ignoreWhitespace is true, leading and trailing whitespace should be stripped from XML text nodes.
Comment 90 Brendan Eich [:brendan] 2007-07-22 22:24:21 PDT
Someone else should lead the charge here; I've got bigger fish to fry.

/be
Comment 91 Matthias Versen [:Matti] 2013-01-02 23:42:06 PST
E4X will be removed again from Spidermonkey (bug 788293)

Note You need to log in before you can comment on or make changes to this bug.