Last Comment Bug 270553 - E4x<->DOM
: E4x<->DOM
Status: RESOLVED WONTFIX
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: Trunk
: All All
: P1 enhancement with 21 votes (vote)
: mozilla1.8beta2
Assigned To: general
:
Mentors:
http://www.ecma-international.org/pub...
: 321544 466819 (view as bug list)
Depends on: 318473
Blocks: e4x 311524
  Show dependency treegraph
 
Reported: 2004-11-17 23:33 PST by Nigel McFarlane
Modified: 2013-04-04 08:14 PDT (History)
66 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
part 1: enable <script type="text/javascript;e4x=1"> support (10.63 KB, patch)
2004-11-29 19:14 PST, Brendan Eich [:brendan]
jst: review+
jst: superreview+
Details | Diff | Splinter Review
part 1, v2 (10.09 KB, patch)
2004-12-23 23:05 PST, Brendan Eich [:brendan]
jst: review+
jst: superreview+
Details | Diff | Splinter Review

Description Nigel McFarlane 2004-11-17 23:33:11 PST
Can't find a separate bug for E4X<->DOM integration to put feedback in.

When JSCALLS_DOCUMENT_OBSERVER or whatever comes to pass, can
we please have (for docs) some one-line source code comments
stating what E4X optional semantics are implemented and "when
or if" DOM 2 Nodes are deep-copied or wrapped to/from native E4X
XML objects?
Comment 1 Brendan Eich [:brendan] 2004-11-24 14:50:51 PST
Pleaes don't use a whole bug to ask for code comments.  I'm morphing this bug
into the bug to hook up SpiderMonkey's E4X and the Gecko DOM.

/be
Comment 2 Brendan Eich [:brendan] 2004-11-29 19:14:44 PST
Created attachment 167390 [details] [diff] [review]
part 1: enable <script type="text/javascript;e4x=1"> support

I also had to fix a few resolve hooks to avoid assuming id is either an int or
a string jsval.  This is not a backward-compatible change, which is yet another
reason why XML support has to be enabled with a JSOPTION_XML flag.  The case
that breaks the old code's assumption here is JS_DEFAULT_XML_NAMESPACE_ID,
which is JSVAL_VOID -- an int-tagged jsval for which !JSVAL_IS_INT(id).

The alternatives for handling the "default xml namespace" property all entailed
more code in the core engine and in global object implementations, so I think
this slightly-incompatible change is worth making now.	If there are any other
resolve hooks, or getters or setters, that need tweaking, we should find them
soon, but I didn't see 'em running a trivial E4X-in-HTML test.

/be
Comment 3 Brendan Eich [:brendan] 2004-11-29 19:19:50 PST
Comment on attachment 167390 [details] [diff] [review]
part 1: enable <script type="text/javascript;e4x=1"> support

Darin helped update my string-fu to the new millennium!

/be
Comment 4 Johnny Stenback (:jst, jst@mozilla.com) 2004-11-30 13:48:24 PST
Comment on attachment 167390 [details] [diff] [review]
part 1: enable <script type="text/javascript;e4x=1"> support

r+sr=jst
Comment 5 Christian :Biesinger (don't email me, ping me on IRC) 2004-12-06 16:59:28 PST
Comment on attachment 167390 [details] [diff] [review]
part 1: enable <script type="text/javascript;e4x=1"> support

content/base/src/nsScriptLoader.cpp
+      PRInt32 e4xIndex = params.Find("e4x=", PR_TRUE);
+      if (e4xIndex == 0 || (e4xIndex > 0 && params[e4xIndex-1] == ';')) {

note that http://www.faqs.org/rfcs/rfc2045.html 5.1 does allow spaces around
the ';'; while I can't seem to find that allowance in the BNF, the examples
show it.


Maybe
http://lxr.mozilla.org/seamonkey/source/netwerk/mime/public/nsIMIMEHeaderParam.
idl would be useful here?
Comment 6 Brendan Eich [:brendan] 2004-12-06 17:59:20 PST
Examples in RFC 2045 show one space after the semicolon, but BNF should be
normative.  Maybe there's a followup RFC that corrects the grammar?

If spaces are allowed in the real world (that is, if content-type and other
headers actually sent contain spaces around semicolons), then we should allow it
here in nsScriptLoader.cpp.  It should be ok making content depend on necko, eh?

/be
Comment 7 Nigel McFarlane 2004-12-06 18:24:47 PST
I don't have a view on that, but I offer this other titbit:

content/base/src/nsScriptLoader.cpp
+      PRInt32 e4xIndex = params.Find("e4x=", PR_TRUE);

Case-insensitivity for e4x= would be nice, since people will have to
type it. It's also the default in RFC 2045 for media type parameters.

- N
Comment 8 Brendan Eich [:brendan] 2004-12-07 07:30:11 PST
Re: comment 7: that params.Find call passes PR_TRUE for aIgnoreCase, so it is
matching case-insensitively already.  Please read the source, and consider the
nearby version= matching that was done long aga in the same way.

/be
Comment 9 Christian :Biesinger (don't email me, ping me on IRC) 2004-12-07 15:10:34 PST
(In reply to comment #6)
> Examples in RFC 2045 show one space after the semicolon, but BNF should be
> normative.  Maybe there's a followup RFC that corrects the grammar?

Hm.. can't seem to find one...

> If spaces are allowed in the real world (that is, if content-type and other
> headers actually sent contain spaces around semicolons), then we should allow it
> here in nsScriptLoader.cpp.

They certainly do; the charset attribute in an http header has almost always a
space after the ';'. same for the "filename"-attribute of the
Content-Disposition header.

>  It should be ok making content depend on necko, eh?

especially as it already depends on necko ;)
http://lxr.mozilla.org/seamonkey/source/content/base/src/Makefile.in#58
Comment 10 Peter Van der Beken [:peterv] 2004-12-08 06:56:55 PST
(In reply to comment #5)
> note that http://www.faqs.org/rfcs/rfc2045.html 5.1 does allow spaces around
> the ';'; while I can't seem to find that allowance in the BNF, the examples
> show it.

I may be wrong, but I think that's covered in RFC 822
(http://www.faqs.org/rfcs/rfc822.html), point 3.1.4 which allows whitespace
between lexical tokens in structured fields.
Comment 11 Brendan Eich [:brendan] 2004-12-23 20:30:17 PST
So one awkwardness is that nsScriptLoader.cpp uses content interfaces such as
nsIScriptElement::GetScriptType, which return wide strings, while the Necko MIME
header parser uses narrow strings.  CopyUTF16ToUTF8 seems like the only way to
make the one pipe fit into the other....

/be
Comment 12 Brendan Eich [:brendan] 2004-12-23 22:37:33 PST
biesi wrote:

> especially as it already depends on necko ;)
> http://lxr.mozilla.org/seamonkey/source/content/base/src/Makefile.in#58

But nsIMIMEHeaderParam.h is installed under dist/include/mimetype, not
.../necko. Easy to fix, just wanted to make sure there's no problem adding
mimetype to the REQUIRES line.

/be

Comment 13 Brendan Eich [:brendan] 2004-12-23 23:05:39 PST
Created attachment 169496 [details] [diff] [review]
part 1, v2

Thanks for the tip, biesi -- code is better: smaller, more general.

/be
Comment 14 Johnny Stenback (:jst, jst@mozilla.com) 2004-12-24 02:45:25 PST
Comment on attachment 169496 [details] [diff] [review]
part 1, v2

+  if (aRequest->mHasE4XOption) {
+    ::JS_SetOptions(cx, JS_GetOptions(cx) | JSOPTION_XML);

If you want :: before JS_SetOptions() you should also add it before
JS_GetOptions().

r+sr=jst
Comment 15 Brendan Eich [:brendan] 2004-12-24 08:25:51 PST
Is this :: purity worth it?  When in Rome (even though the old buildings are
crumbling...).

/be
Comment 16 Christian :Biesinger (don't email me, ping me on IRC) 2004-12-24 18:03:35 PST
Comment on attachment 169496 [details] [diff] [review]
part 1, v2

+    nsCAutoString typeAndParams;
+    CopyUTF16toUTF8(type, typeAndParams);

maybe:
  NS_ConvertUTF16toUTF8 typeAndParams(type);
? just a random thought.
Comment 17 Brendan Eich [:brendan] 2004-12-25 08:38:26 PST
biesi: that actually makes more code (in my build, 73 bytes more of code), and
it always uses a heap allocation.  We want an auto-string here because the type
is always short enough to fit in one.

/be
Comment 18 Christian :Biesinger (don't email me, ping me on IRC) 2004-12-25 13:04:58 PST
wow, I'd have expected less code... ok, in that case your code is better. but as
for your heap allocation comment,
http://lxr.mozilla.org/seamonkey/source/xpcom/string/public/nsString.h#141
(hm... didn't know this class was implemented inline)
Comment 19 Brendan Eich [:brendan] 2004-12-25 14:52:01 PST
biesi: oops, I was looking at nsStringAPI.h:

xpcom/string/public/nsStringAPI.h:class NS_ConvertUTF16toUTF8 : public nsCString

How's that work with the nsString.h class def, again?

/be
Comment 20 Christian :Biesinger (don't email me, ping me on IRC) 2004-12-25 15:05:42 PST
this is starting to get offtopic... but I think that is the issue of bug 243109.
(doesn't nsStringAPI.h define its own nsCString too?)
Comment 21 Brendan Eich [:brendan] 2004-12-26 19:11:58 PST
biesi and I were chatting on IRC and his knowledge of nsString.h and how things
inline led to the discovery that the code size increase I noted was due to
--enable-debug, which enables -fno-inline.  With just -O2, there is no size
difference between NS_ConvertUTF16toUTF8 and the nsCAutoString/CopyUTF16toUTF8
pair -- so I'm using the one-line solution biesi proposed in the first place.

(More off-topic grumbling: why did the old global function names for NS_Convert*
turn into typenames, which should start with "ns" and no _ ?)

/be
Comment 22 Darin Fisher 2005-01-04 23:53:04 PST
> (More off-topic grumbling: why did the old global function names for NS_Convert*
> turn into typenames, which should start with "ns" and no _ ?)

been like that for ages.  ask scc maybe? ;-)
Comment 23 Eric Kidd 2005-09-30 12:28:38 PDT
In http://developer.mozilla.org/presentations/xtech2005/e4x/ , it says:

  # Create an XML wrapper for a DOM object:
  var doc = new XML(document);
  # Two-way data binding between DOM and E4X data models

In Firefox 1.5 Beta 1, this still dies:

  var d = new XML(document);
  Error: can't convert [object XMLDocument] to XML
Comment 24 Yuh-Ruey Chen 2005-10-13 20:35:22 PDT
I don't think that would be necessary once this bug is fixed. It should be
transparent, e.g.

document.body..div
Comment 25 Brendan Eich [:brendan] 2005-10-13 20:38:35 PDT
(In reply to comment #24)
> I don't think that would be necessary once this bug is fixed. It should be
> transparent, e.g.
> 
> document.body..div

No, you need to create an E4X wrapper for the document object, per ECMA-357.

/be
Comment 26 Boris Zbarsky [:bz] 2005-12-01 11:29:29 PST
The scriptloader patch for this bug caused bug 318473
Comment 27 Christian :Biesinger (don't email me, ping me on IRC) 2005-12-26 08:33:44 PST
*** Bug 321544 has been marked as a duplicate of this bug. ***
Comment 28 Biju 2005-12-26 10:13:14 PST
Firefox breaks at following 2 cases even though they are valid XML, 
is this same bug? 

==== case 1
doc =
<?xml version="1.0"?>
<!DOCTYPE doc [
<!ATTLIST d id ID #IMPLIED>
]>
<doc>
  <d id="id3">Three</d>
</doc>;

==== case 2

str='\
<?xml version="1.0"?>\n\
<!DOCTYPE doc [\n\
<!ATTLIST d id ID #IMPLIED>\n\
]>\n\
<doc>\n\
  <d id="id3">Three</d>\n\
</doc>';

doc = new XML(str);
Comment 29 Hixie (not reading bugmail) 2005-12-26 10:45:32 PST
No, that's a different bug (if it's a bug at all, which I don't think it is).
Comment 30 Biju 2005-12-26 12:01:22 PST
(In reply to comment #29)
> No, that's a different bug 

created bug 321564
Comment 31 Biju 2005-12-26 12:06:44 PST
For E4x<->DOM 
there should be a way to handle &nbsp; present in DOM other wise we are going to miss content/layout when do E4x<->DOM 
Comment 32 Hixie (not reading bugmail) 2005-12-26 12:11:07 PST
What's wrong with using &#A0; ?
Comment 33 Biju 2005-12-26 20:39:09 PST
(In reply to comment #32)
> What's wrong with using &#A0; ?
Tanx.. &#xA0; will do
Comment 34 Myk Melez [:myk] [@mykmelez] 2006-03-17 16:38:41 PST
> No, you need to create an E4X wrapper for the document object, per ECMA-357.

devmo's "New in JavaScript 1.6" page ( http://developer.mozilla.org/en/docs/New_in_JavaScript_1.6 ) says "We're going to continue to improve our E4X support, including adding transparent integration with the existing DOM."  Is that not correct?

Comment 35 Brendan Eich [:brendan] 2006-03-17 19:12:02 PST
(In reply to comment #34)
> > No, you need to create an E4X wrapper for the document object, per ECMA-357.
> 
> devmo's "New in JavaScript 1.6" page (
> http://developer.mozilla.org/en/docs/New_in_JavaScript_1.6 ) says "We're going
> to continue to improve our E4X support, including adding transparent
> integration with the existing DOM."  Is that not correct?

I don't know who wrote that, but JS1.6 is done, and this bug is not fixed in it.

/be
Comment 36 Mike Shaver (:shaver -- probably not reading bugmail closely) 2006-03-19 08:19:09 PST
(In reply to comment #35)
> (In reply to comment #34)
> > > No, you need to create an E4X wrapper for the document object, per ECMA-357.
> > 
> > devmo's "New in JavaScript 1.6" page (
> > http://developer.mozilla.org/en/docs/New_in_JavaScript_1.6 ) says "We're going
> > to continue to improve our E4X support, including adding transparent
> > integration with the existing DOM."  Is that not correct?
> 
> I don't know who wrote that, but JS1.6 is done, and this bug is not fixed in
> it.

Deb wrote it, based on a conversation with you and me, and it doesn't say that the bug is fixed in JS1.6.  The full sentence is:

"We're going to continue to improve our E4X support, including adding transparent integration with the existing DOM, but developers who are building XML-based web applications can benefit from E4X support in Firefox 1.5."

So it'll get better in the future, but you can already take advantage of E4X in Fx1.5/JS1.6.

But the question at hand is: will an author need to explicitly wrap DOM objects with |new XML|, or will the support be transparent?
Comment 37 Brendan Eich [:brendan] 2006-03-19 14:29:13 PST
Not sure how to make things transparent without breaking backward compat: given DOM object d, would you make E4X methods available on d via its prototype chain? That would break backward compat.

/be
Comment 38 Biju 2006-10-31 23:57:42 PST
(In reply to comment #37)
> Not sure how to make things transparent without breaking backward compat: given
> DOM object d, would you make E4X methods available on d via its prototype
> chain? That would break backward compact.

Till we have a solution, cant we have an xml property for a DOMNode which exposes it as an E4X object, and can be manipulated to change the DOM object. 
ie, like 
1. document.mozXML ;
2. document.mozXML.head.title = "a new title text";
3. document.body.mozXML.p = <p>New body content</p>;
4. document.getElementById("sub_txt").mozXML.@value = "Hello world";

Comment 39 Jesse Thompson 2007-04-07 12:22:29 PDT
I would recommend at least supporting the E4X wrapping mechanism as specified in ECMA-357 first. That gives the E4X developers a painless way to manipulate the DOM without complicating backwards compatibility. Once the ECMA recommendation is sated, then support for the transparent DOM access will be that much easier, and that much less necessary in the E4X developer adoption path.

Per spec you can also rest assured that XML(document) won't break transitory scripts once document gains a native XML interface.
Comment 40 David Caldwell 2007-04-26 13:01:41 PDT
Rhino guy dropping in, as I'm grappling with this issue on that side.  Some of these semantics simply can't be transparent (at least not completely), due to underlying differences in the object models.  For example, a DOM node can only be present at one place in the tree, but an E4X node can be copied into the same tree multiple times.  See, for example, the discussion surrounding bug 312692.  (Rhino also has semi-related bug 369606.)

I'm open to suggestions on how to model this on the Rhino side.  Maybe if I mull a little more on this side I'll make suggestions of my own.  (Off the top of my head, my first thought would be to revisit the decisions we made in bug 312692 and maybe decide to make the semantics match.)
Comment 41 David Caldwell 2007-05-03 08:26:53 PDT
To expand on my previous comment ... what should the output of the following test case be?  (Forgive my pidgin DOM, I am not an expert here, just trying to illustrate.  If I've made any errors feel free to point them out; just trying to think this through here.)

Assume content of document is <foo><bar/></foo>.  Assume appropriate definition of print() from enclosing shell.
* * *
//  Create object that should remain synchronized with DOM object; see ECMA 357 //  sections 13.4.1 and 10.3.2 
var e4x = XML(document);
var baz = <baz/>;
e4x.bar.appendChild(baz);
e4x.appendChild(baz);
baz.@attribute = "value";
print(e4x..baz.length());
print(e4x..baz.(@attribute == "value").length());
print(document.getElementsByTagName("baz").length);
var matching = 0;
for (var i=0; i<document.getElementsByTagName("baz").length; i++) {
  if (document.getElementsByTagName("baz")[i].getAttribute("attribute") == "value")
    matching++;
}
print(matching);
* * *
There are a couple of possibilities here. 

The most obvious is to synchronize the DOM and E4X object models and override the decision we made in bug 312692.  This would make the previous test print:
1
1
1
1

The next most obvious (more painful) one is to have each E4X object potentially be a reference to many DOM nodes.  Then a change to a particular E4X object would be propagated to all DOM nodes to which it pertains.  Then, the previous test would print:

2
2
2
2

The next solution I can think of would have an E4X object have the ability to point to the last DOM object to which it pertained (similar to how parent() is supported in E4X currently; see bug 312692).  Then, I think, we'd see:

2
1 (? -- assuming E4X just wrapped DOM and didn't maintain its own data model)
2
1

Alas, these are not the only three solutions I can think of, just the ones that seem to have the most precedent to date, and I have droned on enough here.

I vote for them in the order presented.  Thoughts?
Comment 42 Jesse Thompson 2007-05-12 17:01:46 PDT
I can report that for the needs of my projects any of the solution vectors mentioned would suffice. Solution #2 would be the sweetest, and compliance with be's ideas in bug 312692 would be quite elegant.. but I can also appreciate the implementational concerns. I would rather have it available tomorrow then ideal half past never. 

The problem appears to be a philosophical difference between e4x and dom. If dom supported the notion of one dom object existing in multiple parts of the same tree, we would just be glomming off of that notion. 

Another solution #2 roadblock I see is this: 

/* document is a dom version of <foo><bar /></foo> */
var e4xA = new XML(document);
var e4xB = new XML(document);
var baz = <baz>;

e4xA.appendChild(baz);
e4xB.bar.appendChild(baz);
e4xA.baz.@x = 'y';
print(e4xB.bar.baz);

Since we are referencing e4x objects by default, solution 2 demands that wrapping two such objects around a common dom should share references.

I have considered generating a private singleton facade for a dom document that gets wrapped. This facade would be accessed by e4x objects that wish to wrap the dom or it's children. The facade would keep track of which dom nodes really represent the same object from e4x's perspective. But then the falldown is that direct manipulation of the dom would kill the facade's effectiveness.

So, I cannot suggest anything useful on my own besides siding with option #1. anyone else?

- - Jesse
Comment 43 Brendan Eich [:brendan] 2007-07-22 22:22:42 PDT
I'm not going to have time for this for the foreseeable future. I'm interested in a better E4X in light of JS2/ES4's type system, but that can be developed first via other means than this bug.

/be
Comment 44 HE Shi-Jun 2007-09-09 22:18:13 PDT
(In reply to comment #41)
> The next most obvious (more painful) one is to have each E4X object potentially
> be a reference to many DOM nodes.  Then a change to a particular E4X object
> would be propagated to all DOM nodes to which it pertains.  Then, the previous
> test would print:
> 
> 2
> 2
> 2
> 2
> 

What will happen if I change the DOM nodes then? For example:

document.getElementsByTagName("baz")[0].setAttribute("attribute", "another value")

baz.@attribute == ? ("value" or "another value")
document.getElementsByTagName("baz")[1].getAttribute("attribute") == ? ("value" or "another value")
Comment 45 David Caldwell 2007-09-11 09:25:16 PDT
Agreed that this is a problem.  There's an unambiguous answer to the second one -- "value" since that DOM node has not been changed.  In the proposed solution #2, the E4X node would be a facade for all the attached DOM nodes but the DOM would be the model that would take precedence.

However for the baz.@attribute case I can't come up with a simple rule that makes any sense.  Hence my preference for option 1. :)
Comment 46 Peter Van der Beken [:peterv] 2008-12-01 04:43:57 PST
*** Bug 466819 has been marked as a duplicate of this bug. ***
Comment 47 Kris Maglione [:kmag] 2012-08-31 13:26:07 PDT
Bah. E4X is dead.

Note You need to log in before you can comment on or make changes to this bug.