Closed Bug 270553 Opened 20 years ago Closed 12 years ago

E4x<->DOM

Categories

(Core :: JavaScript Engine, enhancement, P1)

enhancement

Tracking

()

RESOLVED WONTFIX
mozilla1.8beta2

People

(Reporter: nrm, Unassigned)

References

()

Details

Attachments

(1 file, 1 obsolete file)

Can't find a separate bug for E4X<->DOM integration to put feedback in.

When JSCALLS_DOCUMENT_OBSERVER or whatever comes to pass, can
we please have (for docs) some one-line source code comments
stating what E4X optional semantics are implemented and "when
or if" DOM 2 Nodes are deep-copied or wrapped to/from native E4X
XML objects?
Pleaes don't use a whole bug to ask for code comments.  I'm morphing this bug
into the bug to hook up SpiderMonkey's E4X and the Gecko DOM.

/be
Summary: code comments on E4x<->DOM → E4x<->DOM
Blocks: e4x
I also had to fix a few resolve hooks to avoid assuming id is either an int or
a string jsval.  This is not a backward-compatible change, which is yet another
reason why XML support has to be enabled with a JSOPTION_XML flag.  The case
that breaks the old code's assumption here is JS_DEFAULT_XML_NAMESPACE_ID,
which is JSVAL_VOID -- an int-tagged jsval for which !JSVAL_IS_INT(id).

The alternatives for handling the "default xml namespace" property all entailed
more code in the core engine and in global object implementations, so I think
this slightly-incompatible change is worth making now.	If there are any other
resolve hooks, or getters or setters, that need tweaking, we should find them
soon, but I didn't see 'em running a trivial E4X-in-HTML test.

/be
Comment on attachment 167390 [details] [diff] [review]
part 1: enable <script type="text/javascript;e4x=1"> support

Darin helped update my string-fu to the new millennium!

/be
Attachment #167390 - Flags: superreview?(jst)
Attachment #167390 - Flags: review?(jst)
Attachment #167390 - Attachment description: part 1: enable <script type="text/javascript;e4x-1"> support → part 1: enable <script type="text/javascript;e4x=1"> support
Comment on attachment 167390 [details] [diff] [review]
part 1: enable <script type="text/javascript;e4x=1"> support

r+sr=jst
Attachment #167390 - Flags: superreview?(jst)
Attachment #167390 - Flags: superreview+
Attachment #167390 - Flags: review?(jst)
Attachment #167390 - Flags: review+
Comment on attachment 167390 [details] [diff] [review]
part 1: enable <script type="text/javascript;e4x=1"> support

content/base/src/nsScriptLoader.cpp
+      PRInt32 e4xIndex = params.Find("e4x=", PR_TRUE);
+      if (e4xIndex == 0 || (e4xIndex > 0 && params[e4xIndex-1] == ';')) {

note that http://www.faqs.org/rfcs/rfc2045.html 5.1 does allow spaces around
the ';'; while I can't seem to find that allowance in the BNF, the examples
show it.


Maybe
http://lxr.mozilla.org/seamonkey/source/netwerk/mime/public/nsIMIMEHeaderParam.
idl would be useful here?
Examples in RFC 2045 show one space after the semicolon, but BNF should be
normative.  Maybe there's a followup RFC that corrects the grammar?

If spaces are allowed in the real world (that is, if content-type and other
headers actually sent contain spaces around semicolons), then we should allow it
here in nsScriptLoader.cpp.  It should be ok making content depend on necko, eh?

/be
Status: NEW → ASSIGNED
Priority: -- → P1
Target Milestone: --- → mozilla1.8alpha6
I don't have a view on that, but I offer this other titbit:

content/base/src/nsScriptLoader.cpp
+      PRInt32 e4xIndex = params.Find("e4x=", PR_TRUE);

Case-insensitivity for e4x= would be nice, since people will have to
type it. It's also the default in RFC 2045 for media type parameters.

- N
Priority: P1 → --
Target Milestone: mozilla1.8alpha6 → ---
Re: comment 7: that params.Find call passes PR_TRUE for aIgnoreCase, so it is
matching case-insensitively already.  Please read the source, and consider the
nearby version= matching that was done long aga in the same way.

/be
Priority: -- → P1
Target Milestone: --- → mozilla1.8alpha6
(In reply to comment #6)
> Examples in RFC 2045 show one space after the semicolon, but BNF should be
> normative.  Maybe there's a followup RFC that corrects the grammar?

Hm.. can't seem to find one...

> If spaces are allowed in the real world (that is, if content-type and other
> headers actually sent contain spaces around semicolons), then we should allow it
> here in nsScriptLoader.cpp.

They certainly do; the charset attribute in an http header has almost always a
space after the ';'. same for the "filename"-attribute of the
Content-Disposition header.

>  It should be ok making content depend on necko, eh?

especially as it already depends on necko ;)
http://lxr.mozilla.org/seamonkey/source/content/base/src/Makefile.in#58
(In reply to comment #5)
> note that http://www.faqs.org/rfcs/rfc2045.html 5.1 does allow spaces around
> the ';'; while I can't seem to find that allowance in the BNF, the examples
> show it.

I may be wrong, but I think that's covered in RFC 822
(http://www.faqs.org/rfcs/rfc822.html), point 3.1.4 which allows whitespace
between lexical tokens in structured fields.
So one awkwardness is that nsScriptLoader.cpp uses content interfaces such as
nsIScriptElement::GetScriptType, which return wide strings, while the Necko MIME
header parser uses narrow strings.  CopyUTF16ToUTF8 seems like the only way to
make the one pipe fit into the other....

/be
biesi wrote:

> especially as it already depends on necko ;)
> http://lxr.mozilla.org/seamonkey/source/content/base/src/Makefile.in#58

But nsIMIMEHeaderParam.h is installed under dist/include/mimetype, not
.../necko. Easy to fix, just wanted to make sure there's no problem adding
mimetype to the REQUIRES line.

/be

Attached patch part 1, v2Splinter Review
Thanks for the tip, biesi -- code is better: smaller, more general.

/be
Attachment #167390 - Attachment is obsolete: true
Attachment #169496 - Flags: superreview?(jst)
Attachment #169496 - Flags: review?(jst)
Comment on attachment 169496 [details] [diff] [review]
part 1, v2

+  if (aRequest->mHasE4XOption) {
+    ::JS_SetOptions(cx, JS_GetOptions(cx) | JSOPTION_XML);

If you want :: before JS_SetOptions() you should also add it before
JS_GetOptions().

r+sr=jst
Attachment #169496 - Flags: superreview?(jst)
Attachment #169496 - Flags: superreview+
Attachment #169496 - Flags: review?(jst)
Attachment #169496 - Flags: review+
Is this :: purity worth it?  When in Rome (even though the old buildings are
crumbling...).

/be
Comment on attachment 169496 [details] [diff] [review]
part 1, v2

+    nsCAutoString typeAndParams;
+    CopyUTF16toUTF8(type, typeAndParams);

maybe:
  NS_ConvertUTF16toUTF8 typeAndParams(type);
? just a random thought.
biesi: that actually makes more code (in my build, 73 bytes more of code), and
it always uses a heap allocation.  We want an auto-string here because the type
is always short enough to fit in one.

/be
wow, I'd have expected less code... ok, in that case your code is better. but as
for your heap allocation comment,
http://lxr.mozilla.org/seamonkey/source/xpcom/string/public/nsString.h#141
(hm... didn't know this class was implemented inline)
biesi: oops, I was looking at nsStringAPI.h:

xpcom/string/public/nsStringAPI.h:class NS_ConvertUTF16toUTF8 : public nsCString

How's that work with the nsString.h class def, again?

/be
this is starting to get offtopic... but I think that is the issue of bug 243109.
(doesn't nsStringAPI.h define its own nsCString too?)
biesi and I were chatting on IRC and his knowledge of nsString.h and how things
inline led to the discovery that the code size increase I noted was due to
--enable-debug, which enables -fno-inline.  With just -O2, there is no size
difference between NS_ConvertUTF16toUTF8 and the nsCAutoString/CopyUTF16toUTF8
pair -- so I'm using the one-line solution biesi proposed in the first place.

(More off-topic grumbling: why did the old global function names for NS_Convert*
turn into typenames, which should start with "ns" and no _ ?)

/be
> (More off-topic grumbling: why did the old global function names for NS_Convert*
> turn into typenames, which should start with "ns" and no _ ?)

been like that for ages.  ask scc maybe? ;-)
Target Milestone: mozilla1.8alpha6 → mozilla1.8beta2
In http://developer.mozilla.org/presentations/xtech2005/e4x/ , it says:

  # Create an XML wrapper for a DOM object:
  var doc = new XML(document);
  # Two-way data binding between DOM and E4X data models

In Firefox 1.5 Beta 1, this still dies:

  var d = new XML(document);
  Error: can't convert [object XMLDocument] to XML
Blocks: 311524
I don't think that would be necessary once this bug is fixed. It should be
transparent, e.g.

document.body..div
(In reply to comment #24)
> I don't think that would be necessary once this bug is fixed. It should be
> transparent, e.g.
> 
> document.body..div

No, you need to create an E4X wrapper for the document object, per ECMA-357.

/be
The scriptloader patch for this bug caused bug 318473
Depends on: 318473
*** Bug 321544 has been marked as a duplicate of this bug. ***
Firefox breaks at following 2 cases even though they are valid XML, 
is this same bug? 

==== case 1
doc =
<?xml version="1.0"?>
<!DOCTYPE doc [
<!ATTLIST d id ID #IMPLIED>
]>
<doc>
  <d id="id3">Three</d>
</doc>;

==== case 2

str='\
<?xml version="1.0"?>\n\
<!DOCTYPE doc [\n\
<!ATTLIST d id ID #IMPLIED>\n\
]>\n\
<doc>\n\
  <d id="id3">Three</d>\n\
</doc>';

doc = new XML(str);
No, that's a different bug (if it's a bug at all, which I don't think it is).
(In reply to comment #29)
> No, that's a different bug 

created bug 321564
For E4x<->DOM 
there should be a way to handle &nbsp; present in DOM other wise we are going to miss content/layout when do E4x<->DOM 
What's wrong with using &#A0; ?
(In reply to comment #32)
> What's wrong with using &#A0; ?
Tanx.. &#xA0; will do
> No, you need to create an E4X wrapper for the document object, per ECMA-357.

devmo's "New in JavaScript 1.6" page ( http://developer.mozilla.org/en/docs/New_in_JavaScript_1.6 ) says "We're going to continue to improve our E4X support, including adding transparent integration with the existing DOM."  Is that not correct?

(In reply to comment #34)
> > No, you need to create an E4X wrapper for the document object, per ECMA-357.
> 
> devmo's "New in JavaScript 1.6" page (
> http://developer.mozilla.org/en/docs/New_in_JavaScript_1.6 ) says "We're going
> to continue to improve our E4X support, including adding transparent
> integration with the existing DOM."  Is that not correct?

I don't know who wrote that, but JS1.6 is done, and this bug is not fixed in it.

/be
(In reply to comment #35)
> (In reply to comment #34)
> > > No, you need to create an E4X wrapper for the document object, per ECMA-357.
> > 
> > devmo's "New in JavaScript 1.6" page (
> > http://developer.mozilla.org/en/docs/New_in_JavaScript_1.6 ) says "We're going
> > to continue to improve our E4X support, including adding transparent
> > integration with the existing DOM."  Is that not correct?
> 
> I don't know who wrote that, but JS1.6 is done, and this bug is not fixed in
> it.

Deb wrote it, based on a conversation with you and me, and it doesn't say that the bug is fixed in JS1.6.  The full sentence is:

"We're going to continue to improve our E4X support, including adding transparent integration with the existing DOM, but developers who are building XML-based web applications can benefit from E4X support in Firefox 1.5."

So it'll get better in the future, but you can already take advantage of E4X in Fx1.5/JS1.6.

But the question at hand is: will an author need to explicitly wrap DOM objects with |new XML|, or will the support be transparent?
Not sure how to make things transparent without breaking backward compat: given DOM object d, would you make E4X methods available on d via its prototype chain? That would break backward compat.

/be
(In reply to comment #37)
> Not sure how to make things transparent without breaking backward compat: given
> DOM object d, would you make E4X methods available on d via its prototype
> chain? That would break backward compact.

Till we have a solution, cant we have an xml property for a DOMNode which exposes it as an E4X object, and can be manipulated to change the DOM object. 
ie, like 
1. document.mozXML ;
2. document.mozXML.head.title = "a new title text";
3. document.body.mozXML.p = <p>New body content</p>;
4. document.getElementById("sub_txt").mozXML.@value = "Hello world";

QA Contact: pschwartau → general
I would recommend at least supporting the E4X wrapping mechanism as specified in ECMA-357 first. That gives the E4X developers a painless way to manipulate the DOM without complicating backwards compatibility. Once the ECMA recommendation is sated, then support for the transparent DOM access will be that much easier, and that much less necessary in the E4X developer adoption path.

Per spec you can also rest assured that XML(document) won't break transitory scripts once document gains a native XML interface.
Rhino guy dropping in, as I'm grappling with this issue on that side.  Some of these semantics simply can't be transparent (at least not completely), due to underlying differences in the object models.  For example, a DOM node can only be present at one place in the tree, but an E4X node can be copied into the same tree multiple times.  See, for example, the discussion surrounding bug 312692.  (Rhino also has semi-related bug 369606.)

I'm open to suggestions on how to model this on the Rhino side.  Maybe if I mull a little more on this side I'll make suggestions of my own.  (Off the top of my head, my first thought would be to revisit the decisions we made in bug 312692 and maybe decide to make the semantics match.)
To expand on my previous comment ... what should the output of the following test case be?  (Forgive my pidgin DOM, I am not an expert here, just trying to illustrate.  If I've made any errors feel free to point them out; just trying to think this through here.)

Assume content of document is <foo><bar/></foo>.  Assume appropriate definition of print() from enclosing shell.
* * *
//  Create object that should remain synchronized with DOM object; see ECMA 357 //  sections 13.4.1 and 10.3.2 
var e4x = XML(document);
var baz = <baz/>;
e4x.bar.appendChild(baz);
e4x.appendChild(baz);
baz.@attribute = "value";
print(e4x..baz.length());
print(e4x..baz.(@attribute == "value").length());
print(document.getElementsByTagName("baz").length);
var matching = 0;
for (var i=0; i<document.getElementsByTagName("baz").length; i++) {
  if (document.getElementsByTagName("baz")[i].getAttribute("attribute") == "value")
    matching++;
}
print(matching);
* * *
There are a couple of possibilities here. 

The most obvious is to synchronize the DOM and E4X object models and override the decision we made in bug 312692.  This would make the previous test print:
1
1
1
1

The next most obvious (more painful) one is to have each E4X object potentially be a reference to many DOM nodes.  Then a change to a particular E4X object would be propagated to all DOM nodes to which it pertains.  Then, the previous test would print:

2
2
2
2

The next solution I can think of would have an E4X object have the ability to point to the last DOM object to which it pertained (similar to how parent() is supported in E4X currently; see bug 312692).  Then, I think, we'd see:

2
1 (? -- assuming E4X just wrapped DOM and didn't maintain its own data model)
2
1

Alas, these are not the only three solutions I can think of, just the ones that seem to have the most precedent to date, and I have droned on enough here.

I vote for them in the order presented.  Thoughts?
I can report that for the needs of my projects any of the solution vectors mentioned would suffice. Solution #2 would be the sweetest, and compliance with be's ideas in bug 312692 would be quite elegant.. but I can also appreciate the implementational concerns. I would rather have it available tomorrow then ideal half past never. 

The problem appears to be a philosophical difference between e4x and dom. If dom supported the notion of one dom object existing in multiple parts of the same tree, we would just be glomming off of that notion. 

Another solution #2 roadblock I see is this: 

/* document is a dom version of <foo><bar /></foo> */
var e4xA = new XML(document);
var e4xB = new XML(document);
var baz = <baz>;

e4xA.appendChild(baz);
e4xB.bar.appendChild(baz);
e4xA.baz.@x = 'y';
print(e4xB.bar.baz);

Since we are referencing e4x objects by default, solution 2 demands that wrapping two such objects around a common dom should share references.

I have considered generating a private singleton facade for a dom document that gets wrapped. This facade would be accessed by e4x objects that wish to wrap the dom or it's children. The facade would keep track of which dom nodes really represent the same object from e4x's perspective. But then the falldown is that direct manipulation of the dom would kill the facade's effectiveness.

So, I cannot suggest anything useful on my own besides siding with option #1. anyone else?

- - Jesse
I'm not going to have time for this for the foreseeable future. I'm interested in a better E4X in light of JS2/ES4's type system, but that can be developed first via other means than this bug.

/be
Assignee: brendan → general
Status: ASSIGNED → NEW
(In reply to comment #41)
> The next most obvious (more painful) one is to have each E4X object potentially
> be a reference to many DOM nodes.  Then a change to a particular E4X object
> would be propagated to all DOM nodes to which it pertains.  Then, the previous
> test would print:
> 
> 2
> 2
> 2
> 2
> 

What will happen if I change the DOM nodes then? For example:

document.getElementsByTagName("baz")[0].setAttribute("attribute", "another value")

baz.@attribute == ? ("value" or "another value")
document.getElementsByTagName("baz")[1].getAttribute("attribute") == ? ("value" or "another value")
Agreed that this is a problem.  There's an unambiguous answer to the second one -- "value" since that DOM node has not been changed.  In the proposed solution #2, the E4X node would be a facade for all the attached DOM nodes but the DOM would be the model that would take precedence.

However for the baz.@attribute case I can't come up with a simple rule that makes any sense.  Hence my preference for option 1. :)
Bah. E4X is dead.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.