Last Comment Bug 336551 - E4X: Implement ability to process new XML("<?xml...?> ...")
: E4X: Implement ability to process new XML("<?xml...?> ...")
Status: RESOLVED WONTFIX
: helpwanted
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: Trunk
: x86 All
: -- enhancement with 14 votes (vote)
: ---
Assigned To: general
:
: Jason Orendorff [:jorendorff]
Mentors:
: 351371 (view as bug list)
Depends on: 375250
Blocks: e4x
  Show dependency treegraph
 
Reported: 2006-05-03 22:21 PDT by Biju
Modified: 2012-08-06 14:52 PDT (History)
39 users (show)
dsicore: blocking1.9.2-
dsicore: wanted1.9.2+
sayrer: blocking1.9.1-
sayrer: wanted1.9.1+
mtschrep: blocking1.9-
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments

Description Biju 2006-05-03 22:21:43 PDT
E4X: Implement ability to process new XML("<?xml...?> ...")

see: bug 321564 comment 9 

I did not see any enhancement request filed
Comment 2 Vladimir Vukicevic [:vlad] [:vladv] 2006-06-19 15:44:19 PDT
This kinda makes E4X not-very-useful for working with XMLHttpRequest responses, unless the app author does a bunch of string replaces on the input.  Considering that the response will already be parsed twice (once by XMLHttpRequest itself), and now we need to introduce yet another string copy, trying to use E4X is getting to be very expensive memory-wise unless you're dealing with very small xml snippets.
Comment 3 Hixie (not reading bugmail) 2006-06-19 15:47:05 PDT
Bug 270553 is the way to make E4X work with XMLHttpRequest, not this. IMHO.
Comment 4 Brendan Eich [:brendan] 2006-06-19 16:33:39 PDT
(In reply to comment #3)
> Bug 270553 is the way to make E4X work with XMLHttpRequest, not this. IMHO.

It's not a matter of opinion.  Sometimes, you need a DOM parsed from the XML returned by an XHR, other times you don't.  If you don't need a DOM, but you wish to use E4X (ok, it's another DOM in many ways) to manipulate the data, then you want this bug fixed, not the E4X <=> DOM bug.

IOW both bidirectional DOM/E4X mapping and just text/xml deserialization into E4X are valid use-cases.  Fixing the latter by patching this bug is easier.  Both bugs should be fixed.

/be
Comment 5 Robert Sayre 2006-09-09 18:05:20 PDT
*** Bug 351371 has been marked as a duplicate of this bug. ***
Comment 6 Brendan Eich [:brendan] 2006-09-11 10:15:11 PDT
Jeff, you want this one?  It'll require avoiding default namespace setting via <parent xmlns="..."> </parent> wrapping, I think.

/be
Comment 7 Jesse Thompson 2007-04-07 12:05:32 PDT
If I am tracking this properly, then you can still use that wrapping tactic, you would just need to exclude the xml declaration from the wrap.
Comment 8 Jiri Kopsa 2007-09-11 23:10:48 PDT
I assume that this bug is filed on Spidermonkey. I've just filed another one on Rhino, which has the same issue.
https://bugzilla.mozilla.org/show_bug.cgi?id=395853
Comment 9 Marc Diethelm 2008-01-06 10:36:17 PST
I don't think this is a trivial bug nor simply an enhancement.
Also in my experience the usability of E4X is mainly compromised when constructing XML for XHR, not when working with the response. For a response you do something like this:

var response = http_request.responseText; // bug 270553
response = response.replace('<?xml version="1.0"?>', ""); // bug 336551
var e4x = new XML(response);

It's unsexy but get your DOM allowing you to easily read the response. Constructing the document is different.
Why even use E4X when the main requirement is string concatenation (be it naive or abstracted) and you have to string concatenate with the PI anyway.

brendan's example (think XML):
var html = <html/>;
html.head.title = "My Page Title";
html.body.@bgcolor = "#e4e4e4";
html.body.form.@name = "myform";
html.body.form.@action = "someurl.jss";
html.body.form.@method = "post";
html.body.form.@onclick = "return somejs();";
html.body.form.input[0] = "";
html.body.form.input[0].@name = "test";

Strings!

Dear Mozillians, I'm requesting blocking 1.9. If Mozilla is to be the locomotive of "Web 2.0" this should be addressed soon. More and more extensions use XHR, E4X lubricates the process and this bug makes it unsexy.

I would request blocking 1.8.x if I dared. (I'd love to re-write my extension to use E4X for requests!)

Thank you for your time!
Comment 10 Mike Schroepfer 2008-01-18 12:29:18 PST
 
> Dear Mozillians, I'm requesting blocking 1.9. If Mozilla is to be the
> locomotive of "Web 2.0" this should be addressed soon. More and more extensions
> use XHR, E4X lubricates the process and this bug makes it unsexy.
> 
> I would request blocking 1.8.x if I dared. (I'd love to re-write my extension
> to use E4X for requests!)
> 
> Thank you for your time!
> 

We'd love to get this fixed but are way past new-feature time for FF3.  There's always a next release!


Comment 11 Ben Bucksch (:BenB) 2008-09-02 10:36:01 PDT
It's next release time now.
This is a very silly bug that pretty much anybody trying to use E4x will trip over and scratch and then shake their head over.

Workaround is easy:
contents = contents.replace(/<\?xml[^>]*\?>/, ""); // Bug 336551 trips over <?xml ... >
but a) you have to know first and b) it's annoying to do this everytime you want to process XML (or to have to create your own wrapper around |new XML()| - in every project you use E4X in).
Comment 12 Mike Moening 2008-10-13 14:21:38 PDT
Simple bug. Can we fix it now?
Comment 13 Chris Evans 2008-10-27 20:16:31 PDT
Be careful when fixing this; it may introduce security issues due to the ability to <script src=""> a remote XHTML document which is parsed as E4X. There are more details at:
http://code.google.com/p/doctype/wiki/ArticleE4XSecurity

If supporting XML prolog syntax, keeping <!DOCTYPE... as an error would be prudent as that will continue to prevent many remote XHTML documents parsing as E4X.

Alternatively, fix this bug hand in hand with one of the other proposed E4X security measures:
- Require remote-domain script to serve with the correct Javascript MIME type, otherwise disable E4X.
- Do not support E4X anonymous inline XML initialization at all for remote-domain script.
Comment 14 Biju 2009-04-05 11:02:05 PDT
also see Bug 321685
Comment 15 chris hofmann 2009-04-23 04:16:20 PDT
sounds like might be late in fx 3.5 cycle to do this considering security things that should be looked at.  should it be moved to 3.next?
Comment 16 Ben Bucksch (:BenB) 2009-04-23 04:23:34 PDT
This is one of *the* two E4X blocking bugs.
E4X is a really cool feature, but this bug is just embarrassing.

Re security, I think Chris, when mentioning the security problem in comment 13, also mentioned good stopgaps for it.
Comment 17 Chris Evans 2009-04-28 10:47:46 PDT
Updating with some recent research on E4X security....

Demo: https://cevans-app.appspot.com/static/e4xtheft.html

This actually steals significant data cross-domain, based on a plausible XML syntax for a webmail XML feed. Here's the XML from within which we steal a bit:

<inbox><mail id="1234"><from>evil@hacker.com</from><subject>{ x = '</subject><body>PWN...</body></mail><mail id="1235"><from>bank@bank.com</from><subject>Super sensitive!</subject><body>New pin: 9976</body></mail><mail id="1236"><subject>' }</subject><body>...ed!!</body></mail></inbox>

As can be seen, the attacker has injected E4X content by simply e-mailing the victim with "interesting" subjects :) This causes the following E4X inline Javascript to evaluate:

{ x = '</subject><body>PWN...</body></mail><mail id="1235"><from>bank@bank.com</from><subject>Super sensitive!</subject><body>New pin: 9976</body></mail><mail id="1236"><subject>' }

Which as you can see pulls some of the XML into an inline assignment of a string to the "x" variable. It works because the repeating XML element structure does not result in mismatches.

A couple of contraints for this to work:
- Newlines will break this attack.
- As per above, an XML prolog will prevent this attack due to an inline XML parse error.
- Any XML emitter that escapes single quotes in XML values will break this attack; but in practice most people don't do this because it is not required to prevent XML injection or any other security threat.
Comment 18 chris hofmann 2009-08-07 19:58:57 PDT
now is a good time to figure out if we are planning to do something for this on the next release.
Comment 19 Biju 2009-08-07 20:40:59 PDT
After seeing concerns in comment #17, I feel it is OK to implement.
   new XML("<?xml ... ?><x><y/></x>");
But not 
   xml = <?xml ... ?><x><y/></x>;
    
or in other words, it is OK to be in the parameter passed to XML() function. but not to be as inline literal. And that solves the important use cases.
Comment 20 Brendan Eich [:brendan] 2009-08-07 21:52:39 PDT
(In reply to comment #17)
> Updating with some recent research on E4X security....
> 
> Demo: https://cevans-app.appspot.com/static/e4xtheft.html

This is defeated in Firefox 3.5:

Error: XML cannot be the whole program
Source File: https://cevans-secure.appspot.com/static/e4xtheft.xml
Line: 2, Column: 31
Source Code:
y>...ed!!</body></mail></inbox> 

See bug 375250, resolved FIXED, on which this bug depends.

/be
Comment 21 Damon Sicore (:damons) 2009-09-17 11:07:17 PDT
If someone attaches a patch, we'll take it.  But, we're not going apply resources to this.  Per bkap.
Comment 22 Brendan Eich [:brendan] 2009-09-17 14:04:40 PDT
From private correspondence with Michael Daumling of Adobe, I learned he extended E4X in Flash to allow the top-level production parsed from the string passed to new XML to be an XMLList's children, implicitly bracketed by <> and </>. He must have set default namespace by a private API (which is fine) instead of following ECMA-357's notorious implicit <parent xmlns="..."> ... </parent> wrapping.

Anyone who wants to patch this should try to match what E4X in Flash/Flex does.

/be
Comment 23 Seth Spitzer 2011-07-21 08:58:15 PDT
On a related note as it might help others:

I have some Javascript in a JS XPCOM component that was doing:

response = response.replace(/^<\?xml\s+version\s*=\s*(["'])[^\1]+\1[^?]*\?>/, ""); // bug 336551

That used to work, but I now get a JavaScript error:

line 701: regular expression too complex

That regex comes from https://developer.mozilla.org/en/e4x

I'm not sure if other people (like add-on developers) are running into this (see http://forums.mozillazine.org/viewtopic.php?f=23&t=1971533)

This might be related to bug #564953

My apologies for not having a regression range.
Comment 25 Chris Leary [:cdleary] (not checking bugmail) 2011-07-21 17:38:06 PDT
Thanks for the additional info, Seth. If the old RegExp still does not work on Mozilla/Firefox trunk, please report it as a new bug, but I think what you're seeing may be a remnant from PCRE, which has since been removed.

In the meantime, marking this bug as WNF. Please file a new bug for contributed E4X enhancements, per comment 21.
Comment 26 Ben Bucksch (:BenB) 2011-07-22 02:06:11 PDT
REOPEN. We don't close bugs as WONTFIX just because you don't want to commit resources, but leave them open for contributors.

This bug is critical for E4X, as pretty much every user of E4X runs into this and has to use the above posted workaround. (I agree this bug is not about the workaround breaking, but the fact was worth noting it here.)
Comment 27 Mike Shaver (:shaver -- probably not reading bugmail closely) 2011-07-22 05:02:08 PDT
I think we should WONTFIX it because every calorie directed towards E4X is a calorie misdirected.  I don't think it's "critical for E4X" because it has never been supported, and there are (perhaps unfortunately) users of E4X.

You could always seek to change the standard to permit this, of course, if you could find enough interest in the world to touch it.
Comment 28 Chris Leary [:cdleary] (not checking bugmail) 2011-07-22 09:26:36 PDT
(In reply to comment #27)

Meh, same argument for calories per languishing resolution. Was just trying to do a little cleanup while I was in here responding. ;-)
Comment 29 Ben Bucksch (:BenB) 2011-07-27 04:20:56 PDT
> there are (perhaps unfortunately) users of E4X.

And *all* of them run into this bug, and use the workaround. (And that workaround reportedly broke now.)
Comment 30 :Benjamin Peterson 2012-08-06 14:52:03 PDT
E4X is being phased out now, so it's definitely won't fix.

Note You need to log in before you can comment on or make changes to this bug.