remove UTF-7 from browser encoding menus

RESOLVED FIXED in mozilla1.9.1a1

Status

()

Core
Internationalization
RESOLVED FIXED
9 years ago
9 years ago

People

(Reporter: shaver, Assigned: Gavin)

Tracking

({verified1.8.1.17, verified1.9.0.2})

Trunk
mozilla1.9.1a1
verified1.8.1.17, verified1.9.0.2
Points:
---
Bug Flags:
blocking1.9.0.1 -
wanted1.9.0.x +
blocking1.8.1.17 +
wanted1.8.1.x +

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [sg:want P3], URL)

Attachments

(5 attachments, 2 obsolete attachments)

UTF-7 is a security hazard, and is not used on the web by legitimate content.  We should remove it from the list of available options, such that attacks like that from bug 408457 are not as easy to target at our users.

httpbis will recommend that browsers never guess UTF-7 (http://www3.tools.ietf.org/wg/httpbis/trac/ticket/20), but I would go farther and say that we shouldn't expose it as an overridable option.

Setting .notForBrowser in charset.properties works except for people who have already selected UTF-7 at some point, in which case it's probably in their .cache pref and will show up.  I think that's enough of an edge case that it's not worth forcing a reset or changing the name of the pref we use.

Would like this in 3.0.1.
Flags: wanted1.9.0.x+
Flags: blocking1.9.0.1?
Created attachment 326767 [details] [diff] [review]
patch

This works for me - both InitMoreSubmenus and InitMoreMenu avoid adding items marked notForBrowser. InitCacheMenu doesn't take into account "notForBrowser" (perhaps it should?), but you'd need to have previously selected UTF-7 for it to show up there, so I agree with shaver that we probably don't need to worry about this.
Assignee: smontagu → gavin.sharp
Status: NEW → ASSIGNED
Attachment #326767 - Flags: superreview?(bzbarsky)
Attachment #326767 - Flags: review?(smontagu)
The patch doesn't work for me in seamonkey. I think you should remove UTF-7 from http://mxr.mozilla.org/seamonkey/source/toolkit/locales/en-US/chrome/global/intl.properties#32 and http://mxr.mozilla.org/seamonkey/source/suite/locales/en-US/chrome/browser/navigator.properties#36 also.

(In reply to comment #0)
> Setting .notForBrowser in charset.properties works except for people who have
> already selected UTF-7 at some point, in which case it's probably in their
> .cache pref and will show up.

What about people who have not selected UTF-7 manually, but have visited a UTF-7 page, perhaps by clicking on an evil link? Will UTF-7 appear among the cached encodings in that scenario? How hard would it be to make InitCacheMenu take "notForBrowser" into account?
That's worth testing, but my earlier reading of the code made me believe that we didn't populate the cache from visited pages, only user overrides.  (It would be much less useful if it were populated from every page that had a charset declared.)

If you find otherwise, we should fix that problem in a separate bug.  I'd be OK with teaching InitCacheMenu that trick, but I don't think it's necessary.

Comment 4

9 years ago
About the "evil of UTF-7" and how it should never be "guessed"
(or used even?), see also:
http://www.securityfocus.com/bid/29112/references
particularly
http://www.securityfocus.com/archive/1/492094
Created attachment 326836 [details] [diff] [review]
slightly more involved patch...

OK, I looked into this a bit more... this code is a mess :(

InitMoreMenu (the method that ends up filling the "More Encodings" itself) does filter out notForBrowser entries (by calling RemoveFlaggedCharsets), but InitMoreSubmenus (the method that fills up the "More Encodings", submenus including "Unicode") doesn't do any filtering, and just fills things in based on the intl.charsetmenu.browser.more* and intl.charsetmenu.browser.unicode prefs. That probably explains why the previous patch doesn't work for you - removes one of the "UTF-7", but not the other. I must have had that pref change still in when I tested.

So we need to edit the pref (and SeaMonkey uses it's own hardcoded values here as far as I can tell):

http://mxr.mozilla.org/seamonkey/source/suite/locales/en-US/chrome/browser/navigator.properties#35

(And file a bug on porting this fix to SeaMonkey, and perhaps have them start using the toolkit file rather using their own prefs: http://mxr.mozilla.org/seamonkey/source/toolkit/locales/en-US/chrome/global/intl.properties#32 )

None of this covers the cache menu, though. For that, had InitBrowserMenu and RefreshBrowserMenu (methods that populate the cache for "browser" charset menus) call RemoveFlaggedCharsets before InitCacheMenu, to match InitMoreMenu. Pretty sure this code is using strings inefficiently, and all that array copying can't be very good, but I'd rather not get into refactoring of this code. Also needed to have SetCurrentCharset refuse to add "notForBrowser" charsets.

Finally, since the UTF-7 menuitem won't show up in the menu at all, even when you're on a page that uses it, I had to update the code that updates the checkbox to have it always remove the old checkbox, regardless of whether we're checking a new checkbox. This means that the menu won't have any checked item on UTF-7 pages, but I don't think that's a problem.
Attachment #326767 - Attachment is obsolete: true
Attachment #326836 - Flags: superreview?(bzbarsky)
Attachment #326836 - Flags: review?(smontagu)
Attachment #326767 - Flags: superreview?(bzbarsky)
Attachment #326767 - Flags: review?(smontagu)
(In reply to comment #3)
> That's worth testing, but my earlier reading of the code made me believe that
> we didn't populate the cache from visited pages, only user overrides.  (It
> would be much less useful if it were populated from every page that had a
> charset declared.)

Turns out that that is the case... charsetLoadListener() in browser.js calls nsCharsetMenu::SetCurrentCharset, which updates the cache.

> If you find otherwise, we should fix that problem in a separate bug.  I'd be
> OK with teaching InitCacheMenu that trick, but I don't think it's necessary.

Didn't see your comment before patching... don't really mind splitting it off if we want to minimize changes.
Attachment #326836 - Flags: review?(smontagu) → review+

Updated

9 years ago
Blocks: 301375
Whiteboard: [sg:want P3]
Comment on attachment 326836 [details] [diff] [review]
slightly more involved patch...

>+++ b/browser/base/content/browser.js
...
>+++ b/toolkit/content/charsetOverlay.js

Is there a bug on this code duplication?  If not, can it be filed, please?

>+++ b/xpfe/components/intl/nsCharsetMenu.cpp
>+  nsAutoString prop; prop.AssignWithConversion(".notForBrowser");
>+  res = RemoveFlaggedCharsets(decs, &prop);

Given what RemoveFlaggedCharsets actually does with its second argument, I think that second argument should be a |const nsString&|.  Then you can pass in an NS_LITERAL STRING in your two new callers and not have to mess around with charset conversions, etc.

sr=bzbarsky with those nits addressed
Attachment #326836 - Flags: superreview?(bzbarsky) → superreview+
This missed 1.9.0.1, but if the patch is ready, it should be nominated with rationale for 1.9.0.2
Flags: blocking1.9.0.1? → blocking1.9.0.1-
Created attachment 328061 [details] [diff] [review]
updated to comments

bz, can you just take a quick look over the string changes in nsCharsetMenu.cpp?
Attachment #326836 - Attachment is obsolete: true
Attachment #328061 - Flags: superreview?(bzbarsky)
Filed bug 443514 on the code duplication.
Comment on attachment 328061 [details] [diff] [review]
updated to comments

That looks great.
Attachment #328061 - Flags: superreview?(bzbarsky) → superreview+

Comment 12

9 years ago
The way I understand things... (please check and correct if needed).

We drop UTF-7 from menus where that has not been used, do not drop for
pages that are UTF-7 (e.g. because its headers said so). Then this does
not protect bug 408457: the attacker could start with an UTF-7 page.
(In reply to comment #12)
> We drop UTF-7 from menus where that has not been used, do not drop for
> pages that are UTF-7 (e.g. because its headers said so).

No, we drop the UTF-7 option from all menus, all the time. If the page specifies UTF-7, we'll still use it, but nothing will be checked in the menu.

Comment 14

9 years ago
Does that mean that if you switch from UTF-7 to something else, you'd be unable to return to UTF-7 even though the page specifies UTF-7?
Without reloading the page, yes.

Comment 16

9 years ago
Thanks for the explanation. Now I understand that
> ... nothing will be checked in the menu
means "present", not "ticked".

But then, comment #5 (and comment #13) says:
> ... the menu won't have any checked item
> on UTF-7 pages, but I don't think that's a problem.
thus comment #14 is moot. (I do not think that is a problem.)
Pushed to m-c:
http://hg.mozilla.org/mozilla-central/index.cgi/rev/e64a979ed64c

I just realized I'm going to have to update all of the localized intl.properties, since UTF-7 is included in all of those as well. Seems like some of the strings in that file shouldn't be in a localized file.
Status: ASSIGNED → RESOLVED
Last Resolved: 9 years ago
Resolution: --- → FIXED

Comment 18

9 years ago
In bug 406777 comment 1, similar problems are mentioned with other encodings.
Did the removal of UTF-7 achieve anything, in the sense of helping bug 408457 ? (Jesse's example of "UTF-16 U+203C encoded as 0x203C" seems harmless enough
as most sites would filter out the 0x3C.)

See bug 406777 comment 16 also.
Created attachment 328241 [details] [diff] [review]
1.9 patch
Created attachment 328242 [details] [diff] [review]
1.9 l10n patch
Attachment #328242 - Flags: review?(l10n)
Attachment #328241 - Attachment description: branch patch → 1.9 patch
Attachment #328242 - Attachment description: l10n patch → 1.9 l10n patch
(In reply to comment #18)
> In bug 406777 comment 1, similar problems are mentioned with other encodings.
> Did the removal of UTF-7 achieve anything, in the sense of helping bug 408457 ?

It removes the most well-known attack vector. If you're aware of other encodings that should be given the same treatment feel free to file new bugs. As you know there are other bugs filed on addressing these problems in a more general way.

Comment 22

9 years ago
Does this bug remove UTF-7 from
View, CharacterEncoding, CustomizeList
also? 

Comment 23

9 years ago
> (Jesse's example of "UTF-16 U+203C encoded as 0x203C" seems harmless
> enough as most sites would filter out the 0x3C.)

UTF-16 sites wouldn't, which is half of my point.

Comment 24

9 years ago
> UTF-16 sites wouldn't ...

Are there any? :-)

So, what is the holdup with 408457? If we could put in all the energy into
making this patch, get it reviewed and accepted (though is useless at best,
or drops wanted functionality), why cannot 408457 be "done"? Even BZ had
the time to comment here and not the temerity to raise objections...

Oh well, this is much better than nothing. Going about it the MS way, not
tackling the essence but the "known attack vector", while keeping things
dark and mysterious. (Sorry Gavin, having a bad day.)
> Are there any? :-)

Care to stop trolling?

> So, what is the holdup with 408457? 

Needs code analysis.  As I've repeatedly said in that bug.

For what it's worth, my total time spent on this bug is under 8 minutes including this comment.  And since this patch doesn't significantly complicate already-complicated code I had no objections to it as an sr (and it had module owner approval already, note).
Put another way, as far as I can tell neither patch "tackles the essence".  The other may (as in, I may be wrong), but some evidence is needed.
(In reply to comment #22)
> Does this bug remove UTF-7 from View, CharacterEncoding, CustomizeList also?

Yes.
Comment on attachment 328242 [details] [diff] [review]
1.9 l10n patch

r=me
Attachment #328242 - Flags: review?(l10n) → review+
Attachment #328241 - Flags: approval1.9.0.2?
Attachment #328242 - Flags: approval1.9.0.2?
Comment on attachment 328241 [details] [diff] [review]
1.9 patch

Approved for 1.9.0.2. Please land in CVS. a=ss

(And, any way to add a test for this? Or do we need UI-level tests? ... go go Gristmill.)
Attachment #328241 - Flags: approval1.9.0.2? → approval1.9.0.2+
Comment on attachment 328242 [details] [diff] [review]
1.9 l10n patch

Approved for 1.9.0.2. Please land in CVS. a=ss

(Please be sure to add the fixed1.9.0.2 keyword only after both patches have landed.)
Attachment #328242 - Flags: approval1.9.0.2? → approval1.9.0.2+
mozilla/toolkit/content/charsetOverlay.js 	1.13
mozilla/browser/base/content/browser.js 	1.1035
mozilla/intl/uconv/src/charsetData.properties 	1.48
mozilla/xpfe/components/intl/nsCharsetMenu.cpp 	1.48
mozilla/toolkit/locales/en-US/chrome/global/intl.properties 	1.8

Also checked in the l10n patch (l10n bonsai is really slow, though).
Keywords: fixed1.9.0.2
Target Milestone: --- → mozilla1.9.1a1
Gavin, can you please create a patch (en-US and l10n) for the 1.8 branch for this?
Flags: wanted1.8.1.x+
Flags: blocking1.8.1.17+
Sorry, I've been away for a while. I'll try to get this done tomorrow.
Created attachment 334359 [details] [diff] [review]
1.8 patch
Attachment #334359 - Flags: approval1.8.1.17?
Created attachment 334360 [details] [diff] [review]
1.8 l10n patch
Attachment #334360 - Flags: approval1.8.1.17?
Axel, the 1.8 triage team will go through and approve the en-US patch, but can you approve the l10n one?
I'm fine with carrying over my 1.9.0.x approval over to the 1.8 branch. As the l10n patch depends on the en-US one, I'd rather see that approval go first, but from my point of view, the two get approved as one as soon as en-US goes in.
Comment on attachment 334359 [details] [diff] [review]
1.8 patch

Approved for 1.8.1.17, a=dveditz for release-drivers.
Attachment #334359 - Flags: approval1.8.1.17? → approval1.8.1.17+
Comment on attachment 334360 [details] [diff] [review]
1.8 l10n patch

Approved for 1.8.1.17, a=dveditz for release-drivers.
Attachment #334360 - Flags: approval1.8.1.17? → approval1.8.1.17+
Checked in both the /l10n and the /mozilla patches on MOZILLA_1_8_BRANCH.

http://bonsai.mozilla.org/cvsquery.cgi?branch=MOZILLA_1_8_BRANCH&branchtype=match&date=explicit&mindate=2008-08-25+12%3A48&maxdate=2008-08-25+12%3A48&cvsroot=%2Fcvsroot
http://bonsai-l10n.mozilla.org/cvsquery.cgi?branch=MOZILLA_1_8_BRANCH&branchtype=match&date=explicit&mindate=2008-08-25+12%3A49&maxdate=2008-08-25+12%3A49&cvsroot=%2Fl10n
Keywords: fixed1.8.1.17
Verified, UTF-7 no longer appears as an item in the character encoding list in the latest build candidates for 20017/3.0.2 on Win/Mac/Linux on en-US and at least a couple of locales, pl and zh-CN for instance.

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; zh-CN; rv:1.9.0.2) Gecko/2008082909 Firefox/3.0.2

Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.17) Gecko/2008082909 Firefox/2.0.0.17
Keywords: fixed1.8.1.17, fixed1.9.0.2 → verified1.8.1.17, verified1.9.0.2
You need to log in before you can comment on or make changes to this bug.