Last Comment Bug 441876 - remove UTF-7 from browser encoding menus
: remove UTF-7 from browser encoding menus
Status: RESOLVED FIXED
[sg:want P3]
: verified1.8.1.17, verified1.9.0.2
Product: Core
Classification: Components
Component: Internationalization (show other bugs)
: Trunk
: All All
: -- normal (vote)
: mozilla1.9.1a1
Assigned To: :Gavin Sharp [email: gavin@gavinsharp.com]
:
Mentors:
http://lists.w3.org/Archives/Public/i...
Depends on:
Blocks: xss
  Show dependency treegraph
 
Reported: 2008-06-25 12:13 PDT by Mike Shaver (:shaver -- probably not reading bugmail closely)
Modified: 2008-08-29 16:55 PDT (History)
12 users (show)
mbeltzner: blocking1.9.0.1-
shaver: wanted1.9.0.x+
samuel.sidler+old: blocking1.8.1.17+
samuel.sidler+old: wanted1.8.1.x+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
patch (1001 bytes, patch)
2008-06-25 12:43 PDT, :Gavin Sharp [email: gavin@gavinsharp.com]
no flags Details | Diff | Review
slightly more involved patch... (9.36 KB, patch)
2008-06-25 20:48 PDT, :Gavin Sharp [email: gavin@gavinsharp.com]
smontagu: review+
bzbarsky: superreview+
Details | Diff | Review
updated to comments (11.98 KB, patch)
2008-07-03 16:26 PDT, :Gavin Sharp [email: gavin@gavinsharp.com]
bzbarsky: superreview+
Details | Diff | Review
1.9 patch (13.00 KB, patch)
2008-07-05 20:17 PDT, :Gavin Sharp [email: gavin@gavinsharp.com]
samuel.sidler+old: approval1.9.0.2+
Details | Diff | Review
1.9 l10n patch (89.96 KB, patch)
2008-07-05 20:18 PDT, :Gavin Sharp [email: gavin@gavinsharp.com]
l10n: review+
samuel.sidler+old: approval1.9.0.2+
Details | Diff | Review
1.8 patch (13.09 KB, patch)
2008-08-18 16:13 PDT, :Gavin Sharp [email: gavin@gavinsharp.com]
dveditz: approval1.8.1.17+
Details | Diff | Review
1.8 l10n patch (82.35 KB, patch)
2008-08-18 16:13 PDT, :Gavin Sharp [email: gavin@gavinsharp.com]
dveditz: approval1.8.1.17+
Details | Diff | Review

Description Mike Shaver (:shaver -- probably not reading bugmail closely) 2008-06-25 12:13:04 PDT
UTF-7 is a security hazard, and is not used on the web by legitimate content.  We should remove it from the list of available options, such that attacks like that from bug 408457 are not as easy to target at our users.

httpbis will recommend that browsers never guess UTF-7 (http://www3.tools.ietf.org/wg/httpbis/trac/ticket/20), but I would go farther and say that we shouldn't expose it as an overridable option.

Setting .notForBrowser in charset.properties works except for people who have already selected UTF-7 at some point, in which case it's probably in their .cache pref and will show up.  I think that's enough of an edge case that it's not worth forcing a reset or changing the name of the pref we use.

Would like this in 3.0.1.
Comment 1 :Gavin Sharp [email: gavin@gavinsharp.com] 2008-06-25 12:43:27 PDT
Created attachment 326767 [details] [diff] [review]
patch

This works for me - both InitMoreSubmenus and InitMoreMenu avoid adding items marked notForBrowser. InitCacheMenu doesn't take into account "notForBrowser" (perhaps it should?), but you'd need to have previously selected UTF-7 for it to show up there, so I agree with shaver that we probably don't need to worry about this.
Comment 2 Simon Montagu :smontagu 2008-06-25 14:43:29 PDT
The patch doesn't work for me in seamonkey. I think you should remove UTF-7 from http://mxr.mozilla.org/seamonkey/source/toolkit/locales/en-US/chrome/global/intl.properties#32 and http://mxr.mozilla.org/seamonkey/source/suite/locales/en-US/chrome/browser/navigator.properties#36 also.

(In reply to comment #0)
> Setting .notForBrowser in charset.properties works except for people who have
> already selected UTF-7 at some point, in which case it's probably in their
> .cache pref and will show up.

What about people who have not selected UTF-7 manually, but have visited a UTF-7 page, perhaps by clicking on an evil link? Will UTF-7 appear among the cached encodings in that scenario? How hard would it be to make InitCacheMenu take "notForBrowser" into account?
Comment 3 Mike Shaver (:shaver -- probably not reading bugmail closely) 2008-06-25 18:17:09 PDT
That's worth testing, but my earlier reading of the code made me believe that we didn't populate the cache from visited pages, only user overrides.  (It would be much less useful if it were populated from every page that had a charset declared.)

If you find otherwise, we should fix that problem in a separate bug.  I'd be OK with teaching InitCacheMenu that trick, but I don't think it's necessary.
Comment 4 Paul Szabo 2008-06-25 18:45:26 PDT
About the "evil of UTF-7" and how it should never be "guessed"
(or used even?), see also:
http://www.securityfocus.com/bid/29112/references
particularly
http://www.securityfocus.com/archive/1/492094
Comment 5 :Gavin Sharp [email: gavin@gavinsharp.com] 2008-06-25 20:48:44 PDT
Created attachment 326836 [details] [diff] [review]
slightly more involved patch...

OK, I looked into this a bit more... this code is a mess :(

InitMoreMenu (the method that ends up filling the "More Encodings" itself) does filter out notForBrowser entries (by calling RemoveFlaggedCharsets), but InitMoreSubmenus (the method that fills up the "More Encodings", submenus including "Unicode") doesn't do any filtering, and just fills things in based on the intl.charsetmenu.browser.more* and intl.charsetmenu.browser.unicode prefs. That probably explains why the previous patch doesn't work for you - removes one of the "UTF-7", but not the other. I must have had that pref change still in when I tested.

So we need to edit the pref (and SeaMonkey uses it's own hardcoded values here as far as I can tell):

http://mxr.mozilla.org/seamonkey/source/suite/locales/en-US/chrome/browser/navigator.properties#35

(And file a bug on porting this fix to SeaMonkey, and perhaps have them start using the toolkit file rather using their own prefs: http://mxr.mozilla.org/seamonkey/source/toolkit/locales/en-US/chrome/global/intl.properties#32 )

None of this covers the cache menu, though. For that, had InitBrowserMenu and RefreshBrowserMenu (methods that populate the cache for "browser" charset menus) call RemoveFlaggedCharsets before InitCacheMenu, to match InitMoreMenu. Pretty sure this code is using strings inefficiently, and all that array copying can't be very good, but I'd rather not get into refactoring of this code. Also needed to have SetCurrentCharset refuse to add "notForBrowser" charsets.

Finally, since the UTF-7 menuitem won't show up in the menu at all, even when you're on a page that uses it, I had to update the code that updates the checkbox to have it always remove the old checkbox, regardless of whether we're checking a new checkbox. This means that the menu won't have any checked item on UTF-7 pages, but I don't think that's a problem.
Comment 6 :Gavin Sharp [email: gavin@gavinsharp.com] 2008-06-25 20:50:58 PDT
(In reply to comment #3)
> That's worth testing, but my earlier reading of the code made me believe that
> we didn't populate the cache from visited pages, only user overrides.  (It
> would be much less useful if it were populated from every page that had a
> charset declared.)

Turns out that that is the case... charsetLoadListener() in browser.js calls nsCharsetMenu::SetCurrentCharset, which updates the cache.

> If you find otherwise, we should fix that problem in a separate bug.  I'd be
> OK with teaching InitCacheMenu that trick, but I don't think it's necessary.

Didn't see your comment before patching... don't really mind splitting it off if we want to minimize changes.
Comment 7 Boris Zbarsky [:bz] (Out June 25-July 6) 2008-06-27 22:43:08 PDT
Comment on attachment 326836 [details] [diff] [review]
slightly more involved patch...

>+++ b/browser/base/content/browser.js
...
>+++ b/toolkit/content/charsetOverlay.js

Is there a bug on this code duplication?  If not, can it be filed, please?

>+++ b/xpfe/components/intl/nsCharsetMenu.cpp
>+  nsAutoString prop; prop.AssignWithConversion(".notForBrowser");
>+  res = RemoveFlaggedCharsets(decs, &prop);

Given what RemoveFlaggedCharsets actually does with its second argument, I think that second argument should be a |const nsString&|.  Then you can pass in an NS_LITERAL STRING in your two new callers and not have to mess around with charset conversions, etc.

sr=bzbarsky with those nits addressed
Comment 8 Mike Beltzner [:beltzner, not reading bugmail] 2008-06-30 21:44:06 PDT
This missed 1.9.0.1, but if the patch is ready, it should be nominated with rationale for 1.9.0.2
Comment 9 :Gavin Sharp [email: gavin@gavinsharp.com] 2008-07-03 16:26:45 PDT
Created attachment 328061 [details] [diff] [review]
updated to comments

bz, can you just take a quick look over the string changes in nsCharsetMenu.cpp?
Comment 10 :Gavin Sharp [email: gavin@gavinsharp.com] 2008-07-03 16:28:46 PDT
Filed bug 443514 on the code duplication.
Comment 11 Boris Zbarsky [:bz] (Out June 25-July 6) 2008-07-03 20:57:56 PDT
Comment on attachment 328061 [details] [diff] [review]
updated to comments

That looks great.
Comment 12 Paul Szabo 2008-07-04 19:54:07 PDT
The way I understand things... (please check and correct if needed).

We drop UTF-7 from menus where that has not been used, do not drop for
pages that are UTF-7 (e.g. because its headers said so). Then this does
not protect bug 408457: the attacker could start with an UTF-7 page.
Comment 13 :Gavin Sharp [email: gavin@gavinsharp.com] 2008-07-05 12:20:23 PDT
(In reply to comment #12)
> We drop UTF-7 from menus where that has not been used, do not drop for
> pages that are UTF-7 (e.g. because its headers said so).

No, we drop the UTF-7 option from all menus, all the time. If the page specifies UTF-7, we'll still use it, but nothing will be checked in the menu.
Comment 14 Jesse Ruderman 2008-07-05 12:57:48 PDT
Does that mean that if you switch from UTF-7 to something else, you'd be unable to return to UTF-7 even though the page specifies UTF-7?
Comment 15 :Gavin Sharp [email: gavin@gavinsharp.com] 2008-07-05 13:32:01 PDT
Without reloading the page, yes.
Comment 16 Paul Szabo 2008-07-05 14:46:41 PDT
Thanks for the explanation. Now I understand that
> ... nothing will be checked in the menu
means "present", not "ticked".

But then, comment #5 (and comment #13) says:
> ... the menu won't have any checked item
> on UTF-7 pages, but I don't think that's a problem.
thus comment #14 is moot. (I do not think that is a problem.)
Comment 17 :Gavin Sharp [email: gavin@gavinsharp.com] 2008-07-05 18:58:35 PDT
Pushed to m-c:
http://hg.mozilla.org/mozilla-central/index.cgi/rev/e64a979ed64c

I just realized I'm going to have to update all of the localized intl.properties, since UTF-7 is included in all of those as well. Seems like some of the strings in that file shouldn't be in a localized file.
Comment 18 Paul Szabo 2008-07-05 19:45:52 PDT
In bug 406777 comment 1, similar problems are mentioned with other encodings.
Did the removal of UTF-7 achieve anything, in the sense of helping bug 408457 ? (Jesse's example of "UTF-16 U+203C encoded as 0x203C" seems harmless enough
as most sites would filter out the 0x3C.)

See bug 406777 comment 16 also.
Comment 19 :Gavin Sharp [email: gavin@gavinsharp.com] 2008-07-05 20:17:48 PDT
Created attachment 328241 [details] [diff] [review]
1.9 patch
Comment 20 :Gavin Sharp [email: gavin@gavinsharp.com] 2008-07-05 20:18:17 PDT
Created attachment 328242 [details] [diff] [review]
1.9 l10n patch
Comment 21 :Gavin Sharp [email: gavin@gavinsharp.com] 2008-07-05 20:24:56 PDT
(In reply to comment #18)
> In bug 406777 comment 1, similar problems are mentioned with other encodings.
> Did the removal of UTF-7 achieve anything, in the sense of helping bug 408457 ?

It removes the most well-known attack vector. If you're aware of other encodings that should be given the same treatment feel free to file new bugs. As you know there are other bugs filed on addressing these problems in a more general way.
Comment 22 Paul Szabo 2008-07-06 00:52:45 PDT
Does this bug remove UTF-7 from
View, CharacterEncoding, CustomizeList
also? 
Comment 23 Jesse Ruderman 2008-07-06 01:14:36 PDT
> (Jesse's example of "UTF-16 U+203C encoded as 0x203C" seems harmless
> enough as most sites would filter out the 0x3C.)

UTF-16 sites wouldn't, which is half of my point.
Comment 24 Paul Szabo 2008-07-06 04:03:05 PDT
> UTF-16 sites wouldn't ...

Are there any? :-)

So, what is the holdup with 408457? If we could put in all the energy into
making this patch, get it reviewed and accepted (though is useless at best,
or drops wanted functionality), why cannot 408457 be "done"? Even BZ had
the time to comment here and not the temerity to raise objections...

Oh well, this is much better than nothing. Going about it the MS way, not
tackling the essence but the "known attack vector", while keeping things
dark and mysterious. (Sorry Gavin, having a bad day.)
Comment 25 Boris Zbarsky [:bz] (Out June 25-July 6) 2008-07-06 07:28:33 PDT
> Are there any? :-)

Care to stop trolling?

> So, what is the holdup with 408457? 

Needs code analysis.  As I've repeatedly said in that bug.

For what it's worth, my total time spent on this bug is under 8 minutes including this comment.  And since this patch doesn't significantly complicate already-complicated code I had no objections to it as an sr (and it had module owner approval already, note).
Comment 26 Boris Zbarsky [:bz] (Out June 25-July 6) 2008-07-06 07:29:37 PDT
Put another way, as far as I can tell neither patch "tackles the essence".  The other may (as in, I may be wrong), but some evidence is needed.
Comment 27 :Gavin Sharp [email: gavin@gavinsharp.com] 2008-07-06 13:25:09 PDT
(In reply to comment #22)
> Does this bug remove UTF-7 from View, CharacterEncoding, CustomizeList also?

Yes.
Comment 28 Axel Hecht [:Pike] 2008-07-16 00:56:11 PDT
Comment on attachment 328242 [details] [diff] [review]
1.9 l10n patch

r=me
Comment 29 Samuel Sidler (old account; do not CC) 2008-07-23 02:20:44 PDT
Comment on attachment 328241 [details] [diff] [review]
1.9 patch

Approved for 1.9.0.2. Please land in CVS. a=ss

(And, any way to add a test for this? Or do we need UI-level tests? ... go go Gristmill.)
Comment 30 Samuel Sidler (old account; do not CC) 2008-07-23 02:21:04 PDT
Comment on attachment 328242 [details] [diff] [review]
1.9 l10n patch

Approved for 1.9.0.2. Please land in CVS. a=ss

(Please be sure to add the fixed1.9.0.2 keyword only after both patches have landed.)
Comment 31 :Gavin Sharp [email: gavin@gavinsharp.com] 2008-07-23 11:10:01 PDT
mozilla/toolkit/content/charsetOverlay.js 	1.13
mozilla/browser/base/content/browser.js 	1.1035
mozilla/intl/uconv/src/charsetData.properties 	1.48
mozilla/xpfe/components/intl/nsCharsetMenu.cpp 	1.48
mozilla/toolkit/locales/en-US/chrome/global/intl.properties 	1.8

Also checked in the l10n patch (l10n bonsai is really slow, though).
Comment 32 Samuel Sidler (old account; do not CC) 2008-08-11 11:28:04 PDT
Gavin, can you please create a patch (en-US and l10n) for the 1.8 branch for this?
Comment 33 :Gavin Sharp [email: gavin@gavinsharp.com] 2008-08-18 01:30:07 PDT
Sorry, I've been away for a while. I'll try to get this done tomorrow.
Comment 34 :Gavin Sharp [email: gavin@gavinsharp.com] 2008-08-18 16:13:19 PDT
Created attachment 334359 [details] [diff] [review]
1.8 patch
Comment 35 :Gavin Sharp [email: gavin@gavinsharp.com] 2008-08-18 16:13:25 PDT
Created attachment 334360 [details] [diff] [review]
1.8 l10n patch
Comment 36 Samuel Sidler (old account; do not CC) 2008-08-18 17:05:26 PDT
Axel, the 1.8 triage team will go through and approve the en-US patch, but can you approve the l10n one?
Comment 37 Axel Hecht [:Pike] 2008-08-18 17:11:38 PDT
I'm fine with carrying over my 1.9.0.x approval over to the 1.8 branch. As the l10n patch depends on the en-US one, I'd rather see that approval go first, but from my point of view, the two get approved as one as soon as en-US goes in.
Comment 38 Daniel Veditz [:dveditz] 2008-08-20 14:55:38 PDT
Comment on attachment 334359 [details] [diff] [review]
1.8 patch

Approved for 1.8.1.17, a=dveditz for release-drivers.
Comment 39 Daniel Veditz [:dveditz] 2008-08-20 14:55:52 PDT
Comment on attachment 334360 [details] [diff] [review]
1.8 l10n patch

Approved for 1.8.1.17, a=dveditz for release-drivers.
Comment 41 juan becerra [:juanb] 2008-08-29 16:55:29 PDT
Verified, UTF-7 no longer appears as an item in the character encoding list in the latest build candidates for 20017/3.0.2 on Win/Mac/Linux on en-US and at least a couple of locales, pl and zh-CN for instance.

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; zh-CN; rv:1.9.0.2) Gecko/2008082909 Firefox/3.0.2

Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.17) Gecko/2008082909 Firefox/2.0.0.17

Note You need to log in before you can comment on or make changes to this bug.