Closed Bug 441876 Opened 16 years ago Closed 16 years ago

remove UTF-7 from browser encoding menus

Categories

(Core :: Internationalization, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla1.9.1a1

People

(Reporter: shaver, Assigned: Gavin)

References

()

Details

(Keywords: verified1.8.1.17, verified1.9.0.2, Whiteboard: [sg:want P3])

Attachments

(5 files, 2 obsolete files)

UTF-7 is a security hazard, and is not used on the web by legitimate content.  We should remove it from the list of available options, such that attacks like that from bug 408457 are not as easy to target at our users.

httpbis will recommend that browsers never guess UTF-7 (http://www3.tools.ietf.org/wg/httpbis/trac/ticket/20), but I would go farther and say that we shouldn't expose it as an overridable option.

Setting .notForBrowser in charset.properties works except for people who have already selected UTF-7 at some point, in which case it's probably in their .cache pref and will show up.  I think that's enough of an edge case that it's not worth forcing a reset or changing the name of the pref we use.

Would like this in 3.0.1.
Flags: wanted1.9.0.x+
Flags: blocking1.9.0.1?
Attached patch patch (obsolete) — Splinter Review
This works for me - both InitMoreSubmenus and InitMoreMenu avoid adding items marked notForBrowser. InitCacheMenu doesn't take into account "notForBrowser" (perhaps it should?), but you'd need to have previously selected UTF-7 for it to show up there, so I agree with shaver that we probably don't need to worry about this.
Assignee: smontagu → gavin.sharp
Status: NEW → ASSIGNED
Attachment #326767 - Flags: superreview?(bzbarsky)
Attachment #326767 - Flags: review?(smontagu)
The patch doesn't work for me in seamonkey. I think you should remove UTF-7 from http://mxr.mozilla.org/seamonkey/source/toolkit/locales/en-US/chrome/global/intl.properties#32 and http://mxr.mozilla.org/seamonkey/source/suite/locales/en-US/chrome/browser/navigator.properties#36 also.

(In reply to comment #0)
> Setting .notForBrowser in charset.properties works except for people who have
> already selected UTF-7 at some point, in which case it's probably in their
> .cache pref and will show up.

What about people who have not selected UTF-7 manually, but have visited a UTF-7 page, perhaps by clicking on an evil link? Will UTF-7 appear among the cached encodings in that scenario? How hard would it be to make InitCacheMenu take "notForBrowser" into account?
That's worth testing, but my earlier reading of the code made me believe that we didn't populate the cache from visited pages, only user overrides.  (It would be much less useful if it were populated from every page that had a charset declared.)

If you find otherwise, we should fix that problem in a separate bug.  I'd be OK with teaching InitCacheMenu that trick, but I don't think it's necessary.
About the "evil of UTF-7" and how it should never be "guessed"
(or used even?), see also:
http://www.securityfocus.com/bid/29112/references
particularly
http://www.securityfocus.com/archive/1/492094
Attached patch slightly more involved patch... (obsolete) — Splinter Review
OK, I looked into this a bit more... this code is a mess :(

InitMoreMenu (the method that ends up filling the "More Encodings" itself) does filter out notForBrowser entries (by calling RemoveFlaggedCharsets), but InitMoreSubmenus (the method that fills up the "More Encodings", submenus including "Unicode") doesn't do any filtering, and just fills things in based on the intl.charsetmenu.browser.more* and intl.charsetmenu.browser.unicode prefs. That probably explains why the previous patch doesn't work for you - removes one of the "UTF-7", but not the other. I must have had that pref change still in when I tested.

So we need to edit the pref (and SeaMonkey uses it's own hardcoded values here as far as I can tell):

http://mxr.mozilla.org/seamonkey/source/suite/locales/en-US/chrome/browser/navigator.properties#35

(And file a bug on porting this fix to SeaMonkey, and perhaps have them start using the toolkit file rather using their own prefs: http://mxr.mozilla.org/seamonkey/source/toolkit/locales/en-US/chrome/global/intl.properties#32 )

None of this covers the cache menu, though. For that, had InitBrowserMenu and RefreshBrowserMenu (methods that populate the cache for "browser" charset menus) call RemoveFlaggedCharsets before InitCacheMenu, to match InitMoreMenu. Pretty sure this code is using strings inefficiently, and all that array copying can't be very good, but I'd rather not get into refactoring of this code. Also needed to have SetCurrentCharset refuse to add "notForBrowser" charsets.

Finally, since the UTF-7 menuitem won't show up in the menu at all, even when you're on a page that uses it, I had to update the code that updates the checkbox to have it always remove the old checkbox, regardless of whether we're checking a new checkbox. This means that the menu won't have any checked item on UTF-7 pages, but I don't think that's a problem.
Attachment #326767 - Attachment is obsolete: true
Attachment #326836 - Flags: superreview?(bzbarsky)
Attachment #326836 - Flags: review?(smontagu)
Attachment #326767 - Flags: superreview?(bzbarsky)
Attachment #326767 - Flags: review?(smontagu)
(In reply to comment #3)
> That's worth testing, but my earlier reading of the code made me believe that
> we didn't populate the cache from visited pages, only user overrides.  (It
> would be much less useful if it were populated from every page that had a
> charset declared.)

Turns out that that is the case... charsetLoadListener() in browser.js calls nsCharsetMenu::SetCurrentCharset, which updates the cache.

> If you find otherwise, we should fix that problem in a separate bug.  I'd be
> OK with teaching InitCacheMenu that trick, but I don't think it's necessary.

Didn't see your comment before patching... don't really mind splitting it off if we want to minimize changes.
Attachment #326836 - Flags: review?(smontagu) → review+
Blocks: xss
Whiteboard: [sg:want P3]
Comment on attachment 326836 [details] [diff] [review]
slightly more involved patch...

>+++ b/browser/base/content/browser.js
...
>+++ b/toolkit/content/charsetOverlay.js

Is there a bug on this code duplication?  If not, can it be filed, please?

>+++ b/xpfe/components/intl/nsCharsetMenu.cpp
>+  nsAutoString prop; prop.AssignWithConversion(".notForBrowser");
>+  res = RemoveFlaggedCharsets(decs, &prop);

Given what RemoveFlaggedCharsets actually does with its second argument, I think that second argument should be a |const nsString&|.  Then you can pass in an NS_LITERAL STRING in your two new callers and not have to mess around with charset conversions, etc.

sr=bzbarsky with those nits addressed
Attachment #326836 - Flags: superreview?(bzbarsky) → superreview+
This missed 1.9.0.1, but if the patch is ready, it should be nominated with rationale for 1.9.0.2
Flags: blocking1.9.0.1? → blocking1.9.0.1-
bz, can you just take a quick look over the string changes in nsCharsetMenu.cpp?
Attachment #326836 - Attachment is obsolete: true
Attachment #328061 - Flags: superreview?(bzbarsky)
Filed bug 443514 on the code duplication.
Comment on attachment 328061 [details] [diff] [review]
updated to comments

That looks great.
Attachment #328061 - Flags: superreview?(bzbarsky) → superreview+
The way I understand things... (please check and correct if needed).

We drop UTF-7 from menus where that has not been used, do not drop for
pages that are UTF-7 (e.g. because its headers said so). Then this does
not protect bug 408457: the attacker could start with an UTF-7 page.
(In reply to comment #12)
> We drop UTF-7 from menus where that has not been used, do not drop for
> pages that are UTF-7 (e.g. because its headers said so).

No, we drop the UTF-7 option from all menus, all the time. If the page specifies UTF-7, we'll still use it, but nothing will be checked in the menu.
Does that mean that if you switch from UTF-7 to something else, you'd be unable to return to UTF-7 even though the page specifies UTF-7?
Without reloading the page, yes.
Thanks for the explanation. Now I understand that
> ... nothing will be checked in the menu
means "present", not "ticked".

But then, comment #5 (and comment #13) says:
> ... the menu won't have any checked item
> on UTF-7 pages, but I don't think that's a problem.
thus comment #14 is moot. (I do not think that is a problem.)
Pushed to m-c:
http://hg.mozilla.org/mozilla-central/index.cgi/rev/e64a979ed64c

I just realized I'm going to have to update all of the localized intl.properties, since UTF-7 is included in all of those as well. Seems like some of the strings in that file shouldn't be in a localized file.
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
In bug 406777 comment 1, similar problems are mentioned with other encodings.
Did the removal of UTF-7 achieve anything, in the sense of helping bug 408457 ? (Jesse's example of "UTF-16 U+203C encoded as 0x203C" seems harmless enough
as most sites would filter out the 0x3C.)

See bug 406777 comment 16 also.
Attached patch 1.9 l10n patchSplinter Review
Attachment #328242 - Flags: review?(l10n)
Attachment #328241 - Attachment description: branch patch → 1.9 patch
Attachment #328242 - Attachment description: l10n patch → 1.9 l10n patch
(In reply to comment #18)
> In bug 406777 comment 1, similar problems are mentioned with other encodings.
> Did the removal of UTF-7 achieve anything, in the sense of helping bug 408457 ?

It removes the most well-known attack vector. If you're aware of other encodings that should be given the same treatment feel free to file new bugs. As you know there are other bugs filed on addressing these problems in a more general way.
Does this bug remove UTF-7 from
View, CharacterEncoding, CustomizeList
also? 
> (Jesse's example of "UTF-16 U+203C encoded as 0x203C" seems harmless
> enough as most sites would filter out the 0x3C.)

UTF-16 sites wouldn't, which is half of my point.
> UTF-16 sites wouldn't ...

Are there any? :-)

So, what is the holdup with 408457? If we could put in all the energy into
making this patch, get it reviewed and accepted (though is useless at best,
or drops wanted functionality), why cannot 408457 be "done"? Even BZ had
the time to comment here and not the temerity to raise objections...

Oh well, this is much better than nothing. Going about it the MS way, not
tackling the essence but the "known attack vector", while keeping things
dark and mysterious. (Sorry Gavin, having a bad day.)
> Are there any? :-)

Care to stop trolling?

> So, what is the holdup with 408457? 

Needs code analysis.  As I've repeatedly said in that bug.

For what it's worth, my total time spent on this bug is under 8 minutes including this comment.  And since this patch doesn't significantly complicate already-complicated code I had no objections to it as an sr (and it had module owner approval already, note).
Put another way, as far as I can tell neither patch "tackles the essence".  The other may (as in, I may be wrong), but some evidence is needed.
(In reply to comment #22)
> Does this bug remove UTF-7 from View, CharacterEncoding, CustomizeList also?

Yes.
Comment on attachment 328242 [details] [diff] [review]
1.9 l10n patch

r=me
Attachment #328242 - Flags: review?(l10n) → review+
Attachment #328241 - Flags: approval1.9.0.2?
Attachment #328242 - Flags: approval1.9.0.2?
Comment on attachment 328241 [details] [diff] [review]
1.9 patch

Approved for 1.9.0.2. Please land in CVS. a=ss

(And, any way to add a test for this? Or do we need UI-level tests? ... go go Gristmill.)
Attachment #328241 - Flags: approval1.9.0.2? → approval1.9.0.2+
Comment on attachment 328242 [details] [diff] [review]
1.9 l10n patch

Approved for 1.9.0.2. Please land in CVS. a=ss

(Please be sure to add the fixed1.9.0.2 keyword only after both patches have landed.)
Attachment #328242 - Flags: approval1.9.0.2? → approval1.9.0.2+
mozilla/toolkit/content/charsetOverlay.js 	1.13
mozilla/browser/base/content/browser.js 	1.1035
mozilla/intl/uconv/src/charsetData.properties 	1.48
mozilla/xpfe/components/intl/nsCharsetMenu.cpp 	1.48
mozilla/toolkit/locales/en-US/chrome/global/intl.properties 	1.8

Also checked in the l10n patch (l10n bonsai is really slow, though).
Keywords: fixed1.9.0.2
Target Milestone: --- → mozilla1.9.1a1
Gavin, can you please create a patch (en-US and l10n) for the 1.8 branch for this?
Flags: wanted1.8.1.x+
Flags: blocking1.8.1.17+
Sorry, I've been away for a while. I'll try to get this done tomorrow.
Attached patch 1.8 patchSplinter Review
Attachment #334359 - Flags: approval1.8.1.17?
Attachment #334360 - Flags: approval1.8.1.17?
Axel, the 1.8 triage team will go through and approve the en-US patch, but can you approve the l10n one?
I'm fine with carrying over my 1.9.0.x approval over to the 1.8 branch. As the l10n patch depends on the en-US one, I'd rather see that approval go first, but from my point of view, the two get approved as one as soon as en-US goes in.
Comment on attachment 334359 [details] [diff] [review]
1.8 patch

Approved for 1.8.1.17, a=dveditz for release-drivers.
Attachment #334359 - Flags: approval1.8.1.17? → approval1.8.1.17+
Comment on attachment 334360 [details] [diff] [review]
1.8 l10n patch

Approved for 1.8.1.17, a=dveditz for release-drivers.
Attachment #334360 - Flags: approval1.8.1.17? → approval1.8.1.17+
Verified, UTF-7 no longer appears as an item in the character encoding list in the latest build candidates for 20017/3.0.2 on Win/Mac/Linux on en-US and at least a couple of locales, pl and zh-CN for instance.

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; zh-CN; rv:1.9.0.2) Gecko/2008082909 Firefox/3.0.2

Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.17) Gecko/2008082909 Firefox/2.0.0.17
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: