Closed Bug 505991 Opened 16 years ago Closed 14 years ago

Location bar should not decode precent encoded overlong UTF-8 sequence

Categories

(Firefox :: Address Bar, defect)

defect
Not set
normal

Tracking

()

VERIFIED FIXED
Firefox 8
Tracking Status
firefox8 --- fixed
firefox9 --- fixed
firefox10 --- fixed
status1.9.2 --- wontfix
blocking1.9.1 --- -
status1.9.1 --- wanted

People

(Reporter: masa141421356, Assigned: masa141421356)

References

()

Details

(Keywords: verified-aurora, verified-beta, Whiteboard: [sg:low] [qa!])

Attachments

(2 files, 1 obsolete file)

If URL contains percent encoded invalid UTF-8 byte sequence, location bar converts it as U+FFFD using function losslessDecodeURI(aURI) in "browser.js". But U+FFFD means decoding failure. If decoded result contains U+FFFD, we should use original URL string, not decoded result. It can be used to spoof URL. For example: Steps to reproduce 1. Set "ja-JP" to browser's default accetable language 2. Go to "http://www.google.co.jp/search?&q=%E0%80%A2" 3. It will show serach result of "烙「". ("烙「" is result of decoding byte sequence [E0][80][A2] by Shift_JIS) 4.Copy URL displayed in Location Bar 5.Paste it to other text editor Expected result Pasted text is "http://www.google.co.jp/search?&q=%E0%80%A2" Actual Result: Pasted result is "http://www.google.co.jp/search?&q=%EF%BF%BD" (%EF%BF%BD is percent encoded UTF-8 representation of U+FFFD)
Attachment #390221 - Flags: review?(gavin.sharp)
This will need a test too.
Flags: in-testsuite?
blocking1.9.1: --- → ?
Flags: blocking1.9.0.13?
How is this a spoofing security problem? Your description sounds more like copy/paste dataloss, I don't see how anyone is fooled about anything.
Whiteboard: [sg:needinfo]
If we block on this for 1.9.0.13, we'll block on it for 1.9.1.3, but we really need an answer to comment 3 first.
status1.9.1: --- → ?
Flags: wanted1.9.0.x?
(In reply to comment #3) > How is this a spoofing security problem? Your description sounds more like > copy/paste dataloss, I don't see how anyone is fooled about anything. No, it is not only copy/paste problem. Is it example. This is data loss of decodeURI(). Almostly, decodeURI throws exception when string does not seems to be UTF-8. But, if byte sequence matches some malformed multibyte representation (It is not prohibited at RFC 2279, but prohibited at RFC 3929), decodeURI() decodes it as U+FFFD, not throwing exception. list of malformed otetects Char. number range | UTF-8 octet sequence (hexadecimal) | (binary) --------------------+--------------------------------------------- 0000 0000-0000 007F | 1100000x 10xxxxxx 0000 0000-0000 07FF | 11100000 100xxxxx 10xxxxxx 0000 0800-0000 FFFF | 1110xxxx 10xxxxxx 10xxxxxx 0000 0000-0000 FFFF | 11110000 10xxxxxx 10xxxxxx 10xxxxxx This problem can also break History, Autocomplete, Text editing on location bar. And, At Trunk, It might be better to fix decodeURI() of JavaScript to throw exeception.
This problem depends on How decodeURI() handle overlong UTF-8 sequences. When broken UTF-8 sequence other than overlong sequences, decodeURI() throws exception. But, if broken UTF-8 sequence is overlong sequence, it is decoded as U+FFFD. (See also bug 172699.) This problem breaks autocomplete of locationbar because autocomplete uses decoded URL, not original URL. after visiting http://www.google.co.jp/search?&q=%E0%80%A2 , when typing "http://www.google.co.jp/search?&q=%E0%80", autocomplete shows "http://www.google.co.jp/search?&q=%E0%80%A2", but, completed URL is decoded URL. It is "http://www.google.co.jp/search?&q=%EF%BF%BD" So, attacker can spoof URL using autocomplete.
"spoofing" seems too strong a term if you're limited to uFFFD characters. We'll approve a reviewed patch on the branch to fix the dataloss but not blocking.
Flags: wanted1.9.0.x?
Flags: wanted1.9.0.x+
Flags: blocking1.9.0.13?
Flags: blocking1.9.0.13-
Comment on attachment 390221 [details] [diff] [review] patch rev.1.0 for Trunk comment should be: // Presence of the replacement character means the decoding failed Looks OK to me, but lets see what Dao has to say.
Attachment #390221 - Flags: review?(gavin.sharp) → review?(dao)
blocking1.9.1: ? → -
Attachment #390221 - Flags: review?(dao) → review+
Comment on attachment 390221 [details] [diff] [review] patch rev.1.0 for Trunk >+ //Exissance of replacement character, means failure of decode. Please rewrite as suggested by Gavin (including the space after //). And if you think decodeURI should throw instead of using the replacement character, please file a bug on that and add the bug number to this code comment. >+ if (value.indexOf("\ufffd")>=0) { >+ value = aURI.spec >+ } Remove { and }, add a space before and after >=, remove a space before "value = aURI.spec" and append a semicolon.
Attached patch fixed patch (obsolete) — Splinter Review
Comment on attachment 390630 [details] [diff] [review] fixed patch >+ if (value.indexOf("\ufffd")>=0) spaces around >= are still missing
Attachment #390630 - Attachment is obsolete: true
Hate feeling like I'm in an echo chamber, but test *please*? This won't make it on branch without one I bet...
Summary: URL spoofing using percent encoded illegal UTF-8 byte sequence → Location bar should not decode precent encoded overlong UTF-8 sequence
Can this land?
Assignee: nobody → masa141421356
Some overlong UTF-8 sequence is not converted to U+FFFD. It is also needed to fix bug 511859.
Depends on: 511859
As result of fix of bug 511859, left problems are only two patterns (%EF%BF%BE = U+FFFE, %EF%BF%BF = U+FFFF). At current implementation, decodeURI decodes them to U+FFFD. If decodeURI() decodes them as U+FFFE and U+FFFF, patch is not needed. (See also :bug 520095)
Depends on: 520095
> > If decodeURI() decodes them as U+FFFE and U+FFFF, patch is not needed. > (See also :bug 520095) Oops, What I want to say is, If decodeURI() decodes them as U+FFFE and U+FFFF, current patch will be obsolete. But,I don't know how location bar will show U+FFFE and U+FFFF. If showing U+FFFE and U+FFFF causes another problem , new patch will be needed.
Now all cases are fixed by bug 511859 and bug 520095. -->FIXED.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
This bug (or rather 520095) does not appear critical enough to take the chance of regression in 3.6. status1.9.2->wontfix
Group: core-security
Whiteboard: [sg:needinfo] → [sg:low]
Target Milestone: --- → Firefox 8
Group: core-security
Group: core-security
The issue is no longer reproducible on: Mozilla/5.0 (Windows NT 5.1; rv:11.0a1) Gecko/20111113 Firefox/11.0a1 Mozilla/5.0 (Windows NT 5.1; rv:10.0a2) Gecko/20111113 Firefox/10.0a2 Mozilla/5.0 (Windows NT 5.1; rv:9.0) Gecko/20100101 Firefox/9.0 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:9.0) Gecko/20100101 Firefox/9.0 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:10.0a2) Gecko/20111113 Firefox/10.0a2 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:11.0a1) Gecko/20111113 Firefox/11.0a1 Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:9.0) Gecko/20100101 Firefox/9.0 Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:10.0a2) Gecko/20111113 Firefox/10.0a2 Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:11.0a1) Gecko/20111113 Firefox/11.0a1 Mozilla/5.0 (X11; Linux x86_64; rv:11.0a1) Gecko/20111113 Firefox/11.0a1 Mozilla/5.0 (X11; Linux x86_64; rv:10.0a2) Gecko/20111114 Firefox/10.0a2 Mozilla/5.0 (X11; Linux x86_64; rv:9.0) Gecko/20100101 Firefox/9.0 IMO this is verified fixed.
Marking VERIFIED per previous comment.
Status: RESOLVED → VERIFIED
Whiteboard: [sg:low] → [sg:low] [qa!]
As result of bug 511859, bug 660612, all known problem of invalid UTF-8 is fixed. I agree to mark this bug as VERIFIED.
Is this the right behavior - bug 750266 ?
(In reply to Paul Silaghi [QA] from comment #23) > Is this the right behavior - bug 750266 ? I think , bug 750266 is Copy & Paste problem. Target of this bug is INVALID UTF-8 sequence, not valid UTF-8 sequence. URL written in bug 750266 contains only valid UTF-8 sequence.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: