Closed Bug 182751 Opened 22 years ago Closed 22 years ago

Mozilla accepsts malformed UTF-8 byte sequences

Categories

(Core :: Security, defect)

Product:

Component:

Platform:

x86

Windows 2000

Type:

defect

Priority:

Not set

Severity:

normal

Tracking

()

Status:

RESOLVED FIXED

People

(Reporter: alexeyc2003, Assigned: smontagu)

References

(
URL
)

Details

(Keywords: intl, testcase, topembed+)

Attachments

(2 files, 2 obsolete files)

Patch 22 years ago Simon Montagu :smontagu 10.40 KB, patch		Details \| Diff \| Splinter Review
The same, diff -uw 22 years ago Simon Montagu :smontagu 6.31 KB, patch	smontagu : review-	Details \| Diff \| Splinter Review
New patch 22 years ago Simon Montagu :smontagu 11.78 KB, patch		Details \| Diff \| Splinter Review
The last patch, diff -uw 22 years ago Simon Montagu :smontagu 7.64 KB, patch	ftang : review+ hjtoi-bugzilla : superreview+	Details \| Diff \| Splinter Review

Alexey Chernyak

Reporter

Description

•

22 years ago

OS: Win2K Build: 2002111904 (Trunk) http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt This URL contains UTF-8 decoder capability and stress test. Mozilla seem to fail a maximum overlong sequences test 4.2.3 and all tests in section 5 - illegal code positions. The document also briefly explains security implications of accepting overlong and malformed sequences. You can use this tool for testing wellformedness and round trip of UTF-8: http://www.macchiato.com/unicode/convert.html Also the following bugs seem to be related: Bug 86411 Bug 172701 Bug 174351

Boris Zbarsky [:bzbarsky]

Updated

•

22 years ago

Depends on: 86411, 172701, 174351

Comment 1

•

22 years ago

'Me Too' - Status update. Testing against the UTF-8 stress test with mozilla-1.3a (Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3a) Gecko/20021213), 4.2.3 looks to be 'fixed' (at least, working), but 2.1.3 looks broken, as do 2.3.x and all of the 5.x tests. This is a pretty important security issue and needs to be fixed before the code gets into an enormous installed base (we hope) who never upgrade...

Mitchell Stoltz (not reading bugmail)

Comment 2

•

22 years ago

Over to Simon - we should run these tests and act on the results.

Assignee: mstoltz → smontagu

Mitchell Stoltz (not reading bugmail)

Updated

•

22 years ago

Group: security

Simon Montagu :smontagu

Assignee

Comment 3

•

22 years ago

Accepting.

Status: NEW → ASSIGNED

Depends on: 191483

Simon Montagu :smontagu

Assignee

Comment 4

•

22 years ago

There are two problems: a typo at http://lxr.mozilla.org/seamonkey/source/intl/uconv/src/nsUTF8ToUnicode.cpp#185 // from Unicode 3.1, non-shortest form is illegal if(((2==mBytes) && (mUcs4 < 0x0080)) || ((3==mBytes) && (mUcs4 < 0x0800)) || ((4==mBytes) && (mUcs4 < 0x1000)) || // ^^^^^^ -- should be 0x10000 (5==mBytes) || (6==mBytes)) and because this whole block is in the |else| of |if(mUcs4 >= 0x00010000)| at http://lxr.mozilla.org/seamonkey/source/intl/uconv/src/nsUTF8ToUnicode.cpp#173 some cases are never getting tested.

Simon Montagu :smontagu

Assignee

Comment 5

•

22 years ago

Attached patch Patch (obsolete) — Details — Splinter Review

Fix the problems described in the last comment, add tests for UTF-16 encoded as UTF-8 and other illegal characters, and general cleanup.

Simon Montagu :smontagu

Assignee

Comment 6

•

22 years ago

Attached patch The same, diff -uw (obsolete) — Details — Splinter Review

Simon Montagu :smontagu

Assignee

Updated

•

22 years ago

Attachment #115576 - Flags: review?(ftang)

Simon Montagu :smontagu

Assignee

Updated

•

22 years ago

Keywords: topembed

Charles Rosendahl

Comment 7

•

22 years ago

plussing per edt triage

Keywords: topembed → topembed+

Simon Montagu :smontagu

Assignee

Comment 8

•

22 years ago

Attached patch New patch — Details — Splinter Review

Removed the test for U+FFFE and U+FFFF per bug 172701 comment 6, cleaned up a bit more and expanded comments.

Attachment #115575 - Attachment is obsolete: true

Simon Montagu :smontagu

Assignee

Comment 9

•

22 years ago

Attached patch The last patch, diff -uw — Details — Splinter Review

Attachment #115576 - Attachment is obsolete: true

Simon Montagu :smontagu

Assignee

Comment 10

•

22 years ago

Comment on attachment 115576 [details] [diff] [review] The same, diff -uw transferring review request

Attachment #115576 - Flags: review?(ftang) → review-

Simon Montagu :smontagu

Assignee

Updated

•

22 years ago

Attachment #116491 - Flags: review?(ftang)

Comment 11

•

22 years ago

Comment on attachment 116491 [details] [diff] [review] The last patch, diff -uw looks accroding to the new definitation unicode 3.2

Attachment #116491 - Flags: review?(ftang) → review+

Simon Montagu :smontagu

Assignee

Updated

•

22 years ago

Attachment #116491 - Flags: superreview?(heikki)

Heikki Toivonen (remove -bugzilla when emailing directly)

Comment 12

•

22 years ago

Comment on attachment 116491 [details] [diff] [review] The last patch, diff -uw I don't claim to understand this Unicode checking code, so my sr is purely a syntax/style-based sr. Please test with every testcase you can get your hands on... The only thing I could notice that needs fixing is the indentation, inconsistent space around operators & tab usage. Please replace all tabs in the file with 2 spaces, and make sure indentation still looks ok.

Attachment #116491 - Flags: superreview?(heikki) → superreview+

Simon Montagu :smontagu

Assignee

Comment 13

•

22 years ago

The indentation appears wonky in the diff -uw because it's wonky in the original file, but it looks all OK to me in my locak file and attachment 116490 [details] [diff] [review].

Simon Montagu :smontagu

Assignee

Comment 14

•

22 years ago

I have been testing with http://jazz.mcom.com/users/smontagu/publish/dynamicutf8test.html, which is a version of the URL given here. I'm now looking for some more testcases to run before I check in.

Heikki Toivonen (remove -bugzilla when emailing directly)

Comment 15

•

22 years ago

I can see even from the diff that there are tabs!

Simon Montagu :smontagu

Assignee

Comment 16

•

22 years ago

Also tested by correlating http://www.iur.ut.ee/juridica/utf8-3.htm with the table of legal UTF-8 sequences in http://www.unicode.org/reports/tr28/, and looking at lots of utf-8 pages in different languages to see that they are still displayed OK. I think we are good to go with this.

Simon Montagu :smontagu

Assignee

Comment 17

•

22 years ago

Fix checked in.

Status: ASSIGNED → RESOLVED

Closed: 22 years ago

Resolution: --- → FIXED

Alexey Chernyak

Reporter

Updated

•

21 years ago

No longer depends on: 174351

Alexey Chernyak

Reporter

Updated

•

21 years ago

Blocks: 86411

No longer depends on: 86411

Daniel Veditz [:dveditz]

Comment 18

•

21 years ago

Clearing security flag: fix was checked in months ago and there was never a demonstrated exploit of the potential to abuse over-long UTF8 encodings.

Group: security

You need to log in before you can comment on or make changes to this bug.