Closed
Bug 511859
Opened 15 years ago
Closed 15 years ago
Utf8ToOneUcs4Char in jsstr.cpp ,overlong UTF-8 seqence detection problem.
Categories
(Core :: JavaScript Engine, defect, P1)
Core
JavaScript Engine
Tracking
()
VERIFIED
FIXED
Tracking | Status | |
---|---|---|
status1.9.2 | --- | beta3-fixed |
status1.9.1 | --- | wontfix |
People
(Reporter: masa141421356, Assigned: masa141421356)
References
()
Details
(Whiteboard: [sg:want?] fixed-in-tracemonkey)
Attachments
(4 files, 2 obsolete files)
1.52 KB,
text/html
|
Details | |
5.31 KB,
patch
|
Waldo
:
review+
|
Details | Diff | Splinter Review |
5.05 KB,
text/plain
|
Details | |
888 bytes,
patch
|
Details | Diff | Splinter Review |
see http://mxr.mozilla.org/mozilla-central/source/js/src/jsstr.cpp#5132 > /* from Unicode 3.1, non-shortest form is illegal */ > static const uint32 minucs4Table[] = { > 0x00000080, 0x00000800, 0x0001000, 0x0020000, 0x0400000 > }; It should be as following (three zeros are missing) /* from Unicode 3.1, non-shortest form is illegal */ static const uint32 minucs4Table[] = { 0x00000080, 0x00000800, 0x00010000, 0x00200000, 0x04000000 };
Assignee | ||
Comment 1•15 years ago
|
||
Attachment #395816 -
Attachment is obsolete: true
Comment 2•15 years ago
|
||
For the record, ECMA-262 ed. 3 required that overlong sequences be accepted. ES5 changes to the now-standard position of forbidding overlong sequences.
Whiteboard: [good first bug]
Assignee | ||
Comment 3•15 years ago
|
||
Assignee: general → masa141421356
Status: NEW → ASSIGNED
Assignee | ||
Updated•15 years ago
|
Attachment #395934 -
Flags: review?(jwalden+bmo)
Assignee | ||
Updated•15 years ago
|
OS: Windows XP → All
Hardware: x86 → All
Assignee | ||
Comment 4•15 years ago
|
||
"Rhino" does not have same problem (see also : bug 173180).
Comment 5•15 years ago
|
||
Comment on attachment 395934 [details] [diff] [review] PAtch rev.1.0 >diff --git a/js/src/jsstr.cpp b/js/src/jsstr.cpp >--- a/js/src/jsstr.cpp >+++ b/js/src/jsstr.cpp >@@ -5124,17 +5124,17 @@ js_OneUcs4ToUtf8Char(uint8 *utf8Buffer, > */ > static uint32 > Utf8ToOneUcs4Char(const uint8 *utf8Buffer, int utf8Length) > { > uint32 ucs4Char; > uint32 minucs4Char; > /* from Unicode 3.1, non-shortest form is illegal */ > static const uint32 minucs4Table[] = { >- 0x00000080, 0x00000800, 0x0001000, 0x0020000, 0x0400000 >+ 0x00000080, 0x00000800, 0x00010000, 0x00200000, 0x04000000 > }; > > JS_ASSERT(utf8Length >= 1 && utf8Length <= 6); > if (utf8Length == 1) { > ucs4Char = *utf8Buffer; > JS_ASSERT(!(ucs4Char & 0x80)); > } else { > JS_ASSERT((*utf8Buffer & (0x100 - (1 << (7-utf8Length)))) == This patch is sufficient to fix the problem according to ECMAScript-262 ed. 3, to be sure. However, I'm not sure, at this point, that that's sufficient for an acceptable fix, given that ES5 is very nearly out and requires rejection of overlong sequences, and that the necessary changes to address that are not large in scope. Would you mind creating a patch which addresses those problems? Minimally, I see the length of the array here, the 6 and 7 in the assertions in context, and the 6 in Decode -- I'd appreciate it if you could look and see if there are any others that need addressing. Also, if I could trouble you to do so, adding a test or two to TestUTF8 in js/src/shell/js.cpp that will catch overlong sequences would be appreciated. Thanks!
Comment 6•15 years ago
|
||
Comment on attachment 395934 [details] [diff] [review] PAtch rev.1.0 Removing review request in case the previous comment went unnoticed...
Attachment #395934 -
Flags: review?(jwalden+bmo)
Comment 7•15 years ago
|
||
We often make progress (esp. with contributed patches -- thanks to the reporter indeed for this one) by taking a bird in the hand and doing a followup bug to get the second in the bush. /be
Assignee | ||
Comment 10•15 years ago
|
||
Comment on attachment 395934 [details] [diff] [review] PAtch rev.1.0 I wrote unified patch for this bug and bug 514760 and bug 518347. And now testing.
Attachment #395934 -
Attachment is obsolete: true
Assignee | ||
Comment 11•15 years ago
|
||
This patch contains fix of bug 518347 and bug 514760. Memo: This patch causes to test failure of js1_5/Regress/regress-172699.js but it is by design.
Attachment #403067 -
Flags: review?(jwalden+bmo)
Comment 12•15 years ago
|
||
Masahiro, please fix or disable the test as part of the checkin.
Updated•15 years ago
|
Attachment #403067 -
Flags: review?(jwalden+bmo) → review+
Comment 13•15 years ago
|
||
Comment on attachment 403067 [details] [diff] [review] Patch rev.1.1 >diff --git a/js/src/jsstr.cpp b/js/src/jsstr.cpp >+#define OVERLONG_UTF8 ((uint32)0xFFFFFFFF) For future reference, we're using C++ const values rather than defines these days -- better typing and all that. >- if (ucs4Char < minucs4Char || >- ucs4Char == 0xFFFE || ucs4Char == 0xFFFF) { >+ if (ucs4Char < minucs4Char) { >+ ucs4Char = OVERLONG_UTF8; >+ } else if (ucs4Char == 0xFFFE || ucs4Char == 0xFFFF) { > ucs4Char = 0xFFFD; > } Seems like a good idea to add a JS_UNLIKELY() around the overlong comparison as a helper for compilation without PGO. I'll push this with nits fixed to TM later tonight -- definite fodder for 192 as well, I think. Thanks!
Comment 14•15 years ago
|
||
Hum. So, looking and thinking about the existing behavior not changed by this patch, I don't see the U+FFF(EF)-to-replacement behavior in the spec. Bug 172699 added it when doing the prior-to-this-patch decode-overlong-to-replacement behavior. Should we implement this extra conversion not specified by ES5? It's not clear, certainly not without any motivating help from bug comments. Filed bug 520095 to address this, even if addressing is simply adding a comment next to those lines saying why the deviation from spec exists.
Assignee | ||
Comment 15•15 years ago
|
||
This test fails on unpatched browser, but not fails on patched edition. # This test does not contain test of changes on Ucs4 to UTF-8 convertion. # it is reason why I don't set "review?" flag to this.
Comment 16•15 years ago
|
||
I was going to make these minimal changes to get the test to work again, but yours are much better, touching more of the different ways the patch here would change behavior.
Comment 17•15 years ago
|
||
...actually, on second look, it probably makes sense to add both your test and make the changes I'd made. Did that, with a few formatting tweaks to your test (we surround binary operators with a space on each side, so |expect = foo| rather than |expect=foo|), and pushed to TM (also with appropriate changes to the relevant jstests.list file): http://hg.mozilla.org/tracemonkey/rev/8bb236bece1f Thanks again!
Whiteboard: [good first bug] → fixed-in-tracemonkey
Comment 18•15 years ago
|
||
Accepting overlong UTF-8 sequences is a well-known bug with potential security implications. In the context of JavaScript those implications are likely less serious than in, say, C code, but it still seems important to fix this. If this is to be fixed for 1.9.2 (I think it should given the security considerations), I think we can all agree it must be fixed for the beta.
Flags: blocking1.9.2?
Whiteboard: fixed-in-tracemonkey → [sg:want?] fixed-in-tracemonkey
Updated•15 years ago
|
Flags: blocking1.9.2? → blocking1.9.2+
Updated•15 years ago
|
Priority: -- → P2
Updated•15 years ago
|
Priority: P2 → P1
Comment 20•15 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/8bb236bece1f
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Comment 23•15 years ago
|
||
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/e42c563313a0
status1.9.2:
--- → final-fixed
Updated•13 years ago
|
status1.9.1:
--- → wontfix
You need to log in
before you can comment on or make changes to this bug.
Description
•