Closed Bug 658952 Opened 15 years ago Closed 15 years ago

some korean(2byte character) broken when using euc-kr page encoding - there is some doubt about uscan.c

Categories

(Core :: Internationalization, defect)

x86
All
defect
Not set
major

Tracking

()

RESOLVED FIXED
Tracking Status
firefox5 - ---
firefox6 - fixed

People

(Reporter: oskmkr, Assigned: smontagu)

References

()

Details

(Keywords: regression)

Attachments

(5 files)

User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0) Build Identifier: Mozilla/5.0 (Windows NT 6.1; rv:2.0.1) Gecko/20100101 Firefox/4.0.1 uscan.c fixed below ---- 1. before else if(! in[1] & 0x80) /* 2nd byte range check */ 2. after else if(!(in[1] & 0x80)) /* 2nd byte range check */ but there is not equal because operator priority !! Reproducible: Always Actual Results: there is no broken character in euc-kr pages. o/s : windows 7 (kor) http://cafe.naver.com/oskmcafetest
I could not reproduce the issue on: Mozilla/5.0 (Windows NT 6.1; rv:6.0a1) Gecko/20110522 Firefox/6.0a1 *Note: Reporter, can you be more clear on the steps to reproduce? Thanks.
Relevant Bug 576556
captured reported issue.
(In reply to comment #1) > I could not reproduce the issue on: > Mozilla/5.0 (Windows NT 6.1; rv:6.0a1) Gecko/20110522 Firefox/6.0a1 > > *Note: Reporter, can you be more clear on the steps to reproduce? Thanks. I attached capture image of this issue. I wish that is helpful. Thanks.
Confirmed on http://hg.mozilla.org/releases/mozilla-2.0/rev/08cd5ee7b7c3 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0.2pre) Gecko/20110523 Firefox/4.0.2pre ID:20110523030626 http://hg.mozilla.org/releases/mozilla-aurora/rev/7d4b05121ada Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0.2) Gecko/20110508 Firefox/5.0a2 ID:20110523042001 http://hg.mozilla.org/mozilla-central/rev/11d04991cdd0 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0a1) Gecko/20110523 Firefox/6.0a1 ID:20110523030619 Regresson window(cached m-c hourly): Works: http://hg.mozilla.org/mozilla-central/rev/883651094520 Mozilla/5.0 (Windows; Windows NT 6.1; WOW64; en-US; rv:2.0b2pre) Gecko/20100705 Minefield/4.0b2pre ID:20100705123310 Fails: http://hg.mozilla.org/mozilla-central/rev/53dc58746f3e Mozilla/5.0 (Windows; Windows NT 6.1; WOW64; en-US; rv:2.0b2pre) Gecko/20100705 Minefield/4.0b2pre ID:20100705221203 Pushlog: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=883651094520&tochange=53dc58746f3e In local build(Windows7) build from 362e3ce08ead + backed out b18119895f4a : Works as expected build from 362e3ce08ead : Fails Regressed by: b18119895f4a Daniel Holbert — Bug 576556: Add GCC-suggested parens in uconv.c to fix build warning. r=smontagu
Assignee: nobody → smontagu
Blocks: 576556
Status: UNCONFIRMED → NEW
Component: General → Internationalization
Ever confirmed: true
Keywords: regression
OS: Windows 7 → All
Product: Firefox → Core
QA Contact: general → i18n
Attached file reduced html
Version: unspecified → Trunk
Attachment #534676 - Attachment mime type: text/html → text/html; charset=ksc5601
So, that cset was just: http://hg.mozilla.org/mozilla-central/rev/b18119895f4a which added a single set of parens that I thought GCC was suggesting ("suggest parentheses around operand of '!' or change '&' to '&&' or '!' to '~'") In my initial reading, I'd thought that message meant that the "&" was _part_ of the operand of "!". Hence, I stuck the close-paren after the end of the "&" expression. But now when re-reading that warning, I see that that my initial reading was incorrect (and in fact, given that this cset changed behavior, the & was definitely *not* part of the operand of "!"). So, the close-paren I added just needs shifting back to before the "&".
I'm happy to take this & clean up after myself, but I can't get to it until tomorrowish. (The fix itself is trivial, but I want to make sure we get a reftest working right for this, too.) smontagu (or anyone else), feel free to take this if you get to it first.
Attached file Reference html
There's something odd going on here -- your patch was the correct test according to the comment just above: * and the second byte can take any value with MSB = 1. ... but that is just wrong! The codepoint that is failing in the reduced testcase is 0xA45A. The original condition |if (! in[1] & 0x80)| is equivalent to |if (in[1] == 0)|, so we were basically never rejecting any invalid codepoints (at least in this routine -- they might have failed or been mapped to replacement characters somewhere else). Daniel, thanks for the offer to fix this, but I think I'd better take it and do an exhaustive check that all possible combinations are being correctly handled.
Attached patch PatchSplinter Review
This includes a fix for bug 462687 and a reftest. Todo: xpcshell test with full coverage.
Attachment #534860 - Flags: review?(VYV03354)
Attachment #534860 - Flags: review?(VYV03354) → review+
Release drivers are not going to track this for 5 or 6. If you want to make the case that it should land for either of those, please request approval on the patch with an explanation of the risks and benefits.
Status: NEW → RESOLVED
Closed: 15 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Comment on attachment 534860 [details] [diff] [review] Patch This is a significant regression from 1.9.2: all Korean characters in the windows CP949 extension to EUC-KR (a total of 1942 codepoints) are not displayed correctly. I've tested the patch with all possible double-byte combinations from 0x8001 to 0xFFFF, and there is no change in decoding between 1.9.2 and trunk with this patch, so I'm confident that it's a very safe change.
Attachment #534860 - Flags: approval-mozilla-beta?
Attachment #534860 - Flags: approval-mozilla-aurora?
Comment on attachment 534860 [details] [diff] [review] Patch too late for 5 Beta but we're OK with this going into Aurora for 6.
Attachment #534860 - Flags: approval-mozilla-beta?
Attachment #534860 - Flags: approval-mozilla-beta-
Attachment #534860 - Flags: approval-mozilla-aurora?
Attachment #534860 - Flags: approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: