Closed Bug 131388 Opened 23 years ago Closed 23 years ago

EUC-KR/UHC decoders should be combined into one

Tracking

Status:

VERIFIED FIXED

Milestone:

mozilla1.0

People

(Reporter: isaachh, Assigned: jshin1987)

Details

(Keywords: intl, Whiteboard: done)

Attachments

(7 files, 5 obsolete files)

Testcase HTML 23 years ago Isaac Hwak Han 436 bytes, text/html
Testcase Result of Mozilla 23 years ago Isaac Hwak Han 5.70 KB, image/png
Testcase Result of IE 23 years ago Isaac Hwak Han 12.07 KB, image/png
test HTML file 23 years ago Yuying Long 591 bytes, text/html
testcase : to demonstrate 'self-recovering ability' of mozilla 23 years ago Jungshik Shin 529 bytes, text/html; charset=euc-kr
a patch to combine EUC-KR decoder and CP949 decoder 23 years ago Jungshik Shin 4.55 KB, patch
patch to combine Big5 and Big5-HKSCS decoders into one 23 years ago Jungshik Shin 2.71 KB, patch
a cleaned-up patch to combine EUC-KR decoder and CP949 decoder 23 years ago Jungshik Shin 5.49 KB, patch
another patch with a diff. approach but the same effect 23 years ago Jungshik Shin 5.49 KB, patch
bas. same as the prev. one with CP949 correction 23 years ago Jungshik Shin 6.01 KB, patch (tetsuroy: review+)
New testcase mail message 23 years ago Isaac Hwak Han 2.02 KB, application/octet-stream
bas. the same patch as before but preempting a potential sr comment 23 years ago Jungshik Shin 6.05 KB, patch (alecf: superreview+, scc: approval+)

Isaac Hwak Han

Reporter

Description

23 years ago

Microsoft Windows can display Korean characters outside EUC-KR (such as U+C0FE) using its Unicode-encoded fonts (e.g. the default Korean Windows font "Gulim"). Mozilla, however, garbles such a character when it encounters one in a Web page. By "garble", I mean that the character is displayed as several unrecognizable symbol characters. Strictly speaking, it is invalid to use a non-EUC-KR character in a page with EUC-KR encoding, but many Korean web pages and mail messages contain spelling typos whose correct form readers should be able to deduce.
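To illustrate the distinction drawn above, here is a minimal Python sketch, not the Mozilla patch itself. It assumes Python's built-in `cp949` codec as a stand-in for a combined EUC-KR/UHC decoder; the helper names `is_euc_kr_pair`, `describe`, and `decode_euc_kr_labeled` are hypothetical. It checks whether the two bytes of a Hangul syllable fall inside the EUC-KR range (0xA1-0xFE for both bytes) or in the UHC (CP949) extension area.

```python
# Sketch only: classify the CP949 bytes of a character by byte range.
# In EUC-KR, both bytes of a two-byte character lie in 0xA1-0xFE.
# UHC (CP949) adds Hangul syllables whose bytes fall outside that range.

def is_euc_kr_pair(pair: bytes) -> bool:
    """Return True if this is a two-byte sequence in the EUC-KR range."""
    return len(pair) == 2 and all(0xA1 <= b <= 0xFE for b in pair)


def describe(ch: str) -> str:
    """Hypothetical helper: encode one character as CP949 and label it."""
    raw = ch.encode("cp949")
    kind = "EUC-KR range" if is_euc_kr_pair(raw) else "UHC extension"
    return f"U+{ord(ch):04X} -> {raw.hex()} ({kind})"


def decode_euc_kr_labeled(data: bytes) -> str:
    """Decode bytes labelled EUC-KR with the CP949 superset decoder,
    so that UHC-only syllables survive (assumption: this mirrors the
    idea of one combined decoder, not Mozilla's actual code)."""
    return data.decode("cp949")


if __name__ == "__main__":
    # U+C0F7 is within EUC-KR; U+C0FE is the character from this report.
    for ch in ("\uC0F7", "\uC0FE"):
        print(describe(ch))
```

Decoding with the superset codec means a page labelled EUC-KR still renders U+C0FE when the authoring tool actually wrote UHC bytes, which is the situation the report describes.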

Rui Xu

Updated

23 years ago

Keywords: intl

QA Contact: ruixu → ylong

Yuying Long

Comment 1

23 years ago

Is there any test case? thanks!

Isaac Hwak Han

Reporter

Comment 2

23 years ago

Attached file Testcase HTML

Isaac Hwak Han

Reporter

Comment 3

23 years ago

Attached image Testcase Result of Mozilla

Mozilla Build ID: 2002031203 Win2K Professional

Isaac Hwak Han

Reporter

Comment 4

23 years ago

Attached image Testcase Result of IE

Microsoft IE v5.5 Win2K Professional

Yuying Long

Comment 5

23 years ago

Running the reporter's test case, I got the same result as above. Confirming. However, this morning I copied and pasted this Unicode character into Netscape from: http://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=C0FE&useutf8=false The character glyph was a little different from the original one, but not garbled. I also copied and pasted the string from IE into Netscape and got the same result.

Status: UNCONFIRMED → NEW

Ever confirmed: true

Yuying Long

Comment 6

23 years ago

Attached file test HTML file

The steps I used to create this HTML file:
1. Saved the reporter's test case as a local file and opened it in the browser; got the same result as the reporter.
2. Loaded the reporter's test case in IE.
3. The page displayed fine, so I copied the problematic character/string from IE and pasted it over the original in Netscape.
4. Saved the file and opened it in the browser again.
Result: the character glyph is a little different, but not garbled.

Yuying Long

Comment 7

23 years ago

Where did you create your document or test case? My test case does not display correctly in IE. It seems that files created in different applications give different results in IE and Netscape.

Isaac Hwak Han

Reporter

Comment 8

23 years ago

I created the testcase HTML in Notepad on a PC running Win2K Prof. (Korean localized version). The "encoding" save option was ANSI, the default. Your attachment (id=74843) showed no garbling, but it showed a different character (U+C0F7) instead of the intended one (U+C0FE). Also, opening it in Notepad revealed that the three jamo components of U+C0FE are not conjoined into a single Korean character.
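For readers unfamiliar with the "three jamo components" mentioned here, the following Python sketch shows how a precomposed Hangul syllable breaks down into its leading consonant, vowel, and trailing consonant using the arithmetic from the Unicode Standard. The function name `decompose_syllable` is hypothetical, and the comment does not say which jamo code points the attachment actually contained (conjoining or compatibility jamo), so this only illustrates the relationship between one syllable and three jamo.

```python
import unicodedata

# Constants from the Unicode Standard's Hangul syllable arithmetic.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
V_COUNT, T_COUNT, S_COUNT = 21, 28, 11172


def decompose_syllable(ch: str) -> str:
    """Hypothetical helper: split one precomposed Hangul syllable
    into its conjoining jamo (lead, vowel, optional trail)."""
    index = ord(ch) - S_BASE
    if not 0 <= index < S_COUNT:
        raise ValueError("not a precomposed Hangul syllable")
    lead = L_BASE + index // (V_COUNT * T_COUNT)
    vowel = V_BASE + (index % (V_COUNT * T_COUNT)) // T_COUNT
    trail = T_BASE + index % T_COUNT
    jamo = chr(lead) + chr(vowel)
    if trail != T_BASE:  # T_BASE itself means "no trailing consonant"
        jamo += chr(trail)
    return jamo


if __name__ == "__main__":
    for ch in ("\uC0F7", "\uC0FE"):
        parts = decompose_syllable(ch)
        print(f"U+{ord(ch):04X}:", " ".join(f"U+{ord(j):04X}" for j in parts))
        # NFC recomposes the three conjoining jamo into one syllable.
        assert unicodedata.normalize("NFC", parts) == ch
```

U+C0F7 and U+C0FE share the same leading consonant and vowel and differ only in the trailing consonant, which fits the observation that one was shown in place of the other.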