Closed Bug 116882 Opened 23 years ago Closed 23 years ago

A middle dot character is not displayed on this page

Categories (Core :: Internationalization, defect, P3)

Product: Core
Component: Internationalization
Platform: x86, Windows 2000
Type: defect
Priority: P3
Severity: normal

Tracking

Status: VERIFIED FIXED
Milestone: mozilla0.9.9

People (Reporter: momoi, Assigned: ftang)

References (URL)

Details (Keywords: intl)

Attachments (2 files, 1 obsolete file)

This is an image which points to the problem character. Compare this to NN4 or IE 5/6 (23 years ago, Katsuhiko Momoi, 111.67 KB, image/jpeg)
patch v1 (obsolete; 23 years ago, Frank Tang, 4.08 KB, patch; shanjian: review+)
patch v2 (23 years ago, Frank Tang, 4.07 KB, patch; nhottanscp: review+, kinmoz: superreview+, roc: approval+)

Katsuhiko Momoi

Reporter

Description

•

23 years ago

** Observed with 2001-12-22 Win32 trunk build **

On the above page, there is one character which is not displayed properly with Mozilla under Shift_JIS encoding. It looks like the character has the codepoint 0x81. (There is a similar bug filed, Bug 116880, but in that bug the codepoint for the problem character is 0x86 0xA6.)

Neither NN4 nor IE 5.5 has a problem displaying this character.

Teruko Kobayashi

Updated

•

23 years ago

Keywords: intl, nsbeta1

Comment 1

•

23 years ago

Over to Mr. Li.

Assignee: yokoyama → shanjian

Comment 2

•

23 years ago

The character in question is 0x81, which is followed by 0x20. 0x8120 is not a legal SJIS byte sequence. It is very strange to see that both IE and Netscape 4.x replace such a sequence with 0x8145, which is the middle dot. But anyway, I don't think this is a mozilla problem. I believe mozilla's behavior is better than both IE and Netscape4.x. Why replace illegal byte sequence to 0x8145? (I tried another byte sequence, 0x8136, which was also replaced by 0x8145.)
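The legality argument in this comment can be sketched in code. This is only an illustration, not Mozilla's converter or charset detector: the function names are hypothetical, and the ranges are the commonly documented Shift_JIS ones (lead bytes 0x81-0x9F and 0xE0-0xFC, trail bytes 0x40-0x7E and 0x80-0xFC). Counting 0xF0-0xFC as lead bytes is an assumption that follows the vendor extensions.

```cpp
#include <cassert>
#include <cstdint>

// Lead bytes of two-byte Shift_JIS characters.
// Assumption: 0xF0-0xFC (vendor/user-defined area) is accepted as well.
bool isSjisLeadByte(std::uint8_t b) {
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Trail bytes of two-byte Shift_JIS characters (0x7F is excluded).
bool isSjisTrailByte(std::uint8_t b) {
    return (b >= 0x40 && b <= 0x7E) || (b >= 0x80 && b <= 0xFC);
}

// True when the pair is structurally legal. This says nothing about
// whether a character is actually defined at that position.
bool isLegalSjisPair(std::uint8_t lead, std::uint8_t trail) {
    return isSjisLeadByte(lead) && isSjisTrailByte(trail);
}

// The byte pairs mentioned in this bug, checked against the ranges above.
void checkExamplesFromThisBug() {
    assert(!isLegalSjisPair(0x81, 0x20));  // the sequence on the reported page
    assert(!isLegalSjisPair(0x81, 0x36));  // the second sequence that was tried
    assert(isLegalSjisPair(0x81, 0x45));   // middle dot, U+30FB
    assert(isLegalSjisPair(0x86, 0xA6));   // bug 116880: legal structure only
}
```

Both 0x20 and 0x36 fall below the lowest trail byte 0x40, which is why 0x8120 and 0x8136 are rejected here while 0x8145 is accepted.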

Status: NEW → RESOLVED

Closed: 23 years ago

Resolution: --- → WORKSFORME

Assignee

Comment 3

•

23 years ago

Sorry, I cannot tell which character you refer to.

Katsuhiko Momoi

Reporter

Comment 4

•

23 years ago

> I believe mozilla's behavior is better than both IE
> and Netscape4.x. Why replace illegal byte sequence to 0x8145?

When Windows applications use the Windows OS converters, they map this codepoint to the middle dot character. I am sorry, but this is expected on Windows. The character is apparently fairly widely used, right or wrong. If you use Notepad, Word, and other Windows applications, you see the same character, not the "not found" character that we show in Mozilla. How are we going to convince Windows users that what they see in every other application is wrong?

Let me re-open this for re-consideration and let me provide additional facts.

ftang: If you want to see which character we are referring to, just open the URL with Mozilla and compare it with NN4 or IE5/6. You will see one character shown as a question mark in Mozilla but as a middle-dot character in other browsers and applications.

Status: RESOLVED → REOPENED

Resolution: WORKSFORME → ---

Comment 5

•

23 years ago

Kat, I am not convinced yet. Is this kind of practice common? Did user do this intentionally? I mean, when they put 0x81, is what they want the mid dot? If MS just took 0x81 and mapped it to the mid dot, that would be easy to understand as a "feature". But mapping a range of code points to one character does not make much sense. Can you tell me how such page is created?

Assignee

Comment 6

•

23 years ago

momoi, please attach a screenshot here (and circle the character with a mark). I cannot see that ? mark.

Katsuhiko Momoi

Reporter

Comment 7

•

23 years ago

In comment 5, shanjian said:
> Is this kind of practice common? Did user do this intentionally? ...
> Can you tell me how such page is created?

Yes, this is the question we should be asking before we decide on this bug. Let me dig around a bit more before making a decision one way or the other. I suspect this is an intentional character.

Katsuhiko Momoi

Reporter

Comment 8

•

23 years ago

Attached image: This is an image which points to the problem character. Compare this to NN4 or IE 5/6

Assignee

Comment 9

•

23 years ago

Let's try to fix the SJIS to Unicode conversion to map 0x8120 to U+30fb so we have backward compatibility? Reassign back to ftang and mark it as M1.0.

Assignee: shanjian → ftang

Status: REOPENED → NEW

Target Milestone: --- → mozilla1.0

Comment 10

•

23 years ago

As I mentioned in my previous comment, at least 0x8120 and 0x8136 are mapped to U+30fb. I believe all byte sequences from 0x8120 to 0x813f are mapped to U+30fb, and probably an even larger range. Adding such a nonsense conversion just for this page does not make any sense, unless momoi's investigation shows that this is a common practice and many web pages are doing it. In our charset detector, 0x8120 to 0x813f are illegal byte sequences. That may confuse some users when they switch the detector on and off.

Comment 11

•

23 years ago

nsbeta1+ per i18n triage

Keywords: nsbeta1 → nsbeta1+

Assignee

Comment 12

•

23 years ago

let's fix this.

Status: NEW → ASSIGNED

Assignee

Comment 13

•

23 years ago

p3

Priority: -- → P3

Assignee

Comment 14

•

23 years ago

move to m0.9.9

Target Milestone: mozilla1.0 → mozilla0.9.9

Assignee

Comment 15

•

23 years ago

Let's merge this bug into 116882. Basically, we want to be compatible with IE6 on error handling to reduce the risk of site compatibility problems. What I found by looking at IE6 is the following:

a. IE6 treats 0xfd - 0xff as single bytes and converts them into U+f8f1 - U+f8f3. We currently treat them as two-byte characters and convert to U+fffd.
b. If a lead byte is in the legal Shift_JIS range but the second byte is in an illegal range, IE6 treats it as a two-byte character and converts it to U+30fb. We currently treat it as a single-byte character and convert it to U+fffd.
c. For valid Shift_JIS, if a character has no definition, IE6 maps it to U+30fb but we map it to U+fffd.

We need to fix all three of the above so we have IE6 parity in error handling.

Also, I wrote a CGI which generates legal Shift_JIS according to the Nadin book, and also invalid Shift_JIS. I posted it at http://warp/u/ftang/utf8test/sjis.cgi and will try to push it out to http://people.netscape.com/ftang/testscript/sjis/sjis.cgi
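The three rules (a), (b), and (c) above can be sketched as a small decoder. This is a rough illustration of the described IE6-style error handling, not the code in the attached patches. Every name is hypothetical, `lookupSjis` is a two-entry stand-in for the real mapping table, and the handling of a truncated lead byte and of the bytes 0x80 and 0xA0 (emitting U+FFFD) is an assumption that this bug does not discuss.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

constexpr char16_t kUndefined = 0;        // "no mapping" marker used by the stub
constexpr char16_t kMiddleDot = 0x30FB;   // KATAKANA MIDDLE DOT
constexpr char16_t kReplacement = 0xFFFD; // REPLACEMENT CHARACTER

// Hypothetical stand-in for the real Shift_JIS table: knows two characters.
char16_t lookupSjis(std::uint8_t lead, std::uint8_t trail) {
    if (lead == 0x81 && trail == 0x45) return 0x30FB;  // middle dot
    if (lead == 0x82 && trail == 0xA0) return 0x3042;  // hiragana "a"
    return kUndefined;
}

bool isLead(std::uint8_t b) {
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

bool isTrail(std::uint8_t b) {
    return (b >= 0x40 && b <= 0x7E) || (b >= 0x80 && b <= 0xFC);
}

std::u16string decodeWithIe6StyleErrors(const std::uint8_t* src, std::size_t len) {
    assert(src != nullptr || len == 0);
    std::u16string out;
    std::size_t i = 0;
    while (i < len) {
        const std::uint8_t b = src[i];
        if (b < 0x80) {
            // ASCII range passes through unchanged (a simplification).
            out.push_back(static_cast<char16_t>(b));
            i += 1;
        } else if (b >= 0xFD) {
            // Rule (a): 0xFD-0xFF are single bytes mapped to U+F8F1-U+F8F3.
            out.push_back(static_cast<char16_t>(0xF8F1 + (b - 0xFD)));
            i += 1;
        } else if (b >= 0xA1 && b <= 0xDF) {
            // Single-byte half-width katakana: U+FF61-U+FF9F.
            out.push_back(static_cast<char16_t>(0xFF61 + (b - 0xA1)));
            i += 1;
        } else if (isLead(b)) {
            if (i + 1 >= len) {
                // Assumption: a lead byte with no following byte becomes U+FFFD.
                out.push_back(kReplacement);
                i += 1;
            } else {
                const std::uint8_t trail = src[i + 1];
                // Rule (b): an illegal trail byte still consumes two bytes.
                // Rule (c): a legal pair with no definition also yields U+30FB.
                const char16_t ch = isTrail(trail) ? lookupSjis(b, trail) : kUndefined;
                out.push_back(ch != kUndefined ? ch : kMiddleDot);
                i += 2;
            }
        } else {
            // Assumption: remaining bytes (0x80, 0xA0) become U+FFFD.
            out.push_back(kReplacement);
            i += 1;
        }
    }
    return out;
}
```

With this sketch, the bytes 0x81 0x20 from the reported page decode to a single U+30FB, and the 0x20 is consumed as the second byte of the pair, as rule (b) describes.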

Assignee

Comment 16

•

23 years ago

*** Bug 116880 has been marked as a duplicate of this bug. ***

Assignee

Comment 17

•

23 years ago

Attached patch: patch v1 (obsolete)

Assignee

Comment 18

•

23 years ago

add nhotta and shanjian to the list.

Comment 19

•

23 years ago

Comment on attachment 70437 (patch v1)

r=shanjian. (I suggest removing the break in original line 147.)

Attachment #70437 - Flags: review+

Comment 20

•

23 years ago

+ // IE convert fc-ff as single byte and convert to
+ // U+f8f1 to U+f8f3
+ if((0xfd == *src) || (0xfe == *src) || (0xff == *src))
+ {
+ *dest++ = (PRUnichar) 0xf8f1 +
+ (*src - (unsigned char)(0xfd));

Does this mean, mapping like this?
0xfd -> 0xf8f1
0xfe -> 0xf8f2
0xff -> 0xf8f3
But the comment says fc-ff (includes fc).

So is the IE6 behavior to map 0x30fb (the case c) specific to Shift_JIS or the similar behavior for EUC-JP?

Assignee

Comment 21

•

23 years ago

>Does this mean, mapping like this?
>0xfd -> 0xf8f1
>0xfe -> 0xf8f2
>0xff -> 0xf8f3
>But the comment says fc-ff (includes fc).

Good catch, it is fd-ff, not fc. Sorry. I will change the comment.

>So is the IE6 behavior to map 0x30fb (the case c) specific to Shift_JIS or the
>similar behavior for EUC-JP?

Not sure; I need to develop more tests. Let's fix it one by one. Open bug 127275 for the EUC-JP issue.

Assignee

Updated

•

23 years ago

Attachment #70437 - Attachment is obsolete: true

Assignee

Comment 22

•

23 years ago

Attached patch: patch v2

Assignee

Comment 23

•

23 years ago

nhotta or shanjian, please r=

Comment 24

•

23 years ago

Comment on attachment 70941 (patch v2)

r=nhotta

Attachment #70941 - Flags: review+

Assignee

Updated

•

23 years ago

Blocks: 104148

Comment 25

•

23 years ago

Comment on attachment 70941 (patch v2)

sr=kin@netscape.com

Attachment #70941 - Flags: superreview+

Assignee

Updated

•

23 years ago

Blocks: 104060
No longer blocks: 104148

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 26

•

23 years ago

Comment on attachment 70941 (patch v2)

a=roc+moz for 0.9.9

Attachment #70941 - Flags: approval+

Robert O'Callahan (:roc) (email my personal email if necessary)

Updated

•

23 years ago

Keywords: mozilla0.9.9+

Assignee

Comment 27

•

23 years ago

Fixed and checked in.

Status: ASSIGNED → RESOLVED

Closed: 23 years ago → 23 years ago

Resolution: --- → FIXED

Assignee

Updated

•

23 years ago

No longer blocks: 104060

Teruko Kobayashi

Updated

•

23 years ago

Status: RESOLVED → VERIFIED

Teruko Kobayashi

Comment 28

•

23 years ago

Verified as fixed in 0329 Win32 trunk and 0402 0.9.9ec Win32 build.

Simon Montagu :smontagu

Comment 29

•

16 years ago

Tests: http://hg.mozilla.org/mozilla-central/rev/fb086cc13695

Flags: in-testsuite+
