Closed Bug 226569 Opened 21 years ago Closed 19 years ago

Problem with last character whose Unicode ends with FF

Categories

(Calendar :: General, defect)

x86
All
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: Chia-Hung.Cheng, Assigned: mostafah)

References

Details

(Keywords: intl)

Attachments

(7 files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4.1) Gecko/20031008
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4.1) Gecko/20031008

This problem occurs whenever the last Traditional Chinese character of an
entry has a Unicode code point ending in FF. After reopening Mozilla and
Calendar (Quick Launch must be disabled), all non-ASCII characters are
displayed as "?".

Reproducible: Always

Steps to Reproduce:
1. Create a new Event or Task.
2. Type any characters (e.g. ASCII + German umlaut + Traditional Chinese) in
Title/Location/Note, ending with a Traditional Chinese character whose Unicode
code point ends with FF, such as 駿 (99FF) or 仿 (4EFF). Do not add any
punctuation mark (such as !?",. etc.) after this Traditional Chinese character.
Up to this point, all characters are displayed correctly and are written
correctly to the ICS file.
3. Close Calendar and all other Mozilla windows (Quick Launch must be disabled).
4. Reopen Calendar. All non-ASCII characters have been replaced with "?"s. See
the examples below:

Only Traditional Chinese characters: "測試駿" becomes "????????" (but "測試駿,"
is ok)
ASCII characters with a Traditional Chinese character: "test駿" becomes "test??"
(but "test駿123" is ok)
German umlaut with a Traditional Chinese character: "Prüfen 駿" becomes
"Pr??fen ??" (but "Prüfen 駿," is ok)

The non-ASCII characters are still correct in the ICS file, but the Calendar
Event/Task display does not show the correct string.





Expected Results:
After Calendar is reopened, all non-ASCII characters should be displayed as
they were entered.

All affected Traditional Chinese characters that I have verified:
(CJK Unified Ideographs, Range: 4E00–9FAF http://www.unicode.org/charts/ )

4EFF 仿
52FF 勿
56FF 囿
59FF 姿
5EFF 廿
5FFF 忿
62FF 拿
66FF 替
67FF 柿
6EFF 滿
6FFF 濿
71FF 燿
75FF 痿
7AFF 竿
97FF 響
99FF 駿
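
As a quick check on the list above, here is a minimal Python sketch (not part of the original report; the helper name utf8_last_byte is illustrative). It encodes each listed code point as UTF-8 and collects the final byte. Every code point in this range whose low byte is FF ends with the UTF-8 byte 0xBF, which may help narrow down where the string is being damaged.

```python
# Code points copied from the list above; each one ends in 0xFF.
AFFECTED = [
    0x4EFF, 0x52FF, 0x56FF, 0x59FF, 0x5EFF, 0x5FFF, 0x62FF, 0x66FF,
    0x67FF, 0x6EFF, 0x6FFF, 0x71FF, 0x75FF, 0x7AFF, 0x97FF, 0x99FF,
]


def utf8_last_byte(code_point: int) -> int:
    """Return the final byte of the UTF-8 encoding of one code point."""
    return chr(code_point).encode("utf-8")[-1]


# Collect the distinct final bytes across all affected characters.
last_bytes = {utf8_last_byte(cp) for cp in AFFECTED}
print({hex(b) for b in last_bytes})
```

The low six bits of xxFF are all ones, so the last continuation byte is always binary 10111111.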
Attached file: Additional Description
Attached image: After "Reopen Mozilla&Calendar"
The bug appears when you copy an event that you have just entered with an
affected Traditional Chinese character at the end.
This is a serious bug for Chinese users and causes a lot of trouble.  To make
the calendar friendly to East Asian characters, this bug should be fixed as
soon as possible.  I can't believe the status is still UNCONFIRMED even now.
Confirm.

Opening the example ics file in Calendar displays the "姿OK" title correctly
but displays the "Bad姿" as "Bad??".

Opening the example ics as a local file in Firefox displays correctly, so the
character is initially saved correctly, but corrupted on load from file.

A potential function which overwrites characters with '?' is strForceUTF8
http://lxr.mozilla.org/mozilla/source/calendar/libxpical/oeICalEventImpl.cpp#200


It overwrites if IsValidUTF8 returns false
http://lxr.mozilla.org/mozilla/source/calendar/libxpical/oeICalEventImpl.cpp#200


Looking at the file in WordPad (non-UTF-8), it looks like the UTF-8 encoding of
姿 (U+59FF, 'appearance') is 0xE5,0xA7,0xBF.  If so, it should take
the 3-byte branch, correctly skip over the 3-byte character, and find the end.
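
To make the "3-byte branch" reasoning concrete, here is a simplified Python sketch of a lead-byte-driven UTF-8 check. It is an assumption about the general shape of such a validator, not the actual IsValidUTF8 code from oeICalEventImpl.cpp, and both function names are hypothetical. It only checks sequence lengths and continuation bytes; it does not reject overlong forms or surrogates.

```python
def sequence_length(lead: int) -> int:
    """Expected UTF-8 sequence length for a lead byte, or 0 if invalid."""
    if lead < 0x80:
        return 1  # ASCII
    if 0xC2 <= lead <= 0xDF:
        return 2  # 2-byte branch
    if 0xE0 <= lead <= 0xEF:
        return 3  # 3-byte branch (covers 姿, lead byte 0xE5)
    if 0xF0 <= lead <= 0xF4:
        return 4  # 4-byte branch
    return 0      # stray continuation byte or invalid lead byte


def is_structurally_valid_utf8(data: bytes) -> bool:
    """Walk the buffer one sequence at a time (simplified check)."""
    i = 0
    while i < len(data):
        n = sequence_length(data[i])
        # Fail on a bad lead byte or a sequence that runs past the buffer end.
        if n == 0 or i + n > len(data):
            return False
        # Every following byte must look like 10xxxxxx.
        if any((b & 0xC0) != 0x80 for b in data[i + 1:i + n]):
            return False
        i += n
    return True


complete = "Bad姿".encode("utf-8")
print(is_structurally_valid_utf8(complete))       # full buffer
print(is_structurally_valid_utf8(complete[:-1]))  # final byte missing
```

With the full three bytes present the walk reaches the end of the buffer cleanly; a check of this shape only fails on "Bad姿" if the buffer it is handed is incomplete.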

Since it skips over the character correctly when it is not at the end, the
problem may be before it gets to IsValidUTF8, so maybe the buffer does not have
the complete string.  Interestingly, there are only two ?? while there are
three bytes in the utf-8, so maybe the string is one character short.
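
The "one byte short" idea can be tested against the reported symptoms with a small Python sketch. Everything here is an assumption for illustration: force_ascii_if_invalid is a hypothetical stand-in for what strForceUTF8 is described as doing (overwriting with '?' when validation fails), and simulate_missing_last_byte simply drops the final byte of the encoded string. It does not explain why only strings ending in 0xBF would lose a byte, and it is not meant to model the cases reported as ok.

```python
def force_ascii_if_invalid(raw: bytes) -> str:
    """Hypothetical stand-in: if raw is not valid UTF-8, replace every
    non-ASCII byte with '?', otherwise decode it normally."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return "".join(chr(b) if b < 0x80 else "?" for b in raw)


def simulate_missing_last_byte(text: str) -> str:
    """Assume the loader hands over the UTF-8 buffer minus its final byte."""
    raw = text.encode("utf-8")
    return force_ascii_if_invalid(raw[:-1])


for sample in ("test駿", "測試駿", "Prüfen 駿"):
    print(sample, "->", simulate_missing_last_byte(sample))
```

Under these assumptions the failing examples from the report come out as "test??", "????????" and "Pr??fen ??", which matches the observed two question marks for a three-byte character.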

I hope this helps.
status -> NEW
Status: UNCONFIRMED → NEW
Ever confirmed: true
Possibly related (maybe a dupe) is bug 173039.
Keywords: intl
OS: Windows 2000 → All
gee.. |IsValidUTF8| doesn't do the right thing in the first place although that
doesn't seem to be the cause of this bug. It seems like calendar needs some
'i18n love'. I'll build a calendar (sunbird) and see what's going on. 
173039 looks the same.
But in Sunbird 0.2b the problem seems to be resolved, at least with cyrillic
chars. I can't reproduce the bug anymore.
Blocks: 162454
This bug also appears in Greek text, so it's not about Chinese only. Most
likely it affects all non-ASCII Unicode text.
QA Contact: gurganbl → general
The attached example WFM in nightly builds of Sunbird.  If anyone can still reproduce this bug in a current build they should feel free to reopen.
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → WORKSFORME