Closed Bug 352842 Opened 18 years ago Closed 18 years ago

Bad import of UTF-8 ICS file with non-ASCII characters

Categories

(Calendar :: Import and Export, defect)

defect
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: mattwillis, Unassigned)

References

Details

(Keywords: dataloss, regression, Whiteboard: [patch in hand])

Attachments

(3 files)

iCal.app's Birthdays calendar uses a right curved single quote (’) in the SUMMARY (e.g. "SUMMARY:Matthew Willis’s Birthday"). When this file (which opens properly in BBEdit as UTF-8, no BOM) is imported into Sunbird, the quote becomes mangled and is displayed as "AcAA", where the c is a cent sign, the first A has a tilde above it, and the other two have carets above them. Something's wacky here.
Attached file: ICS from iCal.app
The character in question, in HexEdit, from iCal's ICS: E2 80 99
This matches the info at http://www.fileformat.info/info/unicode/char/2019/index.htm
The character in question, in HexEdit, from Sunbird's ICS: C3 83 C2 A2 C3 82 C2 80 C3 82 C2 99
We're adding C3 83 C2 to the first, and C3 82 C2 to the second and third, and the first byte is becoming A2 rather than E2.
Flags: blocking0.3+
Keywords: dataloss
(In reply to comment #3)
> The character in question, in HexEdit, from iCal's ICS:
> E2 80 99
> The character in question, in HexEdit, from Sunbird's ICS:
> C3 83 C2 A2 C3 82 C2 80 C3 82 C2 99
These bytes are getting misinterpreted as ISO-8859-1 and converted to UTF-8, not once but twice:
ISO-8859-1            UTF-8
-----------------     -----------------------------------
E2 80 99          --> C3 A2 C2 80 C2 99
C3 A2 C2 80 C2 99 --> C3 83 C2 A2 C3 82 C2 80 C3 82 C2 99
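The double conversion described above can be reproduced outside Sunbird. The following is a minimal Python sketch, not the importer's code; the helper names `misread_as_latin1` and `hexdump` are made up for illustration. It treats the UTF-8 bytes of U+2019 as ISO-8859-1 and re-encodes them as UTF-8, twice.

```python
def misread_as_latin1(data: bytes) -> bytes:
    """Wrongly treat each byte as one ISO-8859-1 character, then re-encode as UTF-8."""
    return data.decode("iso-8859-1").encode("utf-8")


def hexdump(data: bytes) -> str:
    """Format bytes the way a hex editor shows them, e.g. 'E2 80 99'."""
    return " ".join(f"{b:02X}" for b in data)


original = "\u2019".encode("utf-8")  # right single quotation mark
once = misread_as_latin1(original)   # first bad conversion
twice = misread_as_latin1(once)      # second bad conversion

print(hexdump(original))  # E2 80 99
print(hexdump(once))      # C3 A2 C2 80 C2 99
print(hexdump(twice))     # C3 83 C2 A2 C3 82 C2 80 C3 82 C2 99
```

The last line matches the twelve bytes found in Sunbird's ICS, which is consistent with the data passing through two Latin-1-to-UTF-8 conversions rather than one.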
Attached patch: use octet arrays
Somehow, using a string as an array of octets didn't really work. This patch makes the importer use only octet arrays until the data has been decoded from UTF-8 into a string.
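The idea behind the patch can be sketched in Python. This is an illustration under assumptions, not the patch itself (the real importer is not written in Python), and both function names are hypothetical. The buggy variant holds the octets in a string, one character per byte, so later code treats them as text; the fixed variant keeps raw octets and decodes exactly once.

```python
def read_ics_buggy(raw: bytes) -> str:
    """Hypothetical buggy path: store each octet as a character in a string."""
    # Every byte becomes a code point U+0000..U+00FF, so multi-byte UTF-8
    # sequences are split into separate, meaningless characters.
    return "".join(chr(b) for b in raw)


def read_ics_fixed(raw: bytes) -> str:
    """Hypothetical fixed path: keep octets as bytes, decode from UTF-8 once."""
    return raw.decode("utf-8")


# UTF-8 bytes of the SUMMARY line from the report (E2 80 99 is the curly quote).
line = b"SUMMARY:Matthew Willis\xe2\x80\x99s Birthday"

print(read_ics_fixed(line))       # SUMMARY:Matthew Willis’s Birthday
print(len(read_ics_buggy(line)))  # 3 characters where there should be 1
```

Once the octets have been turned into characters the wrong way, any later UTF-8 encoding step inflates them further, which is why keeping the data as an octet array until the single decode step avoids the problem.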
Attachment #238908 - Flags: second-review?(dmose)
Attachment #238908 - Flags: first-review?(mattwillis)
Whiteboard: [needs review dmose]
Comment on attachment 238908 (use octet arrays): This fixed it on my Birthdays calendar with the smart quotes. r=lilmatt
Attachment #238908 - Flags: first-review?(mattwillis) → first-review+
This is a regression from bug 315672.
Keywords: regression
Whiteboard: [needs review dmose] → [patch in hand][needs review dmose]
*** Bug 352964 has been marked as a duplicate of this bug. ***
Updating the summary to be more general.
Blocks: UTF-import
Summary: Bad import of UTF-8 ICS file with smart-quotes from iCal.app → Bad import of UTF-8 ICS file with non-ASCII characters
Comment on attachment 238908 (use octet arrays): r=dmose
Attachment #238908 - Flags: second-review?(dmose) → second-review+
Whiteboard: [patch in hand][needs review dmose] → [patch in hand][needs checkin]
Patch checked in.
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Verified with Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a1) Gecko/20060921 Calendar/0.3a2+
Status: RESOLVED → VERIFIED
Whiteboard: [patch in hand][needs checkin] → [patch in hand][litmus testcase wanted]
Litmus testcase 2693 created
Whiteboard: [patch in hand][litmus testcase wanted] → [patch in hand]