Danish characters æøÆØ not correctly transliterated

RESOLVED WORKSFORME

Status


--
minor
17 years ago
2 years ago

People

(Reporter: bugzilla.mozilla.org-3, Assigned: mcmanus)

Tracking

({intl})

Trunk
Future
x86
Windows NT
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 1 obsolete attachment)

(Reporter)

Description

17 years ago
From Bugzilla Helper:
User-Agent: Mozilla/4.75 [en] (Windows NT 5.0; U)
BuildID:    2002020409

When sending a mail containing characters that are not found in the current
character set, the missing characters are converted into "similar looking"
characters. E.g. Å is converted to A* and É to E'.

However, some characters like æøÆØ are converted into question marks.

Reproducible: Always

Steps to Reproduce:
1. Open mail compose window
2. Choose "Chinese Traditional" charset in View menu (or some other
   charset that does not contain the characters æøÆØ)
3. Enter (or paste) "æ ø å Æ Ø Å" in body of message
4. Send message
5. View the message in the Sent folder

Actual Results:
"æ ø å Æ Ø Å" is converted to "? ? a* ? ? A*"

Expected Results: 
"æ ø å Æ Ø Å" should be converted to "ae o/ a* AE O/ A*"

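The difference between the actual and the expected results can be illustrated outside Mozilla. The following is a minimal Python sketch, not Mozilla's code: it encodes the test string to Big5 (assumed here to stand in for the "Chinese Traditional" charset) once with plain "?" replacement and once with a small fallback table. The table entries are copied from the expected results above, and the handler name "translit-sketch" is made up for this example. Note that Python's built-in replacement turns all six characters into "?", whereas Mozilla already handled å and Å.

```python
import codecs

# Fallbacks copied from the "Expected Results" line of this report.
# This table is illustrative; it is not Mozilla's transliterate.properties.
FALLBACKS = {
    "æ": "ae", "ø": "o/", "å": "a*",
    "Æ": "AE", "Ø": "O/", "Å": "A*",
}

def transliterate_errors(exc):
    """Replace each unencodable character with an ASCII fallback, or '?'."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    replacement = "".join(FALLBACKS.get(ch, "?") for ch in bad)
    # Resume encoding right after the unencodable span.
    return replacement, exc.end

# "translit-sketch" is a hypothetical handler name chosen for this example.
codecs.register_error("translit-sketch", transliterate_errors)

text = "æ ø å Æ Ø Å"
plain = text.encode("big5", errors="replace").decode("big5")
fallback = text.encode("big5", errors="translit-sketch").decode("big5")
print(plain)
print(fallback)
```

The first line printed corresponds to what happens when no fallback exists for a character; the second shows the behaviour this report asks for.
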
Transliteration is based on the file transliterate.properties
<http://lxr.mozilla.org/mozilla/source/intl/unicharutil/tables/transliterate.properties>
generated by gentransliterate.pl
<http://lxr.mozilla.org/seamonkey/source/intl/unicharutil/tools/gentransliterate.pl>.

No transliteration entries are made for characters that do not have a 
decomposition specified in UnicodeData-Latest.txt (the fifth field; the variable 
$dec).
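
The decomposition claim can be checked independently of gentransliterate.pl. This is a small Python sketch that uses the standard unicodedata module (which exposes the same decomposition field), not the Perl script itself. The helper name has_decomposition is made up for this example.

```python
import unicodedata

def has_decomposition(ch):
    """True if the character has a decomposition mapping in the Unicode database."""
    return unicodedata.decomposition(ch) != ""

# Å and É are handled today; æ ø Æ Ø are the characters from this report.
for ch in "ÅÉæøÆØ":
    name = unicodedata.name(ch)
    dec = unicodedata.decomposition(ch)
    print(f"U+{ord(ch):04X} {name}: decomposition={dec!r}")
```

Å and É report a base letter plus a combining mark, while æ, ø, Æ and Ø report an empty decomposition, which matches the characters that end up as question marks.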

However, if I insert something like this at line 473 in gentransliterate.pl, 
entries for æøÆØ and a lot of other characters are made as well.

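# Proposed addition: emit an entry for any character whose name
# contains LATIN and LETTER but not LONG.
if ($cmt =~ /LATIN/ && $cmt =~ /LETTER/ && $cmt !~ /LONG/) {
  $str = FromLatinComment($cmt);
  output($u, $cmt, $udec, $str);
}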

This probably isn't the proper way to fix this, but I think it indicates where
the problem is.
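
To make the idea concrete, here is a rough Python sketch of deriving a fallback from the Unicode character name, in the spirit of the Perl fragment above. It is not a port of FromLatinComment. The function name fallback_from_name and the MARKS table are assumptions made for this example, with the marks chosen to match the A*, E' and O/ forms used in this report.

```python
import unicodedata

# Hypothetical mapping from modifier words in Unicode names to ASCII marks,
# chosen to match the examples in this report (A*, E', O/).
MARKS = {"STROKE": "/", "RING ABOVE": "*", "ACUTE": "'"}

PREFIXES = (
    ("LATIN SMALL LETTER ", True),
    ("LATIN CAPITAL LETTER ", False),
)

def fallback_from_name(ch):
    """Return an ASCII fallback derived from the character name, or None."""
    name = unicodedata.name(ch, "")
    for prefix, lower in PREFIXES:
        if name.startswith(prefix):
            rest = name[len(prefix):]
            break
    else:
        return None  # not a Latin letter
    base, _, modifier = rest.partition(" WITH ")
    # Accept only one or two plain letters (A, O, AE, ...); this also skips
    # multi-word names such as "LONG S".
    if not (base.isascii() and base.isalpha() and len(base) <= 2):
        return None
    if modifier and modifier not in MARKS:
        return None  # unknown modifier: no guess
    letters = base.lower() if lower else base
    return letters + MARKS.get(modifier, "")

print(" ".join(str(fallback_from_name(c)) for c in "æ ø å Æ Ø Å".split()))
```

A name-based rule like this covers characters without a decomposition, but it only guesses for the modifiers listed in the table and returns None for everything else.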

Comment 1

17 years ago
Does Danish have characters which cannot be mapped by ISO-8859-1?
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: Danish characters æøÆØ not correctly transliterated → Danish characters æøÃ?Ã? not correctly transliterated
(Reporter)

Comment 2

17 years ago
No.
Summary: Danish characters æøÃ?Ã? not correctly transliterated → Danish characters æøÆØ not correctly transliterated

Updated

17 years ago
Status: NEW → ASSIGNED
Product: MailNews → Browser
Target Milestone: --- → Future
(Reporter)

Comment 3

17 years ago
Created attachment 68585 [details]
The bug description in HTML format

There seems to be some kind of character set confusion in Bugzilla (or
somewhere else) so that my bug report doesn't look like I entered it.

For clarity this attachment shows the bug description in HTML format.

Updated

17 years ago
Keywords: intl

Comment 4

17 years ago
This also happens to me when I save a draft message with Swedish/Finnish characters

1. Compose a mail with Swedish/Finnish characters åäö in contents
2. save as draft
warning about charset
3. open draft
4. send mail
warning about charset
send as plain text

Result: Swedish/Finnish characters converted to ?

Expected result: Swedish/Finnish characters åäö

Comment 5

17 years ago
Workaround when sending a draft

1. Compose a mail with Swedish/Finnish characters åäö in contents
2. save as draft
3. open draft
4. send mail
warning about charset
send in plain text and HTML
Result: Swedish/Finnish characters ok

Comment 6

10 years ago
I can't reproduce this in latest nightly of Thunderbird. Marking WFM.
Status: ASSIGNED → RESOLVED
Last Resolved: 10 years ago
Resolution: --- → WORKSFORME
(Assignee)

Updated

2 years ago
Assignee: nhottanscp → mcmanus
(Assignee)

Updated

2 years ago
Attachment #8751687 - Attachment is obsolete: true
Attachment #8751687 - Flags: review?(valentin.gosu)