Closed Bug 123969 Opened 23 years ago Closed 15 years ago

Danish characters æøÆØ not correctly transliterated

Categories

(Core :: Internationalization, defect)

x86
Windows NT
defect
Not set
minor

Tracking

()

RESOLVED WORKSFORME
Future

People

(Reporter: bugzilla.mozilla.org-3, Assigned: mcmanus)

Details

(Keywords: intl)

Attachments

(1 file, 1 obsolete file)

From Bugzilla Helper:
User-Agent: Mozilla/4.75 [en] (Windows NT 5.0; U)
BuildID:    2002020409

When sending a mail containing characters that are not found in the current
character set, the missing characters are converted into "similar looking"
characters. E.g. Å is converted to A* and É to E'.

However, some characters like æøÆØ are converted into question marks.

Reproducible: Always

Steps to Reproduce:
1. Open mail compose window
2. Choose "Chinese Traditional" charset in View menu (or some other
   charset that does not contain the characters æøÆØ)
3. Enter (or paste) "æ ø å Æ Ø Å" in body of message
4. Send message
5. View the message in the Sent folder

Actual Results:
"æ ø å Æ Ø Å" is converted to "? ? a* ? ? A*"

Expected Results: 
"æ ø å Æ Ø Å" should be converted to "ae o/ a* AE O/ A*"
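
The gap between the actual and expected results can be sketched in Python. This is an illustration only, not Mozilla's code: the FALLBACK table and the encode_with_fallback helper are hypothetical, the table holds just the six characters from this report, and Python's "big5" codec is assumed as a stand-in for the "Chinese Traditional" charset.

```python
# Hypothetical fallback table: only the six characters from this report.
FALLBACK = {
    "æ": "ae", "ø": "o/", "å": "a*",
    "Æ": "AE", "Ø": "O/", "Å": "A*",
}


def encode_with_fallback(text, charset):
    """Encode text, transliterating characters the charset cannot represent."""
    out = []
    for ch in text:
        try:
            ch.encode(charset)
            out.append(ch)
        except UnicodeEncodeError:
            # A character with no table entry degrades to "?".
            out.append(FALLBACK.get(ch, "?"))
    return "".join(out).encode(charset)


text = "æ ø å Æ Ø Å"
# Without any fallback entries, every letter is replaced by "?".
print(text.encode("big5", errors="replace"))   # b'? ? ? ? ? ?'
# With entries for all six letters, the expected result is produced.
print(encode_with_fallback(text, "big5"))      # b'ae o/ a* AE O/ A*'
```

Removing the entries for æøÆØ from the hypothetical table reproduces the reported "? ? a* ? ? A*".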

Transliteration is based on the file transliterate.properties
<http://lxr.mozilla.org/mozilla/source/intl/unicharutil/tables/transliterate.properties>
generated by gentransliterate.pl
<http://lxr.mozilla.org/seamonkey/source/intl/unicharutil/tools/gentransliterate.pl>.

No transliteration entries are made for characters that do not have a 
decomposition specified in UnicodeData-Latest.txt (the fifth field; the variable 
$dec).
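
The missing decompositions can be checked independently of the Perl script. The sketch below assumes Python's standard unicodedata module, which exposes the decomposition field of the Unicode Character Database; lacks_decomposition is a hypothetical helper name, and the Unicode version bundled with Python may differ from the UnicodeData-Latest.txt used at the time.

```python
import unicodedata


def lacks_decomposition(ch):
    """Return True when the character has no decomposition mapping."""
    # unicodedata.decomposition returns an empty string in that case.
    return unicodedata.decomposition(ch) == ""


for ch in "ÅÉæøÆØ":
    dec = unicodedata.decomposition(ch) or "(none)"
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {dec}")
```

Å and É decompose into a base letter plus a combining mark, which is why they get A* and E'. The letters æøÆØ have no decomposition, so a generator driven only by that field has nothing to emit for them.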

However, if I insert something like this at line 473 in gentransliterate.pl, 
entries for æøÆØ and a lot of other characters are made as well.

# Also emit entries for Latin letters whose name does not contain LONG.
if ($cmt =~ /LATIN/ && $cmt =~ /LETTER/ && $cmt !~ /LONG/) {
  $str = FromLatinComment($cmt);
  output($u, $cmt, $udec, $str);
}

This probably isn't the proper way to fix this, but I think it indicates where
the problem is.
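
The idea of deriving a fallback from the character name can be sketched as follows. This is a guess at the approach, written in Python rather than Perl, and it is not the FromLatinComment routine from the report: fallback_from_name is a hypothetical helper that takes the word after LETTER as the base and ignores modifiers such as WITH STROKE, so it yields "o" rather than "o/" for ø.

```python
import unicodedata


def fallback_from_name(ch):
    """Guess an ASCII fallback from the Unicode character name, or None."""
    words = unicodedata.name(ch, "").split()
    # Same filter as in the report: LATIN and LETTER, but not LONG.
    if "LATIN" not in words or "LETTER" not in words or "LONG" in words:
        return None
    idx = words.index("LETTER") + 1
    if idx >= len(words):
        return None
    base = words[idx]  # e.g. "AE" in "LATIN SMALL LETTER AE"
    return base.lower() if "SMALL" in words else base


for ch in "æøÆØ":
    print(ch, "->", fallback_from_name(ch))
```

Names such as LATIN SMALL LETTER SHARP S show the limit of this approach: the word after LETTER is not always a usable base letter, so a real fix would need a reviewed table rather than name parsing alone.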
Does Danish have characters which cannot be mapped by ISO-8859-1?
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: Danish characters æøÆØ not correctly transliterated → Danish characters æøÃ?Ã? not correctly transliterated
No.
Summary: Danish characters æøÃ?Ã? not correctly transliterated → Danish characters æøÆØ not correctly transliterated
Status: NEW → ASSIGNED
Product: MailNews → Browser
Target Milestone: --- → Future
There seems to be some kind of character set confusion in Bugzilla (or
somewhere else) so that my bug report doesn't look like I entered it.

For clarity this attachment shows the bug description in HTML format.
Keywords: intl
This also happens to me when I save a draft message with Swedish/Finnish characters.

1. Compose a mail with Swedish/Finnish characters åäö in the contents
2. Save as draft
   (warning about charset)
3. Open draft
4. Send mail
   (warning about charset)
   Send as plain text

Result: Swedish/Finnish characters converted to ?

Expected result: Swedish/Finnish characters åäö
Workaround when sending draft

1. Compose a mail with Swedish/Finnish characters åäö in the contents
2. Save as draft
3. Open draft
4. Send mail
   (warning about charset)
   Send in plain text and HTML

Result: Swedish/Finnish characters OK
I can't reproduce this in latest nightly of Thunderbird. Marking WFM.
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → WORKSFORME
Assignee: nhottanscp → mcmanus
Attachment #8751687 - Attachment is obsolete: true
Attachment #8751687 - Flags: review?(valentin.gosu)