Closed
Bug 270638
Opened 20 years ago
Closed 13 years ago
Import kills 8-bit characters from subjects and addresses
Categories
(MailNews Core :: Import, defect)
Tracking
(Not tracked)
RESOLVED
DUPLICATE
of bug 686985
People
(Reporter: mozillabugzilla, Unassigned)
Attachments
(1 file)
1.15 KB,
application/octet-stream
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0

When importing mails into TB from Outlook 2003, subject lines and addresses that contain 8-bit characters are destroyed. They look 100% OK in Outlook, but are only garbage after being imported into TB.

*I* *KNOW* that 8-bit characters shouldn't appear in the subject lines, nor in the address fields, but many programs allow them, and it helps a lot if they are not destroyed in existing mails. They are also very important in international mails.

PLEASE, HELP US TO GET OUR MAILS OUT OF THIS BLOODY PST FILE WITH AS LITTLE DAMAGE AS POSSIBLE!

Reproducible: Always

Steps to Reproduce:
1. Import mails into TB from Outlook that contain 8-bit chars in the Subject:, From:, To: lines etc.

Actual Results:
The 8-bit chars contained there are destroyed and replaced with garbage.

Expected Results:
Try to preserve the 8-bit characters, as they are very important in internationally-coded mails.
Comment 1•20 years ago
can you attach a sample message?
Comment 2•20 years ago (Reporter)
The "beforeImport.txt" file contains the "internet headers" of a short mail, as they appear inside Outlook 2003. Notice the "Subject:" line. The "afterImport.txt" file contains the same mail after import in TB. Notice how the "Subject:" line has changed.
Comment 3•20 years ago
cc'ing Jshin for advice - I assume we should detect the 8-bit characters and fix the subject to be mime-2 encoded correctly...
Status: UNCONFIRMED → NEW
Ever confirmed: true
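As an illustration of the fix suggested in comment 3, here is a minimal Python sketch of detecting 8-bit bytes in a raw header value and re-emitting it as an RFC 2047 encoded word. It is not the Thunderbird import code (which is C++); the function names `has_8bit` and `to_encoded_word` are hypothetical, and the sketch assumes the source charset is already known, which the later comments show is the hard part.

```python
from email.header import Header


def has_8bit(raw_value: bytes) -> bool:
    """Return True if the raw header value contains any byte above 0x7F."""
    return any(b > 0x7F for b in raw_value)


def to_encoded_word(raw_value: bytes, source_charset: str) -> str:
    """Convert a raw header value to a 7-bit-safe RFC 2047 form.

    Assumption: source_charset correctly names the encoding of raw_value.
    """
    if not has_8bit(raw_value):
        # Pure ASCII needs no encoding at all.
        return raw_value.decode("ascii")
    # Decode with the known charset, then let the stdlib produce
    # the encoded-word form using UTF-8.
    text = raw_value.decode(source_charset)
    return Header(text, "utf-8").encode()
```

The result contains only ASCII characters and can be decoded back to the original text by any RFC 2047 aware reader.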
Comment 4•20 years ago
Yes, that would be the best, but figuring out the character encoding used in the header could be difficult in some cases.
Comment 5•20 years ago (Reporter)
(In reply to comment #4)
> Yes, that would be the best, but figuring out the character encoding used in the
> header could be difficult in some cases.

You don't need to look far, though... only two lines below the Subject: line, the correct encoding is displayed in its full glory... Of course, if you had to guess, it would be much more difficult.
Comment 6•20 years ago
(In reply to comment #5)
> You don't need to look far, though... only two lines below the Subject: line,
> the correct encoding is displayed in its full glory...

Gee, that's our best **guess** (note that RFC (2)822/RFC 204[4-8] don't specify the order of header fields), but it's not always correct (OK, in 99% of cases it's right). More importantly, 'charset' is not present in the outermost header if the Content-Type (C-T) is not 'text/*' (e.g. 'multipart/mixed', 'multipart/alternative'), in which case we have to look into the 'body' (one or more of the subparts).
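The lookup described in comment 6 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the importer's actual logic: the function name `guess_header_charset` and the `windows-1252` fallback are hypothetical, and the returned charset remains a guess, since nothing obliges a sender to encode headers the same way as the body.

```python
from email import message_from_bytes


def guess_header_charset(raw_message: bytes, fallback: str = "windows-1252") -> str:
    """Guess the charset of raw 8-bit headers from Content-Type parameters.

    The outermost header is checked first; if it carries no charset
    (e.g. multipart/mixed), the subparts are searched in document order.
    """
    msg = message_from_bytes(raw_message)
    # walk() yields the message itself first, then each subpart depth-first.
    for part in msg.walk():
        charset = part.get_content_charset()
        if charset:
            return charset
    # Assumption: no part declared a charset, so fall back to a default.
    return fallback
```

For a 'multipart/alternative' message this returns the charset of the first subpart that declares one, which is the "look into the body" case mentioned above.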
Comment 7•20 years ago
Not a TB auto-migration bug -> Core:MailNews:Import
Assignee: mscott → nobody
Component: Migration → MailNews: Import
Product: Thunderbird → Core
Version: unspecified → Trunk
Updated•16 years ago
QA Contact: import
Updated•16 years ago (Assignee)
Product: Core → MailNews Core
Comment 8•13 years ago
This is changed by the proposed patch for Bug 207156. However, that patch is not a complete solution. Instead of simply ruining the 8-bit characters, the code now tries to get the headers from Outlook in Unicode. Thus, it now relies on Outlook to guess the encoding of the headers. I hope that this will solve the majority of cases, but not all, and furthermore, it may introduce new mistakes in cases where the result accidentally used to be OK.
Comment 9•13 years ago
The patch was checked in under bug 686985.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → DUPLICATE