Bug 145293 - Import in text format (txt,csv,tab): Should support non-ascii display names and email addresses
Status: RESOLVED FIXED
Keywords: dataloss, intl
Product: MailNews Core
Classification: Components
Component: Import
Version: Trunk
Platform: All All
Importance: -- critical with 1 vote
Target Milestone: Thunderbird 16.0
Assigned To: Hiroyuki Ikezoe (:hiro)
Mentors:
Duplicates: 164121 226813 522865
Depends on: 703503 773840 803835
Blocks: 157010 157673
Reported: 2002-05-17 09:58 PDT by Cavin Song
Modified: 2012-10-24 00:51 PDT
ryanvm: in-testsuite+


Attachments
Proposed fix (22.64 KB, patch)
2011-11-24 22:55 PST, Hiroyuki Ikezoe (:hiro)
no flags
Some tests (3.99 KB, patch)
2011-11-24 22:57 PST, Hiroyuki Ikezoe (:hiro)
mconley: review+
Adapt to the latest trunk and some cleanup from the previous patch (21.28 KB, patch)
2012-06-27 23:52 PDT, Hiroyuki Ikezoe (:hiro)
no flags
debitrotted tests (4.28 KB, patch)
2012-06-27 23:53 PDT, Hiroyuki Ikezoe (:hiro)
hiikezoe: review+
Adapt to the latest trunk and some cleanup from the previous patch (21.26 KB, patch)
2012-07-01 21:02 PDT, Hiroyuki Ikezoe (:hiro)
mozilla: review+

Description Cavin Song 2002-05-17 09:58:14 PDT
This is spun off from bug 91295, as the code currently only works for ASCII
display names and email addresses. Some notes from 91295:

a)  how does your code handle non-ASCII display names?

b) will this code work with IDNs in email addresses?  (does Eudora support
that?) see http://bugzilla.mozilla.org/show_bug.cgi?id=127399.
Comment 1 ji 2002-05-17 10:27:00 PDT
QA contact to myself.
Nominating for nsbeta1 since this used to be working.
Comment 2 Michael Buckland 2002-06-05 09:46:55 PDT
Discussed in mail news bug meeting.  Decided to minus this bug.
Comment 3 nhottanscp 2002-07-30 16:39:58 PDT
cavin, is this a bug for Eudora import?
Comment 4 Cavin Song 2002-07-30 17:29:48 PDT
No, I think this is a problem for all import.
Comment 5 nhottanscp 2002-07-30 17:45:43 PDT
Okay, a non-ASCII e-mail address is not legal at this point, and I think no
commercial mailer supports it, so we can focus on the non-ASCII display name problem.
Comment 6 nhottanscp 2002-08-05 11:43:51 PDT
nominate for nsbeta1
Comment 7 ji 2002-10-10 11:25:02 PDT
QA contact to Marina. Please reassign to appropriate person.
Comment 8 Simon Montagu :smontagu 2002-11-22 14:09:25 PST
Gregg, please look at this bug too 
-dassi
Comment 9 Samir Gehani 2003-01-17 11:12:22 PST
Mail triage team: nsbeta1+/adt3
Comment 10 Peter Kirk 2003-07-30 01:55:35 PDT
I received an e-mail whose source has this From line:

From: =?8859_1?B?S2FybGr8cmdlbg==?= Xxxxxxxxx <xxxxxxxxx@xxxxxx.com>

and Mozilla renders that first name as a Chinese character, 翽, which is U+7FFD
meaning "sounds of wings flapping". Well, arguably the input is garbage (it's
nothing like the guy's real name, and I usually get an accurate display of it
although it includes a u with umlaut), so I should expect to see garbage, but
the input certainly seems to be intended as an ISO 8859-1 string, which cannot
include Chinese characters.

I am using Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624.
Comment 11 Wayne Mery (:wsmwk, NI for questions) 2006-01-26 14:45:43 PST
should Bug 163501 dup to or depend on this bug?
Comment 12 Wayne Mery (:wsmwk, NI for questions) 2007-09-15 15:01:32 PDT
Mark and others, 

a) is this really an import issue? or is it garden variety AB?

b) with no dupes or votes, it's hard to imagine there aren't other bugs that have fixed or duplicate parts of this bug.  Might some be
 bug 164121
 bug 129407
 bug 207998
And after those, what's not listed that's left to implement?

For the record, the spinoff reference in comment 0 is to the end of bug 91295 comment 10.
Comment 13 Mark Banner (:standard8, limited time in Dec) 2007-09-16 13:59:14 PDT
(In reply to comment #12)
> a) is this really an import issue? or is it garden variety AB?

Not sure, I can't say I've really tried it, but given the lack of comments/other bugs about it, I'd be surprised if it's still a big problem.

> b) with no dupes or votes, it's hard to imagine there aren't other bugs that
> have fixed or duplicate parts of this bug.  Might some be
>  bug 164121
>  bug 129407
>  bug 207998
> And after those, what's not listed that's left to implement?

There's one somewhere about allowing different character sets for import, that's probably the more relevant one in this case. So we could probably close this one now.
Comment 14 Peter Kirk 2007-09-18 03:52:10 PDT
The e-mail mentioned in comment 10 now, in Thunderbird 2.0.0.6, displays with a Unicode unknown character symbol, a question mark in a diamond, rather than the Chinese character. As the input is garbage, that is probably correct behaviour, so at least this part of the bug seems to have been fixed.
Comment 15 Alex 2010-06-15 17:25:27 PDT
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100411 Icedove/3.0.4

The message from comment #10 is still unreadable, even though my iconv knows that "8859_1" is a valid codeset and "S2FybGr8cmdlbg==" decodes properly to "Karljürgen".
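
The decoding check described in this comment can be reproduced outside the mail client. The following is a minimal C++ sketch, not Thunderbird code: DecodeBase64 and Latin1ToUtf8 are hypothetical helper names, and the sketch assumes that the nonstandard label "8859_1" is meant as ISO 8859-1, as comment 10 suggests.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: decode standard base64 (RFC 4648 alphabet).
// Decoding stops at '=' padding or at any character outside the alphabet.
std::string DecodeBase64(const std::string& input) {
    static const std::string kAlphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string bytes;
    std::uint32_t buffer = 0;  // bit accumulator; only the low bits are used
    int bits = 0;              // number of pending bits in the accumulator
    for (char c : input) {
        const std::size_t value = kAlphabet.find(c);
        if (value == std::string::npos) break;
        buffer = (buffer << 6) | static_cast<std::uint32_t>(value);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            bytes.push_back(static_cast<char>((buffer >> bits) & 0xFF));
        }
    }
    return bytes;
}

// Hypothetical helper: convert ISO 8859-1 bytes to UTF-8.
// Each Latin-1 byte maps to the Unicode code point with the same value,
// so bytes >= 0x80 become a two-byte UTF-8 sequence.
std::string Latin1ToUtf8(const std::string& latin1) {
    std::string utf8;
    for (unsigned char byte : latin1) {
        if (byte < 0x80) {
            utf8.push_back(static_cast<char>(byte));
        } else {
            utf8.push_back(static_cast<char>(0xC0 | (byte >> 6)));
            utf8.push_back(static_cast<char>(0x80 | (byte & 0x3F)));
        }
    }
    return utf8;
}
```

Calling Latin1ToUtf8(DecodeBase64("S2FybGr8cmdlbg==")) yields the UTF-8 bytes of "Karljürgen": the payload contains the single byte 0xFC, which is u with umlaut in ISO 8859-1. Whether a mail client should accept "8859_1" as a charset label at all is a separate question that the sketch does not address.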
Comment 16 Wayne Mery (:wsmwk, NI for questions) 2011-02-27 20:54:23 PST
*** Bug 164121 has been marked as a duplicate of this bug. ***
Comment 17 Hiroyuki Ikezoe (:hiro) 2011-11-24 22:55:08 PST
Created attachment 576874
Proposed fix

This patch cannot be applied without the fix for bug 703175, since both patches touch nsTextAddress::DetermineDelim.

This patch uses MsgDetectCharsetFromFile, which was introduced in bug 703503.
Comment 18 Hiroyuki Ikezoe (:hiro) 2011-11-24 22:57:11 PST
Created attachment 576876
Some tests
Comment 19 Hiroyuki Ikezoe (:hiro) 2011-11-24 22:57:44 PST
*** Bug 522865 has been marked as a duplicate of this bug. ***
Comment 20 Hiroyuki Ikezoe (:hiro) 2011-11-24 23:01:41 PST
*** Bug 226813 has been marked as a duplicate of this bug. ***
Comment 21 Mike Conley (:mconley) 2011-11-25 07:20:07 PST
Comment on attachment 576876
Some tests

I haven't tried running the tests, but the code looks good to me.  Thanks.
Comment 22 Hiroyuki Ikezoe (:hiro) 2012-06-27 23:52:26 PDT
Created attachment 637389
Adapt to the latest trunk and some cleanup from the previous patch
Comment 23 Hiroyuki Ikezoe (:hiro) 2012-06-27 23:53:07 PDT
Created attachment 637391
debitrotted tests

carrying over review+.
Comment 24 Hiroyuki Ikezoe (:hiro) 2012-07-01 21:02:47 PDT
Created attachment 638259
Adapt to the latest trunk and some cleanup from the previous patch
Comment 25 David :Bienvenu 2012-07-06 14:03:12 PDT
Comment on attachment 638259
Adapt to the latest trunk and some cleanup from the previous patch

looks reasonable, tests pass, thx, Hiro!
Comment 27 neil@parkwaycc.co.uk 2012-07-13 09:42:19 PDT
Comment on attachment 638259
Adapt to the latest trunk and some cleanup from the previous patch

>-        numQuotes += MsgCountChar(line, '"');
>+        numQuotes += line.CountChar(PRUnichar('"'));
Noooooooooooooooooooooo!
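
For context, the hunk quoted above accumulates a count of double-quote characters per line. The thread does not say what the count is used for; a plausible reading (an assumption) is that the importer uses it to tell whether a quoted field continues onto the next physical line. The sketch below illustrates that idea in portable C++ with std::count. CountQuotes and JoinRecords are hypothetical names, not the Mozilla functions, and the sketch does not reproduce the reviewer's objection or the importer's real logic.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: number of '"' characters in one physical line.
std::size_t CountQuotes(const std::string& line) {
    return static_cast<std::size_t>(std::count(line.begin(), line.end(), '"'));
}

// Hypothetical helper: join physical lines into logical records.
// A record is treated as complete once it contains an even number of
// quotes, i.e. every quoted field that was opened has been closed.
std::vector<std::string> JoinRecords(const std::vector<std::string>& lines) {
    std::vector<std::string> records;
    std::string current;
    std::size_t numQuotes = 0;
    for (const std::string& line : lines) {
        if (!current.empty()) current += '\n';  // keep the embedded line break
        current += line;
        numQuotes += CountQuotes(line);
        if (numQuotes % 2 == 0) {
            records.push_back(current);
            current.clear();
            numQuotes = 0;
        }
    }
    // Input ended inside an unterminated quoted field: keep what was read.
    if (!current.empty()) records.push_back(current);
    return records;
}
```

With the three input lines `a,"multi`, `line",b` and `c,d`, JoinRecords returns two records: the first two lines joined by a newline, then `c,d`.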