505 bytes, text/csv
514 bytes, text/csv
2.41 KB, patch
Magnus Melin: review+
|Details | Diff | Splinter Review|
Created attachment 8639803 [details] test.csv with one contact containing accented character User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:39.0) Gecko/20100101 Firefox/39.0 Build ID: 20150630154324 Steps to reproduce: Import an list of addresses from a CSV file. The file contains some records with accented characters, like for the firstname André (e-accent aigu at the end). The file is created by a program, but this bug didn't occur in the past, so the problem lies not there. Actual results: The import fails with message "Error importing address book xxxx, all addresses may not have been imported". In fact, none is imported. The import of exactly the same CSV file worked in previous releases. Expected results: Import should succeed. When I remove the accented characters with Notepad, the import works again. The bug can be reproduced by creating an address with an accented character, exporting to a CSV file and trying to import again. It will fail. The attached test.csv is an example.
Created attachment 8645394 [details] Another example I exported my address book and tried to re-import it and that failed. I tracked it down to an address with an accented character. This is exactly was is reported in comment #0 (last two lines). Works with TB 31.6.
OK, found the problem. In the case where the import contains the accented character (é or í in the attached examples), the input is not considered UTF-8 compatible and *no* character set is returned at all: https://dxr.mozilla.org/comm-central/source/mailnews/base/util/nsMsgUtils.cpp?offset=200#2420 This was regressed by bug 1074585 in https://hg.mozilla.org/comm-central/rev/fdfc4f1a8c6e. Just assigning ISO-8859-1 at the end of the function will make the import work again in this case: if (aCharset.IsEmpty()) aCharset.AssignLiteral("ISO-8859-1"); The function *MUST* return a characterset! I will NI the people who broke it ;-) General comment: I think it's pretty embarrassing that the CSV import doesn't work (and there is no text for it either). I was looking through the Outlook crashes and one comment was "Can't import CSV". That's how I found the bug. So the poor user tried the Outlook import first, that crashed (my fault, since I hadn't disabled it properly). Then he tried CSV and that didn't work either.
Great detective work, thx Jorg! I think we should just fix the caller - assign charset ISO-8859-1 (or UTF-8?) on failure here: https://dxr.mozilla.org/comm-central/source/mailnews/import/text/src/nsTextAddress.cpp#53
I think that should be two fixed: One quick one and one with a UI change that lets the user select the character set if it can't be determined automatically. Given that TB is used a lot in Europe and America, we should at least detect Windows-1252 (superset of ISO8859-1).
https://support.microsoft.com/en-us/kb/933855 suggests UTF-8 is what's always used nowadays. (Haven't tested it though.)
Hmm, well, that's another problem: The format that TB uses upon export. I exported and got "Western". We can change that too, but we should also do something useful if it's not UTF-8 upon import. The problem of this bug is that the é/í are already encoded in "Western", so the UTF-8 test fails (saw that in debugging). Just assigning "UTF-8" gives �, tried that, too.
Workaround for import: Convert "Western" encoded CSV files to UTF-8, for example in Notepad++ on Windows.
Created attachment 8645489 [details] [diff] [review] Proposed solution: Quick fix. Whoever has time and deems this important can review this. Code was reestablished in the caller from https://hg.mozilla.org/comm-central/rev/fdfc4f1a8c6e (Magnus suggested to put something into the caller (comment #4), that's what I've done.)
nsMsgUtils.cpp should probably be changed to return error if no charset was detected
MsgDetectCharsetFromFile I mean.
If you always want to export UTF-8, then the change would be here: https://dxr.mozilla.org/comm-central/source/mailnews/addrbook/src/nsAbManager.cpp#702 https://dxr.mozilla.org/comm-central/source/mailnews/addrbook/src/nsAbManager.cpp#793 Should I add this to the patch?
(In reply to Magnus Melin from comment #11) > nsMsgUtils.cpp should probably be changed to return error if no charset was > detected Which error would you like?
NS_ERROR_FAILURE perhaps? Export should be in another bug, maybe bug 117236.
Created attachment 8645493 [details] [diff] [review] Proposed solution: Quick fix (v2) As per Magnus' comment #15.
Some homework: MsgDetectCharsetFromFile has two callers: http://mxr.mozilla.org/comm-central/search?string=MsgDetectCharsetFromFile One in /mailnews/import/text/src/nsTextAddress.cpp (where the new return value is handled) and one in nsMsgAttachmentHandler.cpp. There, in nsMsgAttachmentHandler::PickCharset, the return value is just returned. This function has one caller only: http://mxr.mozilla.org/comm-central/source/mailnews/compose/src/nsMsgSend.cpp#1023 which ignores the return value. In conclusion: Changing the behaviour and returning a failure code from MsgDetectCharsetFromFile now won't cause any unexpected side-effects.
Created attachment 8645836 [details] [diff] [review] Proposed solution: Quick fix (v3) Better way to do it.
Comment on attachment 8645836 [details] [diff] [review] Proposed solution: Quick fix (v3) Review of attachment 8645836 [details] [diff] [review]: ----------------------------------------------------------------- LGTM, thx Jorg! r=mkmelin
Comment on attachment 8645836 [details] [diff] [review] Proposed solution: Quick fix (v3) [Approval Request Comment] Regression caused by (bug #): 1074585 User impact if declined: Major function not working. Testing completed (on c-c, etc.): Yes. Risk to taking this patch (and alternatives if risky): None.
Comment on attachment 8645836 [details] [diff] [review] Proposed solution: Quick fix (v3) http://hg.mozilla.org/releases/comm-esr38/rev/97b27d9d64f1
Comment on attachment 8645836 [details] [diff] [review] Proposed solution: Quick fix (v3) https://hg.mozilla.org/releases/comm-beta/rev/9d3eefbed90e
Comment on attachment 8645836 [details] [diff] [review] Proposed solution: Quick fix (v3) [Triage Comment] https://hg.mozilla.org/releases/comm-aurora/rev/3b0aef562c40