Bug 357172 - mail invitation text garbled (VCALENDAR not recognized as UTF-8)
Status: VERIFIED FIXED
Keywords: verified1.8.1.13
Product: MailNews Core
Classification: Components
Component: MIME
Version: unspecified
Platform: All All
Importance: -- major with 1 vote
Target Milestone: mozilla1.9beta4
Assigned To: Christian Schmidt
http://mxr.mozilla.org/mozilla1.8/sou...
Duplicates: 359688 404476 421911 422351 423743
Reported: 2006-10-18 12:44 PDT by Matp75
Modified: 2008-07-31 04:30 PDT
CC List: 16 users


Attachments
Test invitation (mail) generated from outlook 2003 FR with accents (4.87 KB, text/plain)
2006-10-18 12:46 PDT, Matp75
no flags

Screenshot of the problem (87.21 KB, image/jpeg)
2006-10-18 12:47 PDT, Matp75
no flags

Screenshot of the test message in Linux/Tb2/Lightning 0.7 (63.54 KB, image/png)
2008-02-14 14:21 PST, Rimas Kudelis
no flags

Set default charset for text/calendar to UTF-8 (708 bytes, patch)
2008-02-17 12:50 PST, Christian Schmidt
no flags

Patch, v. 2 (1.54 KB, patch)
2008-02-18 14:43 PST, Christian Schmidt
no flags

Patch, v. 3 (1.54 KB, patch)
2008-02-19 12:22 PST, Christian Schmidt
mozilla: review+

Patch, v. 4 (1.57 KB, patch)
2008-02-19 14:01 PST, Christian Schmidt
bugzilla: review+
mozilla: superreview+
dveditz: approval1.8.1.13+

Description Matp75 2006-10-18 12:44:51 PDT
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.8.1) Gecko/20061010 Firefox/2.0
Build Identifier: Thunderbird 1.5.0.7 (20060909), Lightning (2006100618)

When receiving an invitation that contains accented characters, garbage is shown in place of the accents.

Reproducible: Always

Steps to Reproduce:
1. Send an invitation from Outlook with accented characters in the subject, location or text.
2. Receive the invitation in Lightning.
3. The accented characters are not shown; garbage appears instead (see screenshot).

Attached: example invitation and screenshot of the problem.
Comment 1 Matp75 2006-10-18 12:46:45 PDT
Created attachment 242674
Test invitation (mail) generated from outlook 2003 FR with accents
Comment 2 Matp75 2006-10-18 12:47:46 PDT
Created attachment 242675
Screenshot of the problem
Comment 3 cmtalbert 2006-10-18 16:22:53 PDT
Reporter, could you email me the "test invitation" that you have above?
This will help me to fix the problem much quicker than I could otherwise.

Evidently this is happening because there is a difference between how Microsoft Outlook 2003 with Exchange sends its events and how Microsoft Outlook 2003 without Exchange sends them. I attempted to recreate the scenario with my Outlook 2003 and it works OK. But I am not on an Exchange server, and I think that is the big difference.

Please send the email to cmtalbert at myfastmail.com.

Thank you very much. 
Once that's sent to me, I'll be able to confirm the bug as well.
Comment 4 cmtalbert 2007-01-22 10:04:04 PST
*** Bug 359688 has been marked as a duplicate of this bug. ***
Comment 5 cmtalbert 2007-01-22 10:08:21 PST
I can confirm this (sorry for the delay).

Confirmed on Lightning: 2007011603 in Thunderbird: Version 2 beta 1 (20070110)
Comment 6 Jonathan Ernst 2007-09-26 04:29:03 PDT
This also happens with invitations created by Thunderbird/Lightning.
Comment 7 Magnus Melin 2007-11-22 09:30:37 PST
Note that this happens for all invitations from Exchange, and the invitation is even standards compliant AFAICT.

RFC 2445: 4.1.4 Character Set

   There is not a property parameter to declare the character set used
   in a property value. The default character set for an iCalendar
   object is UTF-8 as defined in [RFC 2279].

   The "charset" Content-Type parameter can be used in MIME transports
   to specify any other IANA registered character set.

I have a hard time thinking this isn't *very* visible for non-English users.
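
The garbling described in this bug can be reproduced outside Thunderbird. The following is a minimal Python sketch, assuming the invitation text is UTF-8 (the RFC 2445 default) and the reader decodes it as ISO-8859-1; the sample string is made up for illustration and is not taken from the attached invitation.

```python
# Hypothetical SUMMARY value with accents, encoded as UTF-8 per RFC 2445.
raw = "Réunion café".encode("utf-8")

# Decoding with the wrong charset turns each accented letter into two characters.
garbled = raw.decode("iso-8859-1")
# Decoding with the charset the sender actually used restores the text.
correct = raw.decode("utf-8")

print(garbled)  # RÃ©union cafÃ©
print(correct)  # Réunion café
```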
Comment 8 Magnus Melin 2007-11-22 09:31:30 PST
*** Bug 404476 has been marked as a duplicate of this bug. ***
Comment 9 Pedro Madeira 2007-11-22 09:41:05 PST
Lightning should create CalendarEvent.ics using the encoding that
Thunderbird uses to compose messages.

The problem I have is that I use ISO 8859-1 to view and compose messages,
but the file that Lightning builds is in UTF-8. When I receive a mail
notification, the body of the email looks fine, but the special characters in
the calendar event are unreadable until I switch the viewing character encoding
to UTF-8. That is not a solution, as I want to view most of my mail as ISO
8859-1.

So if Lightning created the attached event using the character encoding that
Thunderbird uses for composition, all should be fine, I think.
Comment 10 Mauro Cicognini 2007-11-22 12:39:39 PST
(In reply to comment #7)
> 
> I have a hard time thinking this isn't *very* visible for non-english users.
> 

And indeed it is. It is _very_ evident, so much so that I was assuming that it was just a temporary glitch and someone was already working on it.
Comment 11 Simon Paquet [:sipaq] 2008-01-04 06:04:43 PST
Clint, are you still planning to work on this? It would be cool to get a fix for that into 0.8.
Comment 12 Rimas Kudelis 2008-02-14 10:25:07 PST
(In reply to comment #9)
> The calendar (lightning) should create the CalendarEvent.ics retrieving the
> encoding used in Thunderbird to compose messages.

It would have to specify its character set in the MIME headers of the attachment then.
Actually, how it currently creates the attachment is the right way, I guess. The problem is the way it handles the attachment it receives.

> The problem that I have is that I use ISO 8859-1 to view and compose messages
> but the file that lightning builds is in UTF-8 so when I receive a mail
> notification the body of the email seems alright but the calendar event special
> characters are all unreadable until I switch the viewing character encoding to
> UTF-8 but this is not a solution as I want to view most of my mails as ISO
> 8859-1.
> 
> So if Lightning creates the attached event using the character encoding
> retrieved by Thunderbird composition all should be fine I think.

No, I guess it should simply take into account the fact that the charsets may (and often will) differ between the textual part of the message and the iCal attachment.
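
To illustrate the point about differing charsets, here is a small sketch using Python's standard `email` package. It builds a message whose body is ISO-8859-1 while the calendar attachment declares UTF-8 in its own Content-Type header. The message text and the iCalendar content are invented for the example; this is not how Lightning builds its invitations.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Invitation"

# Message body in ISO-8859-1, as in the scenario from comment 9.
msg.set_content("Réunion demain", charset="iso-8859-1")

# Calendar attachment in UTF-8, declared explicitly on the part itself.
ics = (
    "BEGIN:VCALENDAR\r\n"
    "BEGIN:VEVENT\r\n"
    "SUMMARY:Réunion\r\n"
    "END:VEVENT\r\n"
    "END:VCALENDAR\r\n"
)
msg.add_attachment(ics, subtype="calendar", charset="utf-8",
                   filename="CalendarEvent.ics")

# Each MIME part carries its own charset, independent of the others.
charsets = [(p.get_content_type(), p.get_content_charset())
            for p in msg.iter_parts()]
print(charsets)
```

A receiver that reads the charset per part, rather than applying the body's charset to everything, can decode both parts correctly.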
Comment 13 Rimas Kudelis 2008-02-14 10:29:53 PST
Just to summarize: the problem is in the handling of received invitations, not in sending them.

[Changing version to Unspecified, as it still appears in 0.7.]
Comment 14 cmtalbert 2008-02-14 13:45:11 PST
I'm going to look into this and try to get it into 0.8.  If it is solvable in the 0.8 timeframe, then good, if not, then we'll get it in the next release, but we aren't going to hold 0.8 for it.  I'm sorry for the inconvenience.
Comment 15 Rimas Kudelis 2008-02-14 13:51:04 PST
In case it's not easily solvable, I have an idea for a temporary workaround. Could you perhaps use HTML entities (such as &euro;) for non-ASCII characters?
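
A rough sketch of what such a workaround could look like, in Python. It relies on the standard `xmlcharrefreplace` error handler, which produces numeric character references rather than named entities. The sample text is made up, and whether a calendar client would decode such references on display is an open assumption.

```python
import html

text = "Prix: 10 €, café"

# Non-ASCII characters become numeric character references; ASCII passes through.
escaped = text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")
print(escaped)  # Prix: 10 &#8364;, caf&#233;

# A reader that understands HTML references can recover the original text.
restored = html.unescape(escaped)
```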
Comment 16 Rimas Kudelis 2008-02-14 14:21:48 PST
Created attachment 303371
Screenshot of the test message in Linux/Tb2/Lightning 0.7

Oddly enough, I don't spot the problem when opening the attached message with Lightning 0.7 + Thunderbird 2.0.0.6 on Linux...
Comment 17 Christian Schmidt 2008-02-17 12:50:06 PST
Created attachment 303902
Set default charset for text/calendar to UTF-8

Re: comment 7:
According to RFC 2046 4.1.2, if a MIME entity does not have the charset specified in the Content-Type header, US-ASCII must be assumed, so a MIME entity that contains 8bit characters but doesn't have an explicit charset specified is not standards compliant AFAICT.


I am not sure it is the right place to hook in, but this patch appears to fix the problem by changing the default character set from ISO-8859-1 to UTF-8 for text/calendar entities.

Currently Thunderbird defaults to ISO-8859-1 for text/calendar entities without an explicit charset. Changing this to UTF-8 wouldn't make Thunderbird less standards compliant, because US-ASCII is a subset of both of these character sets, so any valid text/calendar entity containing only 7bit characters would be handled the same, no matter whether ISO-8859-1 or UTF-8 is expected.
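
The subset argument can be checked with a short Python sketch. The iCalendar fragments are hypothetical; the point is only that pure 7-bit input decodes identically under US-ASCII, ISO-8859-1 and UTF-8, and that the decoders disagree only once 8-bit bytes appear.

```python
# A hypothetical iCalendar fragment that contains only 7-bit characters.
ascii_only = b"BEGIN:VEVENT\r\nSUMMARY:Team meeting\r\nEND:VEVENT\r\n"

# All three decoders agree on pure 7-bit input.
as_ascii = ascii_only.decode("us-ascii")
as_latin1 = ascii_only.decode("iso-8859-1")
as_utf8 = ascii_only.decode("utf-8")
same_for_ascii = as_ascii == as_latin1 == as_utf8

# They only disagree once 8-bit bytes are present.
eight_bit = "SUMMARY:Réunion".encode("utf-8")
differs_for_8bit = eight_bit.decode("iso-8859-1") != eight_bit.decode("utf-8")

print(same_for_ascii, differs_for_8bit)  # True True
```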


BTW, I noticed that if you turn on charset auto-detection using View > Character Encoding > Auto-Detect > Universal, Thunderbird does properly detect UTF-8, even for text/calendar entities, so this bug is only visible when auto-detection is disabled.
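
One plausible reason detection succeeds here is that text containing accented Latin letters is rarely valid UTF-8 by accident. The sketch below is a simple strict-decoding heuristic in Python; it is not Thunderbird's universal detector, and `looks_like_utf8` is a hypothetical helper name.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if the bytes decode as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# The same word encoded two ways: only the UTF-8 form passes the check.
utf8_bytes = "Réunion".encode("utf-8")
latin1_bytes = "Réunion".encode("iso-8859-1")
print(looks_like_utf8(utf8_bytes), looks_like_utf8(latin1_bytes))  # True False
```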
Comment 18 Rimas Kudelis 2008-02-18 02:59:50 PST
(In reply to comment #17)

> Re: comment 7:
> According to RFC 2046 4.1.2, if a MIME entity does not have the charset
> specified in the Content-Type header, US-ASCII must be assumed, so a MIME
> entity that contains 8bit characters but doesn't have an explicit charset
> specified is not standards compliant AFAICT.

RFC 2046 4.1.2 speaks about text/plain explicitly, stating that "The specification for any future subtypes of "text" must specify whether or not they will also utilize a "charset" parameter, and may possibly restrict its values as well."

RFC 2445 is of higher priority here, I think.
Comment 19 Rimas Kudelis 2008-02-18 03:15:31 PST
Christian, BTW shouldn't your code go above the X-Sun-Charset hook (11 lines up)?
Comment 20 Christian Schmidt 2008-02-18 03:47:27 PST
I found a new RFC with an even higher number :-) RFC 2447 about iMIP explicitly says:
   A "charset" parameter MUST be present if the iCalendar object
   contains characters that are not part of the US-ASCII character set.

AFAICT attachment 242674 violates this. But as mentioned, Thunderbird can still assume UTF-8 without violating the requirement that it must assume US-ASCII. The only potential problem is if some clients rely on Thunderbird currently assuming ISO-8859-1, but given that Outlook uses UTF-8, this is hardly a big problem.
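
The RFC 2447 requirement quoted above can be expressed as a small check. This is a Python sketch using the standard `email` package; `violates_imip_charset_rule` is a hypothetical helper, and the sample entities are invented rather than copied from the attachment.

```python
from email import message_from_bytes


def violates_imip_charset_rule(raw_entity: bytes) -> bool:
    """Hypothetical check: text/calendar with non-ASCII bytes but no charset."""
    part = message_from_bytes(raw_entity)
    if part.get_content_type() != "text/calendar":
        return False
    payload = part.get_payload(decode=True) or b""
    has_non_ascii = any(byte > 0x7F for byte in payload)
    return has_non_ascii and part.get_param("charset") is None


body = b"\r\n\r\nBEGIN:VCALENDAR\r\nSUMMARY:R\xc3\xa9union\r\nEND:VCALENDAR\r\n"
without_charset = b"Content-Type: text/calendar; method=REQUEST" + body
with_charset = b"Content-Type: text/calendar; method=REQUEST; charset=utf-8" + body

print(violates_imip_charset_rule(without_charset))  # True
print(violates_imip_charset_rule(with_charset))     # False
```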


>BTW shouldn't your code go above the X-Sun-Charset hook (11 lines up)?
I assume the X-Sun-Charset is some kind of legacy support. I don't know if this header is ever used in connection with iMIP in practice, but I would assume that /if/ the X-Sun-Charset header is present in a text/calendar entity, it means that the entity is encoded using that charset - and thus the X-Sun-Charset check should precede the text/calendar check.
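
The ordering argued for here can be sketched as follows. This is a Python illustration of the precedence only, not the code in mimetext.cpp; `pick_charset` and its parameters are hypothetical, and the ISO-8859-1 fallback is an assumption based on the default described in comment 17.

```python
from typing import Optional


def pick_charset(content_type: Optional[str],
                 charset_param: Optional[str],
                 x_sun_charset: Optional[str],
                 fallback: str = "ISO-8859-1") -> str:
    """Hypothetical precedence: explicit charset, X-Sun-Charset, type default."""
    if charset_param:
        return charset_param
    # An explicit legacy header still says more than a per-type default.
    if x_sun_charset:
        return x_sun_charset
    if content_type is not None and content_type.lower() == "text/calendar":
        return "UTF-8"
    return fallback


print(pick_charset("text/calendar", None, None))          # UTF-8
print(pick_charset("text/calendar", None, "ISO-8859-2"))  # ISO-8859-2
print(pick_charset("text/plain", None, None))             # ISO-8859-1
```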
Comment 21 Rimas Kudelis 2008-02-18 12:41:20 PST
I wonder how they managed to release two conflicting specifications at the same time.

Or perhaps I simply don't quite get something...
Comment 22 Magnus Melin 2008-02-18 12:50:57 PST
Comment on attachment 303902
Set default charset for text/calendar to UTF-8

You could use TEXT_CALENDAR instead. Extra points for more context: -up9 or 8;) 

Really great if this works!
Comment 23 Christian Schmidt 2008-02-18 14:43:10 PST
Created attachment 304099
Patch, v. 2

More context, updated comment and now uses a constant.

I wonder if anybody is relying on the current ISO-8859-1 default? In any case, if we are to accept non-compliant iMIP messages, support for MS Outlook is probably of higher priority than support for other clients (though we could of course add additional sniffs and support both if it was really important).
Comment 24 Rimas Kudelis 2008-02-18 23:33:37 PST
Comment on attachment 304099
Patch, v. 2

I suppose you should use /* foobar */ for comments, just like it is above...
Comment 25 Magnus Melin 2008-02-19 12:05:52 PST
Comment on attachment 304099
Patch, v. 2

Looks good to me, and I can confirm it works. The comment style could be changed to be consistent with the rest of the file.

I think "When no default charset" should read "When no charset".

Try bienvenu for official review (and sr, but there is no sr field in calendar)
Comment 26 Christian Schmidt 2008-02-19 12:22:51 PST
Created attachment 304288
Patch, v. 3

Thanks :-) The comment is now reworded, and the comment style has been changed to match the majority of comments in the file.
Comment 27 David :Bienvenu 2008-02-19 12:28:35 PST
Comment on attachment 304288
Patch, v. 3

looks good, thx, Christian. One nit - unless PL_strcasecmp checks for a null argument, you might want to make sure that obj->content_type isn't null. I'm not sure you can get in here with a null content type, but we don't want to crash if we do.
Comment 28 Christian Schmidt 2008-02-19 14:01:36 PST
Created attachment 304311
Patch, v. 4

Thanks for the very fast review. I have added a null check for obj->content_type (obj->content_type is also checked for null elsewhere in the file, so it probably isn't guaranteed to be non-null).
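
The shape of that guard, shown as a Python analogue: the actual patch is C++ and compares obj->content_type with PL_strcasecmp, and its text is not reproduced here. `is_text_calendar` is a hypothetical helper; the point is only that the missing-value check comes before the case-insensitive comparison.

```python
from typing import Optional

TEXT_CALENDAR = "text/calendar"


def is_text_calendar(content_type: Optional[str]) -> bool:
    """Hypothetical helper: null-safe, case-insensitive content type match."""
    # Guard first, so a missing content type never reaches the comparison.
    return content_type is not None and content_type.lower() == TEXT_CALENDAR


print(is_text_calendar(None))             # False
print(is_text_calendar("Text/Calendar"))  # True
```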

>Try bienvenu for official review (and sr, but there is no sr field in calendar)
The same person can't grant both r and sr, can he? If not, I'll ask for sr from one of the other mailnews super-reviewers.
Comment 29 Simon Paquet [:sipaq] 2008-02-19 14:11:41 PST
(In reply to comment #28)
>>Try bienvenu for official review (and sr, but there is no sr field in calendar)
>
>The same person can't grant both r and sr, can he?

Yes, he can. And this is what bienvenu normally does, except in the few cases where he solicits a second opinion.

BTW since this patch only covers code in mozilla/mailnews, we should move the bug there.
Comment 30 Simon Paquet [:sipaq] 2008-02-19 14:15:20 PST
Comment on attachment 304311
Patch, v. 4

Carrying over r+ but asking for sr for completeness.
Comment 31 David :Bienvenu 2008-02-19 14:17:00 PST
Comment on attachment 304311
Patch, v. 4

thx for the patch!
Comment 32 Rimas Kudelis 2008-02-19 23:26:12 PST
Could we expect this in Thunderbird 2.0.0.x?
Comment 33 Magnus Melin 2008-02-20 10:40:51 PST
Yes, ask approval after some days baking on trunk. 

Checking in mailnews/mime/src/mimetext.cpp;
/cvsroot/mozilla/mailnews/mime/src/mimetext.cpp,v  <--  mimetext.cpp
new revision: 1.56; previous revision: 1.55
done

->FIXED
Comment 34 Magnus Melin 2008-03-04 09:41:08 PST
Comment on attachment 304311
Patch, v. 4

Asking branch approval for this; it's needed to handle Outlook invites properly.
Comment 35 Daniel Veditz [:dveditz] 2008-03-04 10:36:26 PST
Comment on attachment 304311
Patch, v. 4

approved for 1.8.1.13, a=dveditz for release-drivers
Comment 36 Magnus Melin 2008-03-04 12:12:34 PST
I'll try to check it in tomorrow.
Comment 37 Magnus Melin 2008-03-05 09:33:26 PST
Checked in on the MOZILLA_1_8_BRANCH:

Checking in mailnews/mime/src/mimetext.cpp;
/cvsroot/mozilla/mailnews/mime/src/mimetext.cpp,v  <--  mimetext.cpp
new revision: 1.52.4.1; previous revision: 1.52
done

(Seems I messed up the checkin comment, r was not standard8 - but bienvenu.)
Comment 38 Norbert Püschel 2008-03-10 11:06:02 PDT
*** Bug 421911 has been marked as a duplicate of this bug. ***
Comment 39 Stefan Sitter 2008-03-12 03:33:22 PDT
*** Bug 422351 has been marked as a duplicate of this bug. ***
Comment 40 Stefan Sitter 2008-03-18 15:19:59 PDT
*** Bug 423743 has been marked as a duplicate of this bug. ***
Comment 41 cmtalbert 2008-03-21 15:03:53 PDT
Verified on version 2.0.0.14pre (20080321) with Lightning build from 3/11/08
