Closed Bug 304885 Opened 19 years ago Closed 19 years ago

UTF-8 encoding mangles multipart messages, breaks whine emails

Categories

(Bugzilla :: Whining, defect)

2.21
defect
Not set
major

Tracking

()

RESOLVED FIXED
Bugzilla 2.22

People

(Reporter: cedric.caron, Assigned: karl)

Details

(Keywords: regression)

Attachments

(3 files)

My server is running under Windows 2000 using IIS, Perl 5.8.7 and the current CVS version. I configured an event, but the e-mail I receive is empty: the title is OK but the body is empty. Any idea of additional tests I can do to fix this problem? PS: I have the Bug 135812 patch applied.
I have several questions: 0) Do you see anything if you open the email in a different program? 1) What does the 'Content-Type' header say? Or is there no Content-Type header in the email at all? 2) What query does your whining entry execute? 3) Does this problem appear on 2.20rc2? Does this problem appear if you use the latest CVS version, without any extra patches on it? 4) What is the current subject and body of the whining entry? What happens if you change them?
0) I tried with Outlook 2003 and the web interface provided by my ISP, and the e-mail really seems empty. 1) See next comment. 2) A very simple query returning all the unconfirmed, new, assigned and reopened bugs of a single product. 3) My server is in "production" and it's difficult to switch to another version (I know CVS versions are not recommended for production). 4) The subject is "bugzilla status", and I tried with the same body and with an empty body.
Header of the received e-mail:
MIME-Version: 1.0
Subject: [Bugzilla] Bugzilla Status
X-Mailer: Mail::Mailer[v1.67] Net::SMTP[v2.29]
X-Bugzilla-Type: whine
Content-Type: multipart/alternative; boundary="-----=====-----2584--1124229602-----"; charset="UTF-8"
To: cedric.caron@urbanet.ch
Content-Transfer-Encoding: quoted-printable
From: bugzilla-daemon@orchid-management.com
Message-ID: <PROXYNQptgFrYILoygX0000002e@proxy.orchid-management.com>
X-OriginalArrivalTime: 16 Aug 2005 22:00:06.0453 (UTC) FILETIME=[DCA15650:01C5A2AD]
Date: 17 Aug 2005 00:00:06 +0200
X-Virus-Scanned: ClamAV version 0.86.2, clamav-milter version 0.86 on mx-04.tornado.cablecom.ch
X-Virus-Status: Clean
X-Spam-Checker-Version: SpamAssassin 2.64-hispeed (2004-01-11)
X-Spam-Status: No, hits=1.2 required=5.0 tests=AWL,CLICK_BELOW,HTML_30_40,HTML_LINK_CLICK_HERE,HTML_MESSAGE,MIME_MISSING_BOUNDARY,NO_REAL_NAME autolearn=no version=2.64-hispeed
X-Spam-Level: *
Attached file e-mail dump...
I used the testfile mail mode to capture the mail sent by the server. The header seems OK, but the body is full of "3D", which prevents correct MIME decoding. Any idea what the problem could be?
It looks like the code from Bug 126266 tries to encode the e-mail and destroys the nice MIME e-mail generated by whine.pl (my database is configured to use the UTF-8 charset).
Thank you for the email dump. I believe you may be correct. The charset and Content-Transfer-Encoding headers were not originally set by whine.pl. As you noted, the new code from bug 126266 tries to modify the email. My guess is that the new re-encoding code from bug 126266 did not properly change the headers of the two alternative parts, which is why the email did not display. If this is correct, then this bug is probably a regression.
Tested on landfill and did not see this problem. However, I assume that is because the utf8 parameter is set to "off" on landfill. Anyway, here's my guess as to what went wrong:
whine.pl calls Bugzilla::BugMail::MessageToMTA. On line 635 of BugMail.pm, Perl discovers that the utf8 parameter is on (which I assume is true from comment 5) and either the header or body is not 7-bit clean (I'm fuzzy in this area: I'm not sure why the message is not 7-bit clean, but this seems to be the case). Bugzilla::BugMail::encode_message is called for the message header & body (line 636/660).
encode_message calls MIME::Parser::parse_data (line 668) on the headers (JUST the headers, not the body), returning a MIME::Entity. MIME::Entity::head is then called on the returned entity (line 669), which gives us a MIME::Head object. The utf8 character set is added to the Content-Type header (line 675), and various headers are examined.
Eventually, we get to the body. The quoted-printable encoding is only set on line 714, which tells us that (a) the body is not 7-bit clean (line 709), and (b) more than half of the message is 7-bit clean (line 713-714). The quoted-printable header is added (line 714) and the processed message is returned to the caller (at line 636). The message is then sent out.
The flaw seems to be at the beginning of encode_message. Throughout encode_message, the $body is only touched to be encoded. encode_message never checks to see if the body of the message contains multiple parts, so the headers of those parts are never properly updated. This is unfortunate for whine.pl, which sends out multipart/alternative messages.
I would guess the solution would be to change encode_message to check for a multipart message. Each part in the body would then be split up, encoded (with appropriate headers inserted), combined, and sent out.
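For illustration, the encoding decision described in comment 7 can be sketched as follows. This is a minimal Python sketch, not the Perl code in BugMail.pm: the function name choose_transfer_encoding is hypothetical, and the base64 fallback for bodies that are mostly non-ASCII is an assumption based on the later mention of base64 in this bug.

```python
def choose_transfer_encoding(body: bytes) -> str:
    """Pick a Content-Transfer-Encoding for a single, non-multipart body.

    Hypothetical helper mirroring the checks described in comment 7.
    """
    # Count bytes with the high bit set; these make a body "not 7-bit clean".
    high_bytes = sum(1 for byte in body if byte > 0x7F)

    if high_bytes == 0:
        # Already 7-bit clean: no re-encoding is needed.
        return "7bit"

    if high_bytes * 2 < len(body):
        # More than half of the body is 7-bit clean: quoted-printable keeps
        # the ASCII portion readable.
        return "quoted-printable"

    # Assumption: mostly 8-bit data falls back to base64.
    return "base64"
```

The check only makes sense for a single body. Applied to the whole payload of a multipart/alternative message, it treats boundary lines and per-part headers as body text, which is the flaw described above.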
For now, my suggestion is this: set the 'regression' keyword, change hardware to 'All', change OS to 'All', set severity to 'major', target for 2.22, and assign to glob (the assignee of bug 126266). Also, change summary to something like "UTF-8 encoding mangles multipart messages, breaks whine emails".
(In reply to comment #7)
> (I'm fuzzy in this area: I'm not sure why the message is not 7-bit clean,
> but this seems to be the case).
The bugs in my database are in French and contain accented characters encoded using 8 bits.
Assignee: erik → bugzilla
Severity: normal → major
Keywords: regression
OS: Windows 2000 → All
Hardware: PC → All
Summary: whine send empte e-mail → UTF-8 encoding mangles multipart messages, breaks whine emails
> The bugs in my database are in French and contain accented characters encoded using 8 bits.
the quick fix is for whine.pl to use quoted-printable or base64 encoding when constructing the mime message if it's not 7-bit clean. i don't think i'll have time to look at this for a while, reassigning to nobody.
Assignee: bugzilla → nobody
Target Milestone: --- → Bugzilla 2.22
Another solution may be to look for a boundary in the header, replace this boundary with a UTF-8 safe version, and optionally restore the original after conversion.
A quick fix for the boundary problem: in whine.pl, replace
    $args->{'boundary'} = "-----=====-----" . $$ . "--" . time() . "-----";
with
    $args->{'boundary'} = "----BugMail----" . $$ . "--" . time() . "-----";
This allows the e-mail to be displayed, but all the 8-bit characters are corrupted...
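The effect of the boundary change can be reproduced outside Bugzilla. The following is a minimal Python sketch using the standard quopri module, not the Perl code from whine.pl or BugMail.pm; the helper names make_body and qp_encode_whole_body and the sample body text are made up for the illustration. It quoted-printable-encodes a whole multipart body as one blob, which is what the thread suspects is happening.

```python
import quopri


def make_body(boundary: str) -> str:
    """Build a tiny one-part multipart body using the given boundary."""
    return (
        f"--{boundary}\n"
        "Content-Type: text/plain\n"
        "\n"
        "Bug résumé\n"
        f"--{boundary}--\n"
    )


def qp_encode_whole_body(body: str) -> str:
    """Quoted-printable-encode the entire body, boundary lines included."""
    return quopri.encodestring(body.encode("utf-8")).decode("ascii")


original_boundary = "-----=====-----2584--1124229602-----"
safe_boundary = "----BugMail----2584--1124229602-----"

mangled = qp_encode_whole_body(make_body(original_boundary))
intact = qp_encode_whole_body(make_body(safe_boundary))

print(mangled)  # every "=" in the boundary lines has become "=3D"
print(intact)   # boundary lines survive; only the accented text is encoded
```

With the original boundary, the delimiter lines no longer match the boundary declared in the Content-Type header, so a mail client finds no parts. With the "=" characters removed, the delimiters survive, but the part text is still quoted-printable-encoded without a matching Content-Transfer-Encoding header on the part, which is consistent with the corrupted 8-bit characters reported above.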
Should this be blocking 2.20.1 or 2.22?
Flags: blocking2.20.1?
Flags: blocking2.20.1?
Flags: blocking2.22?
another way to fix this is to call encode_message() for each of the individual mime parts, then join them together with mime boundaries and a normal message header. this way the entire message will be 7-bit clean (as it'll already be encoded) so encode_message won't mangle the boundaries.
That'd mean the callsite were responsible to do the encoding. How about we let the callsite hand over the parts individually, and BugMail.pm does the encoding and boundary-ing?
Flags: blocking2.22?
Flags: blocking2.22+
Flags: blocking2.20.1+
Target Milestone: Bugzilla 2.22 → Bugzilla 2.20
<bkor> justdave: bug 304885 is not a 2.20.1 blocker -- 2.20 does not have the utf8 parameter or the bugmail stuff to encode header/body for utf-8 (breaking whine)
Flags: blocking2.20.1+
Target Milestone: Bugzilla 2.20 → Bugzilla 2.22
I have started to work on this. If I don't have anything in a week I'll give it back to nobody.
Assignee: nobody → karl
Attached patch Patch v1
OK, here we go... The first thing I do is remove the = characters from the whine mail boundary, as they do freak out the encoding.
(In reply to comment #14)
> That'd mean the callsite were responsible to do the encoding.
> How about we let the callsite hand over the parts individually, and BugMail.pm
> does the encoding and boundary-ing?
(In reply to comment #13)
> another way to fix this is to call encode_message() for each of the individual
> mime parts, then join them together with mime boundaries and a normal message
> header.
Both of those methods would require a good bit of additional work on the part of the code creating the message. It would be nice if it would be as simple as calling MessageToMTA and letting BugMail do the work, as it partially does now. That's the idea I went with, and here is what I have:
First, I now pass the entire message into encode_message, and the entire message is now parsed by MIME::Parser. This takes care of recognizing and parsing all parts of the message.
The code responsible for the actual encoding is now in a function called encode_message_entity, which takes a MIME::Entity as its parameter (and returns same). encode_message receives the entity (which contains the newly-encoded data), extracts the header & body, and returns them as before.
encode_message_entity contains much of the code from encode_message, with little change. There are, however, a few notable changes:
* If a multipart message is detected, extract each part as its own entity and call encode_message_entity, instead of trying to examine a body that is not going to exist (multipart messages don't have bodies, just parts).
* Do not try to set the content-type or charset on parts that contain no body.
* Do not do any actual encoding. Instead, just set the appropriate header and MIME::Tools will handle the encoding for us (since we have given it a body, not just headers)!
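The per-part approach can be illustrated with a short sketch. This is Python using the standard email package, not the Perl patch itself: encode_entity is a hypothetical stand-in for encode_message_entity, the sample message is made up, and the sketch always picks quoted-printable and encodes the payload by hand, whereas the patch only sets the header and leaves the encoding to MIME::Tools.

```python
import quopri
from email import message_from_bytes
from email.message import Message


def encode_entity(entity: Message) -> Message:
    """Recursively make one parsed MIME entity 7-bit clean, in place."""
    if entity.is_multipart():
        # Multipart entities have parts, not a body: recurse into each part
        # and leave the container's own headers and boundary untouched.
        for part in entity.get_payload():
            encode_entity(part)
        return entity

    raw = entity.get_payload(decode=True) or b""
    if not raw.strip():
        # No body: do not set a content type or charset.
        return entity
    if all(byte < 0x80 for byte in raw):
        # Already 7-bit clean: nothing to do.
        return entity

    # Simplification: always use quoted-printable for 8-bit parts.
    entity.set_param("charset", "UTF-8")
    del entity["Content-Transfer-Encoding"]
    entity["Content-Transfer-Encoding"] = "quoted-printable"
    entity.set_payload(quopri.encodestring(raw).decode("ascii"))
    return entity


BOUNDARY = "-----=====-----2584--1124229602-----"
RAW_MESSAGE = (
    "MIME-Version: 1.0\n"
    "Subject: [Bugzilla] Bugzilla Status\n"
    f'Content-Type: multipart/alternative; boundary="{BOUNDARY}"\n'
    "\n"
    f"--{BOUNDARY}\n"
    "Content-Type: text/plain\n"
    "\n"
    "Bogue réouvert\n"
    f"--{BOUNDARY}\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>Bogue réouvert</p>\n"
    f"--{BOUNDARY}--\n"
).encode("utf-8")

encoded = encode_entity(message_from_bytes(RAW_MESSAGE))
wire_text = encoded.as_string()
print(wire_text)
```

Because only the leaf parts are encoded, the boundary lines are written out unchanged and each part carries its own charset and Content-Transfer-Encoding, so a client can decode the parts independently.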
I tested this by creating a whining event that contained many interesting characters (ü, é, â, etc.) in both the subject & body. In all cases (with quoted-printable and base64 encoding) my mailer (Apple Mail) was able to decode and display the message. Of course, this still needs additional testing 8-)
Attachment #203747 - Flags: review?
Status: NEW → ASSIGNED
Comment on attachment 203747
Patch v1
r=glob
very nice, good job :)
nits (can be fixed on checkin):
> # read header into MIME::Entity
this comment is no longer accurate - probably best to delete it
>+ foreach my $part ($entity->parts) {
>+ my $newpart = encode_message_entity($part);
>+ push @$newparts, $newpart;
indentation
Attachment #203747 - Flags: review? → review+
Flags: approval?
Flags: approval? → approval+
Checking in whine.pl;
/cvsroot/mozilla/webtools/bugzilla/whine.pl,v  <--  whine.pl
new revision: 1.20; previous revision: 1.19
done
Checking in Bugzilla/BugMail.pm;
/cvsroot/mozilla/webtools/bugzilla/Bugzilla/BugMail.pm,v  <--  BugMail.pm
new revision: 1.61; previous revision: 1.60
done
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED