Closed
Bug 304885
Opened 19 years ago
Closed 19 years ago
UTF-8 encoding mangles multipart messages, breaks whine emails
Categories
(Bugzilla :: Whining, defect)
RESOLVED
FIXED
Bugzilla 2.22
People
(Reporter: cedric.caron, Assigned: karl)
Details
(Keywords: regression)
Attachments
(3 files)
7.88 KB, text/plain
4.10 KB, patch (glob: review+)
7.22 KB, patch
My server is running under Windows 2000 using IIS, Perl 5.8.7 and the current
CVS version.
I configured an event but the e-mail I receive is empty.
The title is OK but the body is empty.
Any idea of additional tests I can do to fix this problem?
PS: I have the Bug 135812 patch applied
Assignee | ||
Comment 1•19 years ago
|
||
I have several questions:
0) Do you see anything if you open the email in a different program?
1) What does the 'Content-Type' header say? Or is there no Content-Type
header in the email at all?
2) What query does your whining entry execute?
3) Does this problem appear on 2.20rc2? Does this problem appear if you use
the latest CVS version, without any extra patches on it?
4) What is the current subject and body of the whining entry? What happens if
you change them?
Reporter | ||
Comment 2•19 years ago
|
||
0) I tried with Outlook 2003 and the web interface provided by my ISP, and the
e-mail seems really empty
1) See next comment
2) A very simple query returning all the unconfirmed, new, assigned and
reopened bugs of a single product
3) My server is in "production" and it's difficult to switch to another version
(I know CVS versions are not recommended for production)
4) The subject is "bugzilla status" and I tried with the same body and with an
empty body
Reporter | ||
Comment 3•19 years ago
|
||
Header of the received e-mail:
MIME-Version: 1.0
Subject: [Bugzilla] Bugzilla Status
X-Mailer: Mail::Mailer[v1.67] Net::SMTP[v2.29]
X-Bugzilla-Type: whine
Content-Type: multipart/alternative;
boundary="-----=====-----2584--1124229602-----";
charset="UTF-8"
To: cedric.caron@urbanet.ch
Content-Transfer-Encoding: quoted-printable
From: bugzilla-daemon@orchid-management.com
Message-ID: <PROXYNQptgFrYILoygX0000002e@proxy.orchid-management.com>
X-OriginalArrivalTime: 16 Aug 2005 22:00:06.0453 (UTC) FILETIME=[DCA15650:01C5A2AD]
Date: 17 Aug 2005 00:00:06 +0200
X-Virus-Scanned: ClamAV version 0.86.2, clamav-milter version 0.86 on mx-04.tornado.cablecom.ch
X-Virus-Status: Clean
X-Spam-Checker-Version: SpamAssassin 2.64-hispeed (2004-01-11)
X-Spam-Status: No, hits=1.2 required=5.0 tests=AWL,CLICK_BELOW,HTML_30_40,
HTML_LINK_CLICK_HERE,HTML_MESSAGE,MIME_MISSING_BOUNDARY,NO_REAL_NAME
autolearn=no version=2.64-hispeed
X-Spam-Level: *
Reporter | ||
Comment 4•19 years ago
|
||
I used the testfile mail mode to capture the mail sent by the server.
The header seems OK, but the body is full of "3D", which prevents correct
MIME decoding...
Any idea what the problem could be?
Reporter | ||
Comment 5•19 years ago
|
||
Looks like the code from Bug 126266 tries to encode the e-mail and destroys the
nice MIME e-mail generated by whine.pl (my database is configured to use the
UTF-8 charset)
Assignee | ||
Comment 6•19 years ago
|
||
Thank you for the email dump. I believe you may be correct.
The charset and Content-Transfer-Encoding headers were not originally set by
whine.pl. As you noted, the new code from bug 126266 tries to modify the email.
My guess is that the new re-encoding code from bug 126266 did not properly
change the headers of the two alternative parts, which is why the email did not
display. If this is correct, then this bug is probably a regression.
Assignee | ||
Comment 7•19 years ago
|
||
Tested on landfill and did not see this problem. However, I assume that is
because the utf8 parameter is set to "off" on landfill. Anyway, here's my guess
as to what went wrong:
whine.pl calls Bugzilla::BugMail::MessageToMTA. On line 635 of BugMail.pm, Perl
discovers that the utf8 parameter is on (which I assume is true from comment 5)
and either the header or body is not 7-bit clean (I'm fuzzy in this area: I'm
not sure why the message is not 7-bit clean, but this seems to be the case).
Bugzilla::BugMail::encode_message is called for the message header & body (line
636/660).
encode_message calls MIME::Parser::parse_data (line 668) on the headers (JUST
the headers, not the body), returning a MIME::Entity. MIME::Entity::head is then
called on the returned entity (line 669), which gives us a MIME::Head object.
The utf8 character set is added to the Content-Type header (line 675), and
various headers are examined. Eventually, we get to the body.
The quoted-printable encoding is only set on line 714, which tells us that (a)
the body is not 7-bit clean (line 709), and (b) more than half of the message is
7-bit clean (line 713-714). The quoted-printable header is added (line 714) and
the processed message is returned back to the caller (at line 636). The message
is then sent out.
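The choice described above (leave 7-bit clean bodies alone, prefer quoted-printable when most of the body is ASCII) can be sketched roughly as follows. This is an illustration in Python, not the BugMail.pm code: the function names are hypothetical, the exact threshold is a guess from the description, and the base64 fallback is an assumption.

```python
def is_7bit_clean(data: bytes) -> bool:
    """True when every byte is plain ASCII (high bit unset)."""
    return all(byte < 0x80 for byte in data)


def choose_transfer_encoding(body: bytes) -> str:
    """Pick a Content-Transfer-Encoding for a message body (hypothetical helper)."""
    if is_7bit_clean(body):
        # Nothing to encode, so the body can go out as-is.
        return "7bit"
    ascii_bytes = sum(1 for byte in body if byte < 0x80)
    # Mostly-ASCII text stays readable as quoted-printable.
    if ascii_bytes * 2 > len(body):
        return "quoted-printable"
    # Assumption: anything else falls back to base64.
    return "base64"


print(choose_transfer_encoding("Résumé des bogues".encode("utf-8")))
```

A French bug summary with a few accented characters is mostly ASCII, so it would take the quoted-printable branch, which fits the "3D" sequences seen in the captured mail.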
The flaw seems to be at the beginning of encode_message. Throughout
encode_message, the $body is only touched to be encoded. encode_message never
checks to see if the body of the message contains multiple parts, so the headers
of those parts are never properly updated. This is unfortunate for whine.pl,
which sends out multipart/alternative messages. I would guess the solution
would be to change encode_message to check for a multipart message. Each part
in the body would then be split up, encoded (with appropriate headers inserted),
combined, and sent out.
For now, my suggestion is this: set the 'regression' keyword, change hardware to
'All', change OS to 'All', set severity to 'major', target for 2.22, and assign
to glob (the assignee of bug 126266). Also, change summary to something like
"UTF-8 encoding mangles multipart messages, breaks whine emails".
Reporter | ||
Comment 8•19 years ago
|
||
(In reply to comment #7)
> (I'm fuzzy in this area: I'm not sure why the message is not 7-bit clean,
> but this seems to be the case).
The bugs in my database are in French and contain accented characters encoded using 8 bits
Reporter | ||
Updated•19 years ago
|
Assignee: erik → bugzilla
Severity: normal → major
Keywords: regression
OS: Windows 2000 → All
Hardware: PC → All
Summary: whine send empte e-mail → UTF-8 encoding mangles multipart messages, breaks whine emails
> The bugs in my database are in French and contain accents encoded using 8 bits
the quick fix is for whine.pl to use quoted-printable or base64 encoding when
constructing the mime message if it's not 7-bit clean.
i don't think i'll have time to look at this for a while, reassigning to nobody.
Assignee: bugzilla → nobody
Updated•19 years ago
|
Target Milestone: --- → Bugzilla 2.22
Reporter | ||
Comment 10•19 years ago
|
||
Another solution may be to look for a boundary in the header, replace this
boundary with a UTF-8 safe version and optionally restore the original after
conversion.
Reporter | ||
Comment 11•19 years ago
|
||
A quick fix for the boundary problem:
in whine.pl replace
$args->{'boundary'} = "-----=====-----" . $$ . "--" . time() . "-----";
by
$args->{'boundary'} = "----BugMail----" . $$ . "--" . time() . "-----";
This allows the e-mail to be displayed, but all the 8-bit characters are
corrupted...
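To see why the "=" characters matter, here is a small check using Python's standard quopri module as a stand-in for the Perl encoder. The old boundary is the one from the header in comment 3. The new one reuses the same process id and timestamp purely for illustration, and survives_quoted_printable is a hypothetical helper.

```python
import quopri

# Boundary from the e-mail header in comment 3, and the proposed replacement
# built with the same (illustrative) process id and timestamp.
OLD_BOUNDARY = "-----=====-----2584--1124229602-----"
NEW_BOUNDARY = "----BugMail----2584--1124229602-----"


def survives_quoted_printable(boundary: str) -> bool:
    """True if the MIME delimiter line is unchanged by quoted-printable encoding."""
    delimiter = ("--" + boundary).encode("ascii")
    return quopri.encodestring(delimiter) == delimiter


# What the old delimiter line looks like once the whole body has been encoded.
encoded_old = quopri.encodestring(("--" + OLD_BOUNDARY).encode("ascii")).decode("ascii")

print(survives_quoted_printable(OLD_BOUNDARY))
print(survives_quoted_printable(NEW_BOUNDARY))
print(encoded_old)
```

Quoted-printable rewrites every "=" as "=3D", so the delimiter lines in the body no longer match the boundary declared in the header and the mail client finds no parts. This only explains why the message displays again with the new boundary. The part bodies are presumably still encoded without their own Content-Transfer-Encoding headers, which would match the corrupted characters.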
Assignee | ||
Updated•19 years ago
|
Flags: blocking2.20.1?
Updated•19 years ago
|
Flags: blocking2.22?
Comment 13•19 years ago
|
||
another way to fix this is to call encode_message() for each of the individual mime parts, then join them together with mime boundaries and a normal message header.
this way the entire message will be 7-bit clean (as it'll already be encoded) so encode_message won't mangle the boundaries.
Comment 14•19 years ago
|
||
That'd mean the callsite were responsible to do the encoding.
How about we let the callsite hand over the parts individually, and BugMail.pm does the encoding and boundary-ing?
Updated•19 years ago
|
Flags: blocking2.22?
Flags: blocking2.22+
Flags: blocking2.20.1+
Target Milestone: Bugzilla 2.22 → Bugzilla 2.20
Comment 15•19 years ago
|
||
<bkor> justdave: bug 304885 is not a 2.20.1 blocker -- 2.20 does not have the utf8 parameter or the bugmail stuff to encode header/body for utf-8 (breaking whine)
Flags: blocking2.20.1+
Target Milestone: Bugzilla 2.20 → Bugzilla 2.22
Assignee | ||
Comment 16•19 years ago
|
||
I have started to work on this. If I don't have anything in a week I'll give it back to nobody.
Assignee: nobody → karl
Assignee | ||
Comment 17•19 years ago
|
||
OK, here we go...
The first thing I do is remove the = characters from the whine mail boundary, as they do freak out the encoding.
(In reply to comment #14)
> That'd mean the callsite were responsible to do the encoding.
> How about we let the callsite hand over the parts individually, and BugMail.pm
> does the encoding and boundary-ing?
(In reply to comment #13)
> another way to fix this is to call encode_message() for each of the individual
> mime parts, then join them together with mime boundaries and a normal message
> header.
Both of those methods would require a good bit of additional work on the part of the code creating the message. It would be nice if it would be as simple as calling MessageToMTA and letting BugMail do the work, as it partially does now. That's the idea I went with, and here is what I have:
First, I now pass the entire message into encode_message, and the entire message is now parsed by MIME::Parser. This takes care of recognizing and parsing all parts of the message. The code responsible for the actual encoding is now in a function called encode_message_entity, which takes a MIME::Entity as its parameter (and returns the same). encode_message receives the entity (which contains the newly-encoded data), extracts the header & body, and returns them as before.
encode_message_entity contains much of the code from encode_message, with little change. There are, however, a few notable changes:
* If a multipart message is detected, extract each part as its own entity and call encode_message_entity, instead of trying to examine a body that is not going to exist (multipart messages don't have bodies, just parts).
* Do not try to set the content-type or charset on parts that contain no body
* Do not do any actual encoding. Instead, just set the appropriate header and MIME::Tools will handle the encoding for us (since we have given it a body, not just headers)!
I tested this by creating a whining event that contained many interesting characters (ü, é, â, etc.) in both the subject & body. In all cases (with quoted-printable and base64 encoding) my mailer (Apple Mail) was able to decode and display the message. Of course, this still needs additional testing 8-)
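For readers who want to see the shape of the recursive approach, here is a rough sketch in Python, with the standard email package standing in for MIME::Parser and MIME::Entity. It is not the patch: encode_entity and the sample message are hypothetical. Unlike the patch, which only sets the header and lets MIME::Tools do the work, this sketch encodes the body by hand. It also always picks quoted-printable and does not touch non-ASCII headers such as Subject.

```python
import email
import quopri
from email.message import Message


def encode_entity(entity: Message) -> Message:
    """Recursively make every leaf part 7-bit clean, one part at a time."""
    if entity.is_multipart():
        # A multipart entity has no body of its own, only parts.
        for part in entity.get_payload():
            encode_entity(part)
        return entity

    raw = entity.get_payload(decode=True)  # body as bytes
    if not raw or all(byte < 0x80 for byte in raw):
        # Empty or already 7-bit clean: leave the headers alone.
        return entity

    # Declare charset and encoding on this part, then encode its body.
    entity.set_param("charset", "UTF-8")
    del entity["Content-Transfer-Encoding"]
    entity["Content-Transfer-Encoding"] = "quoted-printable"
    entity.set_payload(quopri.encodestring(raw).decode("ascii"))
    return entity


# Hypothetical whine-style message with UTF-8 text in both alternatives.
raw_message = "\n".join([
    "Subject: [Bugzilla] Bugzilla Status",
    "MIME-Version: 1.0",
    'Content-Type: multipart/alternative; boundary="BOUNDARY"',
    "",
    "--BOUNDARY",
    "Content-Type: text/plain",
    "",
    "Résumé: 3 bogues ouverts",
    "--BOUNDARY",
    "Content-Type: text/html",
    "",
    "<p>Résumé: 3 bogues ouverts</p>",
    "--BOUNDARY--",
    "",
]).encode("utf-8")

message = encode_entity(email.message_from_bytes(raw_message))
wire_bytes = message.as_bytes()
print(wire_bytes.decode("ascii"))
```

Because each part carries its own charset and Content-Transfer-Encoding, the boundary lines are never run through the encoder and the top-level multipart headers stay untouched.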
Attachment #203747 -
Flags: review?
Assignee | ||
Updated•19 years ago
|
Status: NEW → ASSIGNED
Comment 18•19 years ago
|
||
Comment on attachment 203747
Patch v1
r=glob
very nice, good job :)
nits (can be fixed on checkin):
> # read header into MIME::Entity
this comment is no longer accurate - probably best to delete it
>+ foreach my $part ($entity->parts) {
>+ my $newpart = encode_message_entity($part);
>+ push @$newparts, $newpart;
indentation
Attachment #203747 -
Flags: review? → review+
Assignee | ||
Updated•19 years ago
|
Flags: approval?
Updated•19 years ago
|
Flags: approval? → approval+
Assignee | ||
Comment 19•19 years ago
|
||
Assignee | ||
Comment 20•19 years ago
|
||
Checking in whine.pl;
/cvsroot/mozilla/webtools/bugzilla/whine.pl,v <-- whine.pl
new revision: 1.20; previous revision: 1.19
done
Checking in Bugzilla/BugMail.pm;
/cvsroot/mozilla/webtools/bugzilla/Bugzilla/BugMail.pm,v <-- BugMail.pm
new revision: 1.61; previous revision: 1.60
done
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED