Closed Bug 723944 Opened 13 years ago Closed 13 years ago

Plain-text only emails are mangled when they contain non-ASCII characters

Categories

(Bugzilla :: Email Notifications, defect)

defect
Not set
major

Tracking

()

RESOLVED FIXED
Bugzilla 4.2

People

(Reporter: philb, Assigned: LpSolit)

References

Details

(Keywords: regression)

Attachments

(2 files)

User Agent: Opera/9.80 (X11; Linux x86_64; U; en-GB) Presto/2.10.229 Version/11.61 Steps to reproduce: Have your preferences set to plain-text emails only Create a bug, or comment on a bug In the description/comment, include non-ASCII characters (e.g. "┌────") Actual results: No email is sent Expected results: Email should have been sent. Note that if HTML email is chosen, it is sent, along with the plain-text version. It's only when you have plain-text-only that nothing is sent.
This is in 4.2rc2.
utf8 param is on, and this worked fine in 3.6.
I tested with 4.3 using SMTP. Emails are sent when the email format is plain text, but the content is total garbage. For instance: HTML format: testостанов] - неожиданное падение plain text only: testоÑтанов] - неожиданное падение I assume this is a different symptom of the same root cause, so marking as blocker.
Severity: normal → major
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: blocking4.2+
Target Milestone: --- → Bugzilla 4.2
Phil, did you try with SMTP too, or did you try with Sendmail?
> Phil, did you try with SMTP too, or did you try with Sendmail? My config is SMTP (with jobqueue). Didn't try sendmail.
This is a regression due to bug 589128. When the user pref has been implemented, multipart/alternative has been replaced by $parts[0]->content_type, i.e. text/plain, for text only bugmails. If I set the content-type back to multipart/alternative, the display is fine again. Phil, could you test something for me? In the file Bugzilla/BugMail.pm at line 406, replace if (scalar(@parts) == 1) { by if (0 && scalar(@parts) == 1) { Does this fix the problem for you? If yes, then this is indeed the same root cause.
Depends on: 589128
Keywords: regression
Ah, I found something interesting. In MessageToMTA(), we call $email->walk_parts to fix the encoding, but only if the Content-Type header doesn't define the charset itself. This is why we do this check: my $content_type = $part->content_type || ''; if ($content_type !~ /;/) { So I checked the value of $content_type with text only emails and with text+HTML emails: Text only: Content-Type: text/plain; charset="us-ascii" text+HTML: Content-Type text/plain Content-Type text/html So with text+HTML emails, the Content-Type header has no charset defined, and we do the encoding as expected. But in the case of text only emails, charset="us-ascii" is present in the Content-Type header and so we skip the encoding, leading to the garbage I see.
The problem comes from: $email->content_type_set($parts[0]->content_type); $parts[0]->content_type correctly returns 'text/plain', but right after this call, $email->content_type returns text/plain; charset="us-ascii". So for some unknown reason, $email->content_type_set appends charset="us-ascii" despite we didn't tell it to do so. @rjbs: is this behavior expected?
> Does this fix the problem for you? If yes, then this is indeed the same root > cause. Yes, that fixes it.
my patch on bug 714724 fixes this by removing the walk_parts hack and setting the headers correctly before passing the message to Email::MIME. do you think that's something we could target for 4.2?
(In reply to Byron Jones ‹:glob› from comment #10) > do you think that's something we could target for 4.2? I'm not sure. This is a pretty major change, and we are very close from the final release of 4.2. Your patch could have unexpected side-effects, especially with those not using utf8 or with MTAs which do not support 8bit correctly, and I wouldn't want to deal with such problems for 4.2.
Attached patch patch, v1Splinter Review
Here is a workaround which seems safer for 4.2. For 4.4, we can go with what glob suggested.
Assignee: email-notifications → LpSolit
Status: NEW → ASSIGNED
Attachment #597414 - Flags: review?(glob)
Comment on attachment 597414 [details] [diff] [review] patch, v1 r=glob kludge, but it works.
Attachment #597414 - Flags: review?(glob) → review+
Flags: approval4.2+
Flags: approval+
Committing to: bzr+ssh://lpsolit%40gmail.com@bzr.mozilla.org/bugzilla/trunk/ modified Bugzilla/Mailer.pm Committed revision 8114. Committing to: bzr+ssh://lpsolit%40gmail.com@bzr.mozilla.org/bugzilla/4.2/ modified Bugzilla/Mailer.pm Committed revision 8029.
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Summary: Plain-text email not sent when it contains non-ASCII chars → Plain-text only emails are mangled when they contain non-ASCII characters
I apologize if I'm making a superfluous comment, but are values in headers also being encoded properly? I'm examining today whether I need to add to our 3.6-based version what we have in our 2.22-based version, because we put a user's name in the From: of outgoing emails: sub encoded_name { my $self = shift; if (!Param('utf8') || is_7bit_clean($self->{name})) { return $self->{name}; } else { my $encoded = encode_qp($self->{name}, ''); return "=?UTF-8?Q?${encoded}?="; } } I'm thinking about adding an 'qp_encoded' filter to handle this, but I'm also wondering whether other header fields should be encoded too.
(In reply to David Marshall from comment #15) > I apologize if I'm making a superfluous comment, but are values in headers > also being encoded properly? I'm examining today whether I need to add to > our 3.6-based version what we have in our 2.22-based version, because we put > a user's name in the From: of outgoing emails: > > sub encoded_name { > my $self = shift; > > if (!Param('utf8') || is_7bit_clean($self->{name})) { > return $self->{name}; > } else { > my $encoded = encode_qp($self->{name}, ''); > return "=?UTF-8?Q?${encoded}?="; > } > } > > I'm thinking about adding an 'qp_encoded' filter to handle this, but I'm > also wondering whether other header fields should be encoded too. Yikes, never mind. There really is a lot of difference between 2.22 and 3.6
This is not fixed in current 4.2. No plain-text emails with non-ASCII chars are being sent in my install. text+HTML format email is better in that it now has correct charset: Content-Type: text/plain; charset="UTF-8"
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(In reply to Phil Barrett from comment #17) > No plain-text emails with non-ASCII chars are being sent in my install. If you disable JobQueue, do things work correctly? Also, with JobQueue disabled, could you enable the smtp_debug parameter to check where the problem is?
> If you disable JobQueue, do things work correctly? Yes. > Also, with JobQueue > disabled, could you enable the smtp_debug parameter to check where the > problem is? I'll attach the log. I can't see anything obviously wrong.
Attached file Log for comment 19
(In reply to Phil Barrett from comment #19) > > If you disable JobQueue, do things work correctly? > > Yes. OK, so this problem is fixed when not using JobQueue. Could you file a separate bug about JobQueue?
Status: REOPENED → RESOLVED
Closed: 13 years ago13 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: