Closed Bug 374424 Opened 17 years ago Closed 16 years ago

unnecessary line breaks and excessive indentation in email subject lines

Categories

(Bugzilla :: Email Notifications, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
Bugzilla 3.0

People

(Reporter: bmo, Assigned: mkanat)

References

Details

(Keywords: regression)

Attachments

(4 files, 1 obsolete file)

if a bug's summary has an apostrophe in it (like this one) emails generated by it will have the Subject: line split by a newline.
Summary: apostrphe's in bug summaries lead to line breaks in email subject lines → apostrophe's in bug summaries lead to line breaks in email subject lines
Yup, whadaya know...  at least it indents it so it doesn't break the header.

Date: Sun, 18 Mar 2007 19:46:02 -0700
From: bugzilla-daemon@mozilla.org
To: justdave@mozilla.com
Subject: [Bug 374424] New: apostrphe'
 s in bug summaries lead to line breaks in email subject lines
X-Bugzilla-Reason: AssignedTo
Assignee: justdave → email-notifications
Component: Bugzilla: Other b.m.o Issues → Email Notifications
Product: mozilla.org → Bugzilla
QA Contact: reed → default-qa
Summary: apostrophe's in bug summaries lead to line breaks in email subject lines → apostrophes in bug summaries lead to line breaks in email subject lines
Target Milestone: --- → Bugzilla 3.0
Version: other → 3.0
Flags: blocking3.0+
bmo's relevant mail settings:

mail_delivery_method: Sendmail
sendmailnow: On
(In reply to comment #2)
> mail_delivery_method: Sendmail
> sendmailnow: On

i use qmail as MTA, same setting and i got:
Subject: [Bug 1] test ' test ' test  ' test
This is definitely a problem for my client.
hrm..  even with the apostrophe removed from the summary for this bug, i'm still seeing an unnecessary line break that wasn't there in older bugzilla versions.  the apostrophe makes long lines break at it, but it looks like long summaries in general cause broken headers.

victory: can you retry your test with a longer summary?
From what I heard on IRC, newlines in the subject are valid.
(In reply to comment #6)
> From what I heard on IRC, newlines in the subject are valid.

It sure is breaking mail clients, though (such as mine). Do you have some paragraph from an RFC that says it is ok to have newlines in the subject?
I guess section 3.1.3 is the section to read: http://www.ietf.org/rfc/rfc0822.txt
Newlines in the subject are certainly valid. However, apostrophes shouldn't cause the whole subject to break there, unless there's something about that in the RFC. This sounds like a bug in Email::Send or one of the other Email modules. I've emailed their maintainer.
I talked to the Email:: maintainer. It's not Email::MIME or Email::Simple doing it. He wrote a test and can't reproduce it.

It's proper that header lines are broken after 72 (or maybe 80?) characters, that's RFC-standard behavior. It's not proper that they're broken randomly on any apostrophe.

My current suspect is Email::Send::Sendmail--something having to do with the Sendmail interface.

It only affects Sylpheed poorly, it seems, and other clients just get a space in the subject line. So this isn't a blocker, but still something we should look into.
Flags: blocking3.0+ → blocking3.0-
Assignee: email-notifications → mkanat
I am the maintainer of Email::(LotsOfStuff).  I sent this to Max Kanat-Alexander, but now I'm saying it here.

In Email::Send::Sendmail, the sending code is this:                             
                                                                                
    open $pipe, "| $mailer -t -oi @args"                                        
        or return failure "Error executing $mailer: $!";                        
    print $pipe $message->as_string                                             
        or return failure "Error printing via pipe to $mailer: $!";             
    close $pipe                                                                 
        or return failure "error when closing pipe to $mailer: $!";             
                                                                                
In other words, Email::Send::Sendmail can't break the message, because nothing  
munges the output of as_string, and we've established that as_string is okay.   
Nothing else in Email::Send::Sendmail looks at the message object.              
                                                                                
As for Email::Send itself, it's pretty hard to imagine what there could be      
doing anything.  If a message_modifier argument was supplied, of course, that   
could be it... but it's such a rarely-used option that I'd be surprised if you  
were using it in there anywhere.  If you are using a very old Email::Send or    
Email::Abstract, perhaps there is some weird ancient bug.  Still, it seems hard 
to imagine.                                                                     

If you tell me where the offending code is -- I looked at the LXR for Bugzilla, but didn't see anything promising -- I can see what I can tell.
Ricardo wrote again. He says:

--------------------------
I saw your update to the bug, and I thought I'd chime in.  (I should probably
get a login for your bugzilla install and comment, but I'm lazy and using
anything interactive over my cell modem is... painful.)

In Email::Send::Sendmail, the sending code is this:

    open $pipe, "| $mailer -t -oi @args"
        or return failure "Error executing $mailer: $!";
    print $pipe $message->as_string
        or return failure "Error printing via pipe to $mailer: $!";
    close $pipe
        or return failure "error when closing pipe to $mailer: $!";

In other words, Email::Send::Sendmail can't break the message, because nothing
munges the output of as_string, and we've established that as_string is okay.
Nothing else in Email::Send::Sendmail looks at the message object.

As for Email::Send itself, it's pretty hard to imagine what there could be
doing anything.  If a message_modifier argument was supplied, of course, that
could be it... but it's such a rarely-used option that I'd be surprised if you
were using it in there anywhere.  If you are using a very old Email::Send or
Email::Abstract, perhaps there is some weird ancient bug.  Still, it seems hard
to imagine.

Good luck.
--------------------------
the apostrophe behavior and general strangeness with long summaries also affects windows eudora users negatively.  here's how this bug currently comes to me:
--
Subject: [Bug 374424]
 apostrophes in bug summaries lead to line breaks in email subject lines
--
this means that in my folder summary view, i only see the bug number and not any other part of the subject.  this is new behavior since the recent b.m.o upgrade so i'd consider it a regression
Keywords: regression
(In reply to comment #13)
> Subject: [Bug 374424]
>  apostrophes in bug summaries lead to line breaks in email subject lines

  That's actually a perfectly valid subject line. Eudora is failing to handle wrapped headers, a standard part of email. In fact, headers longer than 78 characters are an RFC violation. You might want to write them and let them know that you're experiencing this problem, and see what they have to say.
What RFC do you think is violated by long header fields?  Header folding is presented as an option to improve human-readability, and is not mandated.  (RFC 822, 3.4.8)

78 is only recommended as a length, and even then only in RFC 2822.  The hard limit for headers is 998 characters, again only in RFC 2822, 3.5.

If the problem here is header wrapping being done somewhere in the code, you could just turn it off.  Pretty much nobody will be affected.
(In reply to comment #15)
> What RFC do you think is violated by long header fields? [snip]

  Oh, oops! :-) Sorry. I thought it was a MUST, not a MAY.

> If the problem here is header wrapping being done somewhere in the code, you
> could just turn it off.  Pretty much nobody will be affected.

  It's not being done by Bugzilla anywhere, certainly. If you're curious about our mailer code, it's here:

  http://lxr.mozilla.org/mozilla/source/webtools/bugzilla/Bugzilla/Mailer.pm

  And the template for the mail that's being sent:

  http://lxr.mozilla.org/mozilla/source/webtools/bugzilla/template/en/default/email/newchangedmail.txt.tmpl

I think the best first step here is going to be to dump $msg when MessageToMTA is called, then dump $email->as_string as it goes along.  Without a recent Bugzilla installation to play with, I leave this to you guys.  Let me know if you narrow it down, and to where.
(In reply to comment #14)
> Eudora is failing to handle
> wrapped headers

this may be, but bugzilla users weren't negatively affected by this until recently.

> You might want to write them and let them know
> that you're experiencing this problem, and see what they have to say.

what do you think that would accomplish?  would that speed them along in their quest for their codebase conversion? :P

http://www.mozilla.com/en-US/press/mozilla-2006-10-11.html
The culprit is Encode::encode(), see Bugzilla::Mailer at line 69 (as of today's tip version of the file):

    # Encode the headers correctly in quoted-printable
    foreach my $header qw(From To Cc Reply-To Sender Errors-To Subject) {
        if (my $value = $email->header($header)) {
            my $encoded = encode('MIME-Q', $value);
            $email->header_set($header, $encoded);
        }
    }

$value contains no newline, but $encoded has one. It's easy to check by adding some debug lines here. So Email::* is fine; blame encode().
That makes this essentially a bug in perl itself, since Encode ships with Perl. Drat.
This is not a bug in perl, despite what you say, for a number of reasons.  The most important one is that Encode is "dual-lived," meaning that it is available apart from new releases of perl.  When this bug is fixed by Dan Kogai, the Encode maintainer, it can be released to the CPAN as a new version, which can then be required by Bugzilla.

In the meantime, you can probably use MIME::Words, as follows:

Instead of:

  my $string = "[Bug 374424] New: apostrphe's in bug summaries lead to line breaks in email subject lines";
  my $encoded = encode('MIME-Header', $string);
  print "$encoded\n";

You'd write:

  my $string = "[Bug 374424] New: apostrphe's in bug summaries lead to line breaks in email subject lines";
  my $encoded = encode_mimewords("[Bug 374424] New: apostrphe's in bug summaries lead to line breaks in email subject lines");
  print "$encoded\n";

having already "use MIME::Words".
I have created a bug in the Encode RT.

  http://rt.cpan.org//Ticket/Display.html?id=27395
Thank you, Ricardo! :-) Thank you for the MIME::Words solution!! :-) I was looking for what module to use, and I didn't see one in core Perl that I could easily replace Encode with.

I imagine that one of the modules we require already pulls in MIME::Words, since it seems to be installed on my testing servers. :-) I'll make sure of that, though (unless you can confirm it).
(In reply to bug 397414 comment #2)
> This behavior (odd newlines seems to have begun in June

When we upgraded to 3.0.1?

> but the additional
> space characters began just in the last week or 2 (or 3).  
> It seems to have suddently gotten MUCH worse today.  

See bug 365713 comment 38.  We just upgraded the Email::* modules to the
current release versions in response to troubleshooting requests on that bug.
Apostrophes are not needed to experience this problem.  
See the many examples in bug 397414

The problems are
a) unnecessary new lines, and 
b) more than one space character used to indent subject continuation line.
Summary: apostrophes in bug summaries lead to line breaks in email subject lines → unnecessary line breaks and excessive indentation in email subject lines
> b) more than one space character used to indent subject continuation line.

And especially this violates RfC 2822, section 2.2.3 (RfC 822, section 3.1.1, is inconsistent, since folding and unfolding may change the value there).
(In reply to comment #23)
> I imagine that one of the modules we require already pulls in MIME::Words,
> since it seems to be installed on my testing servers.

No, MIME::Words is part of MIME-tools, which is an optional module:

The following Perl modules are optional:
...
Checking for          MIME-tools (v5.406)  ok: found v5.420

We should really fix the Encode() problem I reported in comment 19. Everyone using b.m.o has seen how bad email subjects now are. Not only this breaks threading in Thunderbird and probably in some other mail clients, but it also generates weird subjects: "don' t", "whine. pl", etc...
Flags: blocking3.1.3?
Flags: blocking3.0.3?
Would this fix the problem? I know MIME-tools is an optional module, but this is only to know where to go.
Comment on attachment 283842 [details] [diff] [review]
possible fix (assuming MIME-tools would be mandatory)

You know, just do an is_7bit_clean check on each header, and that way we won't encode them if they're just ASCII, right? That would solve most problems.
Flags: blocking3.1.3?
Flags: blocking3.1.3+
Flags: blocking3.0.3?
Flags: blocking3.0.3+
Attached file For rjbs: Input Email
Here is what Bugzilla sends to Email::MIME.
Attached file For rjbs: Result Email
And here's what Email::MIME produces on the call to as_string.

I have Email::MIME 1.861 and Email::MIME::Modifier 1.442.
Attached patch v1Splinter Review
Um, duh. I don't know why we didn't think of this immediately.
Attachment #283842 - Attachment is obsolete: true
Attachment #290163 - Flags: review?(LpSolit)
Status: NEW → ASSIGNED
Comment on attachment 290163 [details] [diff] [review]
v1

Even if I mostly understand what you try to do, I'm not confident enough to review it, i.e. saying that it's the right thing to do.

I will let wurblzap review it as he reviewed the original patch about utf8.
Attachment #290163 - Flags: review?(LpSolit) → review?(wurblzap)
(In reply to comment #35)
> (From update of attachment 290163 [details] [diff] [review])
> Even if I mostly understand what you try to do, I'm not confident enough to
> review it, i.e. saying that it's the right thing to do.
> 
> I will let wurblzap review it as he reviewed the original patch about utf8.
> 

Err... sorry, I was talking about another bug. I put this comment in the wrong tab. :)
Comment on attachment 290163 [details] [diff] [review]
v1

r=LpSolit assuming it's legal to have very long lines for email subjects.
Attachment #290163 - Flags: review?(wurblzap) → review+
http://www.rfc-editor.org/rfc/rfc2822.txt has a 998 character limit before you must "fold" the line.  It also mentions that a 78 character limit may be imposed by some clients.
Yep, I saw this 998 character limit after commenting. So we are fine as bug summaries are limited to 255 characters (varchar(255)).
Flags: approval?
Flags: approval3.0?
tip:

Checking in Bugzilla/Mailer.pm;
/cvsroot/mozilla/webtools/bugzilla/Bugzilla/Mailer.pm,v  <--  Mailer.pm
new revision: 1.15; previous revision: 1.14
done

3.0:

Checking in Bugzilla/Mailer.pm;
/cvsroot/mozilla/webtools/bugzilla/Bugzilla/Mailer.pm,v  <--  Mailer.pm
new revision: 1.7.2.6; previous revision: 1.7.2.5
done
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Flags: approval?
Flags: approval3.0?
Flags: approval3.0+
Flags: approval+
Resolution: --- → FIXED
Keywords: relnote
The committed fix does not appear to be working for the case where the extra linebreak is being introduced after an apostrophe.  Part of the reason is that the regex doesn't match since only a single space is added after the newline, instead of two.  (A linebreak followed by two spaces occur only when the linebreak was inserted by Encode at a place where there was just one space previously.)

My proposed change is this:

$encoded =~ s/\n //g;
Jeff, thanks for pointing this out. RESOLVED bugs are not on anybody's radar, though, and additional fixes won't happen here. Please file a separate bug.
reopening instead.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attachment #296186 - Attachment mime type: application/octet-stream → text/plain
My two cents, take it or leave it: if a bug led to a patch that landed and the bug was closed, even if the patch was faulty, unless it is to be backed out and there are no merge conflicts, filing a new bug is almost always better in my experience.

Whenever I've succumbed to the temptation not to file a new bug, but reopen or keep open the old one, I've regretted it sooner or later -- if only for auditing hygiene of separate bug numbers for separate committed patches, but also because it keeps bugs shorter and separates concerns spread across time, where thinking on the bug as well as the "right" fix often shifts.

HTH,

/be
In method 3 of my test case script, using 998 instead of 1024 might be better according to comment 38.
Please do not reopen this bug, as mentioned in comments 42 and 45. File a separate bug instead with your testcase attached.
Status: REOPENED → RESOLVED
Closed: 17 years ago16 years ago
Resolution: --- → FIXED
Blocks: 411544
Ok thanks, I've opened bug 411544 to track the improvement.
I forget if we relnoted this for 3.0.3, but that release is long past now, anyway.
Keywords: relnote
(In reply to comment #49)
> I forget if we relnoted this for 3.0.3, but that release is long past now,
> anyway.

We relnoted it for 3.0.4, see http://www.bugzilla.org/releases/3.0.4/release-notes.html#v30_point
You need to log in before you can comment on or make changes to this bug.