Closed Bug 273884 Opened 18 years ago Closed 8 years ago

Enclosing double quotes are shown in recipient column, messes up sorting of name/address

Categories

(Thunderbird :: Mail Window Front End, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
Thunderbird 29.0

People

(Reporter: bill+mozilla, Unassigned)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [fixed by bug 842632])

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.7.5) Gecko/20041110 Firefox/1.0
Build Identifier: 

In the Recipient column, enclosing double quotes around the human name, if
present in the original RFC2822 message header, are included, incorrectly, in
the e-mail address

The problem is this: If you have messages like this:

Message 1:

    To: "Joe Bloggs" <joe.bloggs@somecorp.com>

Message 2:

    To: Joe Bloggs <joe.bloggs@somecorp.com>

Then the first one appears in the list as

    "Joe Bloggs"

and the second one appears as

    Joe Bloggs

The pragmatic problem is that if you sort the list by recipient, then the two
sets of messages to Joe Bloggs appear in differenty places in the list.

The quotes are only there to quote the human name for the purposes of RFC2822
encoding.  These two forms are in fact equivalent, and in both cases the human
readable name is:

    Joe Bloggs

I think that Thunderbird should parse a double-quoted human name to extract the
contents of the quoted string, unravelling any encoding which would thereby be
supported by that quoting mechanism, e.g. if there was a header record:

    To: "Joe \"The Kid\" Bloggs" <joe.bloggs@somecorp.com>

then I would expect to see:

    Joe "The Kid" Bloggs

shown in the "Recipient" column.

Additionally I should point out that, in my experience, Microsoft Outlook
Express gets this "right", and correctly processes quoted human names to return
the actual text of the human name.

That is not to say that everything Microsoft Outlook does is right.

What I am saying is that In My Opinion, the correct interpretation of RFC2822 is
that the quotes are there as an encapsulation mechanism and are not part of the
human readable name per se.

Reproducible: Always
Steps to Reproduce:
Use an IMAP system with both Thunderbird and Outlook Express, and send some mail
to a person whose human name doesn't need quoting with both e-mail programs.

Thunderbird doesn't quote human names which don't need quoting.  Outlook Express
always quotes.
Actual Results:  
You end up with two different, but equivalent forms of the same address in your
Sent / Sent Items folder.

Thunderbird displays these forms differently, and therefore in separate places
in a sorted list.

Expected Results:  
Thunderbird should interpret quoted human names as quoted human names, and
unravel the quoting to return the actual human name.

This would then mean that different, but equivalent forms would be treated the
same.  They would then appear identically in the list.

Sorting the list would then put all the equivalent forms together.
Summary: Enclosing double quotes are shown in on e-mail address columns, messes up sorting → Enclosing double quotes are shown in e-mail address columns, messes up sorting
Oddly, this is true only in the Recipient column, not in the Sender column.

Same problem seen in Moz 1.8a6-1130.

xref bug 189928.
Status: UNCONFIRMED → NEW
Ever confirmed: true
OS: Windows XP → Windows 2000
Summary: Enclosing double quotes are shown in e-mail address columns, messes up sorting → Enclosing double quotes are shown in recipient column, messes up sorting
> Oddly, this is true only in the Recipient column, not in the Sender column.

Confirmed.

The Sender column will be the address column which most people look at because
most people will spend most of their time looking at the Inbox and will be
interested in the Sender.

It is almost as if the Sender column has had more "programmer attention" brought
to bear on it, and the Recipient column has been left to rot.

Am I going to have to fix this myself and submit a patch?

I would rather not because thus far I haven't been in the Mozilla code and I can
imagine it being a lot of context to pick up, when for someone "in the know" it
should be a five minute fix.
(In reply to comment #2)
> Am I going to have to fix this myself and submit a patch?
> 
> I would rather not because thus far I haven't been in the Mozilla code and I
> can imagine it being a lot of context to pick up, when for someone "in the
> know" it should be a five minute fix.

It's always possible a bit of "low-hanging fruit" like this one may be will 
catch a developer's attention, but I wouldn't hold my breath.

But if you can figure out the enormous & Byzantine codebase, it's worth it -- 
because then you can fix *more* bugs!
Severity: normal → minor
I do have a fix for this - I just need to make sure it doesn't break anything else.
See bug 227723: escaped parentheses appear in the Recip. column with the 
backslash displayed.
Seamonkey bug 163794; probably should be Core:MailNews Backend, and duped.
It seems to be half a dupe of that bug. But that bug seems to just say that
sorting should ignore the quotes, while this bug is also about not displaying
unescaped quotes.
So we seem to have a number of different symptoms, all related to logical
handling of display-names:
- Don't display unescaped quotes (bug 273884)
- Don't display the backslash escape character (bug 227723)
- Ignore quotes when sorting (bug 163794)
- Display (unescaped?) quotes when string is being used as an address (bug 189928)

Maybe we need a meta? ;)
> So we seem to have a number of different symptoms, all related to
> logical handling of display-names:

> - Don't display the backslash escape character (bug 227723)
> - Ignore quotes when sorting (bug 163794)
> - Display (unescaped?) quotes when string is being used as an address
>   (bug 189928)

This bug takes a view which encompasses those bugs.
This bug has been around for a while.  Apparently a fix exists but it looks like
it hasn't been applied to the main tree (or whatever it needs to be applied to).

What's the hold up?

Is it because this column isn't shown by default in the Inbox, so the relevant
developers don't get to eat it in their dogfood?

I'm sure if anyone who did have the power to apply this did have this column
enabled in their daily-use mail reader, they'd get this fix applied up double quick.
OS: Windows 2000 → All
Hardware: PC → All
*** Bug 356252 has been marked as a duplicate of this bug. ***
Duplicate of this bug: 365379
In addition to stripping off the unneeded double quotes, I would also like to see extraneous SINGLE quotes (apostrophes) removed from around a display name.  Most of my co-workers use Outlook, and I get email from them all the time with addresses like this:  'Joe Bloggs' <joe@example.com> -- or even like this:  "'Joe Bloggs'" <joe@example.com>

The completely unnecessary single quotes end up in my address book (first name starts with an apostrophe, last name ends with an apostrophe).  This disrupts the sorting of entries in my address book, and sometimes it leads to duplicate entries (one with the extra punctuation, another without).

Let's just make sure, of course, that double quotes are reinserted at new message composition time where required by RFC 2822 -- e.g., if the display name happens to contain a comma.
Duplicate of this bug: 363673
Voting for this one as it's useful when clearing out cruft from various folders, especially Sent.
Does anybody know if anybody has done anything for fixing that bug? I wonder why not if it's been for 2 years now..
Duplicate of this bug: 378773
QA Contact: front-end
And even in latest version (2.0.0.4) not fixed. If you have lots of mails, this bug is really a bit annoying.
I've just had a look at nsMsgHeaderParser.cpp. It calls a C function msg_parse_Header_addresses(line, names, addresses) to split the addresses into names and mailboxes. Well, this function has a parameter named quote_names_p which is defaulted to true. If you set this to false then indeed, the name part of 
"Joe \"The Kid\" Bloggs" <joe.bloggs@somecorp.com>
becomes Joe "The Kid" Bloggs
instead of "Joe \"The Kid\" Bloggs"
The problem lies in nsMsgHdr::BuildRecipientsFromArray. Its called with the unquoted names of recipients and then it uses nsMsgHeaderParser::MakeFullAddress to rebuild the quotes. What's the point of doing this?
I just commented out the following code section in nsMsgHdr::BuildRecipientsFromArray:

if (headerParser)
{
   nsCString fullAddress;
   ret = headerParser->MakeFullAddress(nsnull, curName, curAddress, 
                                       getter_Copies(fullAddress));
   if (NS_SUCCEEDED(ret) && !fullAddress.IsEmpty())
   {
      allRecipients += fullAddress;
      continue;
   }
}

It turns out that this solves the problem. I've tested with mutiple recipients - even "Edit As New..." works fine. Could someone with deeper insight have a look at this issue and tell me whether this change is good or bad?
Joerq, can you attach this as a patch and ask for review?  That will help get the feedback that is needed.

potential dup s: bug 312502, bug 163794
Assignee: mscott → nobody
Severity: minor → normal
Summary: Enclosing double quotes are shown in recipient column, messes up sorting → Enclosing double quotes are shown in recipient column, messes up sorting of name/address
(In reply to comment #19)
> It turns out that this solves the problem. I've tested with mutiple recipients
> - even "Edit As New..." works fine. Could someone with deeper insight have a
> look at this issue and tell me whether this change is good or bad?

Is nsMsgHdr::BuildRecipientsFromArray used for processing received email headers?

AFAIK the nsMsgHdr::BuildRecipientsFromArray function is used for constructing email addresses for sending - not display after reception.

Therefore, by dropping the generation of quotes from the function, you would be contravening the requirements of RFC 2822 (http://tools.ietf.org/html/rfc2822 for a quick look see section A.1.2) which requires quotes to be generated in various circumstances.

We know that the nsMsgHeaderParser::MakeFullAddress is generating quotes reasonably correctly (http://mxr.mozilla.org/seamonkey/source/mailnews/mime/test/unit/test_nsIMsgHeaderParser1.js) to RFC 2822 from its examples, so I think its more how we display them not send them.

If nsMsgHdr::BuildRecipientsFromArray is doing the view processing as well as the sending, then we need to be very careful how we change it.
Flags: wanted-thunderbird3?
Blocks: 163794
Blocks: 227723
this issue is still present for the recipient column on 

Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.2pre) Gecko/20090727 Shredder/3.0b4pre ID:20090727034746
The following applies to the Thunderbird source code for v2.0.0.23

The function "msg_extract_Header_address_names" in "mozilla/mailnews/mime/src/nsMsgHeaderParser.cpp," starting at line 1169 says: "The names are *unquoted* and therefore cannot be re-parsed in any way" but on line 1185 it calls

int status = msg_parse_Header_addresses(line, &names, &addrs);

which returns *quoted* names.


The function right below it, "msg_extract_Header_address_name," on line 1253 calls

int status = msg_parse_Header_addresses(line, &name, &addr, PR_FALSE, PR_FALSE, PR_TRUE);

which correctly returns *unquoted* names. So, I changed line 1185 to

int status = msg_parse_Header_addresses(line, &names, &addrs, PR_FALSE, PR_FALSE, PR_FALSE);

This works, so far.


And just for good measure I also changed the "msg_parse_Header_addresses" function to not only remove double quotes, but also single quotes:
line 670, change

if (*s == '\"' && !quote_names_p)
to
if ((*s == '\"' || *s == '\'') && !quote_names_p)


line 678, change

if (s < mailbox_start && (*s == '\\' || *s == '\"'))
to
if (s < mailbox_start && (*s == '\\' || *s == '\"' || *s == '\''))
Tateu, could you think about making a proper patch and request review and super review as described at https://developer.mozilla.org/en/comm-central ?
Assignee: nobody → tateu
Assignee: tateu → nobody
I feel compelled to emphasize what Mark Banner says above about the required syntax for display names in headers. As I read it, RFC 2822 (like its predecessor RFC 822) requires all but the simplest display names to be enclosed in double quotes or "quoted printable" encoded. Even spaces in the display name trigger this requirement.

Correct:
"John Doe" <joe@doe.org>

Incorrect:
John Doe <joe@doe.org>

No Thunderbird (through 3 beta 2) or Seamonkey (through 2.0) produces compliant headers as far as this requirement goes. This seems pretty egregious.

I absolutely agree that double quotes that *must appear* in the header are artifacts of the encoding and *should never* be displayed in the MUA except when viewing source.

I suggest giving this priority since it is about standards compliance.
(In reply to comment #25)
> As I read it, RFC 2822 (like its
> predecessor RFC 822) requires all but the simplest display names to be enclosed
> in double quotes or "quoted printable" encoded. Even spaces in the display name
> trigger this requirement.
> 
> Correct:
> "John Doe" <joe@doe.org>
> 
> Incorrect:
> John Doe <joe@doe.org>

You've read it wrong. RFC 5322 (which is the latest version, but I suspect RFC 2822 has the same) shows examples in appendix A.

Appendix A.1.1. shows an example message which has correct formatting:

----
From: John Doe <jdoe@machine.example>
To: Mary Smith <mary@example.net>
Subject: Saying Hello
Date: Fri, 21 Nov 1997 09:55:06 -0600
Message-ID: <1234@local.machine.example>

This is a message just to say hello.
So, "Hello".
----

Hence NO quotes are required for the simple display name case of John Doe.
Mark Banner's latest comment is right and mine is wrong. Sorry about that. I'm not sure I'm the only one who has been confused by this.

For those that might need additional persuasion, I now think the parse tree for the display name "Mary Smith" (without the quotes) looks like this:

[display-name
	[phrase
		[word
			[atom
				[atext "Mary"]
				[CFWS [FWS " "]]
			]
		]
		[word
			[atom
				[atext "Smith"]
			]
		]
	]
]
This is still current in icedove 3.1.13 and thunderbird 3.1.15. Any progress on this?
There are already some patch proposals here. Are they usable?

Also, can't the good code fixing this for Sender column be applied to the Recipient too?
Whiteboard: [patch love][has draft patch]
(In reply to :aceman from comment #29)
> There are already some patch proposals here. Are they usable?

can you try ?
 
> Also, can't the good code fixing this for Sender column be applied to the
> Recipient too?

Probably too :D
(In reply to David :Bienvenu from comment #4)
> I do have a fix for this - I just need to make sure it doesn't break
> anything else.

What happened to the fix? :)
As a side effect of my massive stabbing of nsIMsgHeaderParser for being an annoyingly bad interface, I have a very good chance of fixing this bug.
(In reply to Joshua Cranmer [:jcranmer] from comment #32)
> As a side effect of my massive stabbing of nsIMsgHeaderParser for being an
> annoyingly bad interface, I have a very good chance of fixing this bug.

Are there chances that this bug is being fixed before it becomes 10 years old? It is still present in Thunderbird Version 24.6.
This should be fixed in version 31 (see bug 842632).
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
After a quick test, I can confirm that Thunderbird 31 [1] is correctly displaying the double quotes.

I did not do any extensive testing and no real sorting though.

Thanks for fixing this issue!

Cheers,
Raoul

[1] Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.0 ID:20140715165256 CSet: 4ab5e7d55ce4
Whiteboard: [patch love][has draft patch] → [fixed by bug 842632]
Target Milestone: --- → Thunderbird 29.0
Flags: wanted-thunderbird3?
Duplicate of this bug: 599834
You need to log in before you can comment on or make changes to this bug.