Closed Bug 672424 Opened 13 years ago Closed 11 years ago

quoted-printable subject field not parsed

Categories

(Thunderbird :: Untriaged, defect)

All
Other
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: jamundso, Unassigned)

References

(Blocks 1 open bug)

Details

User Agent: Mozilla/5.0 (X11; Linux i686; rv:5.0) Gecko/20100101 Firefox/5.0
Build ID: 20110622105737

Steps to reproduce:

Scanned inbox.


Actual results:

Immediately noticed corrupt Subject.


Expected results:

Pretty subject.
Adherence to RFCs is a nice guideline to strive for, but frankly, Outlook, Gmail, and Evolution all display my use case just fine. The messages are generated by M$ Dynamics CRM, sadly.
Works fine for me with a "Subject: =?ISO-8859-1?Q?...?=" heading as received.
Can you provide an example of a correct encoding which is displayed wrong?
(In reply to comment #1)
> Works fine for me with a "Subject: =?ISO-8859-1?Q?...?=" heading as received.
> Can you provide an example of a correct encoding which is displayed wrong?

What you see there is wrong - only the ... between the last two ?'s should be displayed as the Subject, as that's what would have been entered.
That was the encoding as seen in View > Message Source; the Subject itself displays correctly.
There may be a subtle issue in the way that the messages showing wrong are encoded, hence I asked you to provide an example of such an encoded Subject. Given your somewhat snarky reference to RFCs in your initial description, I thought it was obvious to you what I'm talking about.
OK, touché.
The great thing is now I can't reproduce it either. I'll do some digging to see how the broken messages were created.
Duh, I remember now - it's just Subjects that violate RFC 2047, hence my earlier RFC comment.
For example...

Subject: =?us-ascii?Q?TECHP -- Aajax Paper Testing - Aberdeen, UT
 CASE#(CAS-01932-Z259) -- Lunch?=
We have around three or four other outstanding bugs on this (maybe--some of them may just be invalid characters in the charset); we should consolidate them into one bug and fix it as a result of the "be liberal in what you accept" maxim.
Whiteboard: dupeme
(Quoting bug 521238 comment #3 on a similar issue)
> <http://tools.ietf.org/html/rfc2047>
> 
> Encoded words may not contain embedded spaces, and they are considered as
> atom productions in message envelopes, so CWFS cannot appear in the middle
> of the word. So your header is invalid per RFC 2047.

Yes, so strictly speaking this is invalid (and the other bug was resolved as such). In the message I've looked at (which displays correctly), spaces were replaced by '_' characters [Section 4.2, (2)], which isn't the case in your example either (this example is a bit nonsensical anyway: "us-ascii" is 7-bit by definition and won't need quoted-printable encoding, and no 8-bit character is actually used).
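
For reference, the "Q" decoding rules in Section 4.2 are small enough to sketch. This is a minimal Python illustration, not Thunderbird's libmime code: decode_q is a hypothetical helper name, and it assumes the payload between the last two ?'s has already been extracted and is pure ASCII.

```python
import re


def decode_q(payload: str, charset: str) -> str:
    """Decode the payload of an RFC 2047 "Q" encoded-word (hypothetical helper)."""
    # Section 4.2 (2): "_" always stands for a space (0x20).
    # Doing this first keeps a literal underscore written as "=5F" intact.
    data = payload.replace("_", " ").encode("ascii")
    # Section 4.2 (1): "=XX" stands for the octet with hex value XX.
    data = re.sub(
        rb"=([0-9A-Fa-f]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        data,
    )
    # Interpret the resulting octets in the declared charset.
    return data.decode(charset)


print(decode_q("Caf=E9_au_lait", "iso-8859-1"))  # Café au lait
```

A correctly encoded subject therefore never needs a raw space inside the encoded-word; the sender writes '_' (or =20) instead.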

Plenty of bugs related to RFC2047 violations, I don't see a good candidate for a master bug to consolidate, thus adding Wayne and Ludo to the CC list.
(In reply to comment #7)
> Yes, so strictly speaking this is invalid (and the other bug resolved as
> such). Spaces in the message I've looked at (which displays correctly) were
> replaced by '_' characters [Section 4.2,(2)], which isn't the case in your
> example either (this example is a bit nonsense anyway, "us-ascii" is 7bit
> per definition and won't need quoted-printable encoding, and no 8-bit
> character is actually used).

The syntax may be invalid, but some snooping around for the last comment I saw indicates that we have had a few of these fly by over the years, and it appears that other email clients also accept whitespace, so people have used these over the years.
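
To make the "accept whitespace" behaviour concrete, here is a rough Python sketch of a strict match next to a lenient one, run against the Subject from the reporter's example. Everything in it is an assumption for illustration (the regexes, the helper names, handling only the "Q" encoding); it is not how libmime or any other client actually implements this.

```python
import re

# Strict RFC 2047 shape: no whitespace anywhere inside the encoded-word.
STRICT = re.compile(r"=\?([^?\s]+)\?([Qq])\?([^?\s]*)\?=")
# Lenient variant (assumption): whitespace is tolerated inside the payload.
LENIENT = re.compile(r"=\?([^?\s]+)\?([Qq])\?([^?]*)\?=")


def unfold(value: str) -> str:
    """Undo header folding: drop a line break that is followed by whitespace."""
    return re.sub(r"\r?\n(?=[ \t])", "", value)


def decode_q(payload: str, charset: str) -> str:
    """Q-decode an ASCII payload: "_" is a space, "=XX" is a hex octet."""
    data = payload.replace("_", " ").encode("ascii")
    data = re.sub(
        rb"=([0-9A-Fa-f]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        data,
    )
    return data.decode(charset)


def decode_subject_leniently(value: str) -> str:
    """Unfold first, then decode every encoded-word the lenient pattern finds."""
    unfolded = unfold(value)
    return LENIENT.sub(lambda m: decode_q(m.group(3), m.group(1)), unfolded)


subject = (
    "=?us-ascii?Q?TECHP -- Aajax Paper Testing - Aberdeen, UT\r\n"
    " CASE#(CAS-01932-Z259) -- Lunch?="
)

print(STRICT.search(unfold(subject)))     # None: the strict pattern rejects it
print(decode_subject_leniently(subject))
```

Unfolding before matching is what keeps the line break in the middle of the encoded-word from getting in the way. The sketch ignores the "B" encoding and the rule about whitespace between adjacent encoded-words.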

My memory is very fuzzy, but I think most of our RFC 2047 decoding is centralized outside the core craziness of libmime, so it should be a simplish fix. However, I also recall that libmime's architecture means that a newline in the middle of things can often cause problems.
(In reply to comment #7)
 
> Plenty of bugs related to RFC2047 violations, I don't see a good candidate
> for a master bug to consolidate, thus adding Wayne and Ludo to the CC list.

There's more than one issue. I'll create a meta and we'll build the tree of issues.
Whiteboard: dupeme
Blocks: RFC2047
Component: General → Untriaged
Do you still see this in version 17 or newer? 
If not, please close by setting status to resolved, and resolution to worksforme.
If it fails, please supply additional information.
Thanks
Whiteboard: [closeme 2013-04-20]
Resolved per whiteboard
Status: UNCONFIRMED → RESOLVED
Closed: 11 years ago
Resolution: --- → INCOMPLETE
Whiteboard: [closeme 2013-04-20]