Bug 659355 - Cannot open PDF attachment in some mails (Content-Type: =?windows-1252?q?application/pdf)
Status: VERIFIED FIXED
Whiteboard: [STR comment #15]
Product: MailNews Core
Classification: Components
Component: MIME
Version: Trunk
Platform: All All
Importance: -- normal with 35 votes
Target Milestone: Thunderbird 17.0
Assigned To: Kent James (:rkent)
Duplicates: 580971 685112 738284
Depends on: 503309
Blocks: tb-enterprise 739025

Reported: 2011-05-24 09:52 PDT by Arn
Modified: 2013-06-25 02:45 PDT
CC List: 40 users
Flags: rkent: in-testsuite+

Attachments
mimeTypes.rdf of a TB creating buggy PDF attached mail (4.07 KB, application/octet-stream)
2011-05-25 08:51 PDT, Arn
no flags
Thunderbird Content/MIME Type conditionnal flow (427.70 KB, image/png)
2012-02-07 06:54 PST, vincent.lucas
no flags
Patch to fixe mimetype corruptiion. (1.10 KB, patch)
2012-05-16 03:03 PDT, patches
mozilla: review-
PATCH from version 13.0.1 (4.53 KB, patch)
2012-07-18 02:12 PDT, Marek Jagielski
no flags
Correct contentType if invalid or generic (12.02 KB, patch)
2012-07-23 14:12 PDT, Kent James (:rkent)
no flags
fix in mimei.cpp (7.54 KB, patch)
2012-07-24 15:51 PDT, Kent James (:rkent)
no flags
Rev C - truncate bad stuff if '?' detected (7.14 KB, patch)
2012-07-25 12:49 PDT, Kent James (:rkent)
Pidgeot18: review+
unbitrotted patch landed (7.17 KB, patch)
2012-08-01 11:45 PDT, Kent James (:rkent)
rkent: review+

Description Arn 2011-05-24 09:52:41 PDT
User-Agent:       Mozilla/5.0 (X11; U; Linux i686; fr; rv:1.9.2.17) Gecko/20110422 Ubuntu/10.04 (lucid) Firefox/3.6.17
Build Identifier: User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; fr; rv:1.9.2.17) Gecko/20110414 Thunderbird/3.1.10

Mail with PDF attachment created and sent with Thunderbird 3.1.10

Sometimes (often in my case but not always), the PDF file cannot be viewed directly by double-clicking on the attachment. It launches the correct pdf-reader but the reader displays an error message. [The problem arises with the copy saved in the Sent folder of the sender as well as in the mailbox of a receiver using Thunderbird 3.1.10; in other email clients or webmail, sometimes it works, sometimes not...]

In the source of all the "bad" mails that I have checked, the Content-Type is as follows:

Content-Type: =?windows-1252?q?application/pdf;

If it is replaced (in an exported .eml file) by:

Content-Type: application/pdf;

the PDF attachment opens correctly...

Reproducible: Couldn't Reproduce

Steps to Reproduce:
Cannot find what triggers what seems to be a badly formatted mail.
Comment 1 rsx11m 2011-05-24 11:14:08 PDT
> Content-Type: =?windows-1252?q?application/pdf;

This doesn't appear to be valid syntax. According to RFC 2045 Section 5.1, "Content-Type:" is followed by type "/" subtype, both of which are tokens.

http://tools.ietf.org/html/rfc2045#section-5.1

Thus, even if charset definitions were allowed, RFC 2047 Section 2 would require them to have a "=?" charset "?" encoding "?" encoded-text "?=" syntax, implying "=?windows-1252?q?application?=/pdf;" at best.

http://tools.ietf.org/html/rfc2047#section-2
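
To make the token rule concrete, here is a minimal Python sketch (not Thunderbird code; the function names are made up for this illustration). It assumes the RFC 2045 definition of a token as one or more printable US-ASCII characters other than space and the "tspecials", and checks that a value has the form type "/" subtype.

```python
# RFC 2045 "tspecials": characters that may not appear in a token.
TSPECIALS = set('()<>@,;:\\"/[]?=')


def is_token(s: str) -> bool:
    """True if s is a non-empty run of printable ASCII without tspecials."""
    return bool(s) and all(
        0x21 <= ord(c) <= 0x7E and c not in TSPECIALS for c in s
    )


def media_type_of(header_value: str) -> str:
    """Drop any ';'-separated parameters from a Content-Type value."""
    return header_value.split(";", 1)[0].strip()


def is_valid_media_type(value: str) -> bool:
    """True if value looks like token "/" token (parameters already removed)."""
    media_type, slash, subtype = value.partition("/")
    return bool(slash) and is_token(media_type) and is_token(subtype)
```

With this check, "application/pdf" passes, while both "=?windows-1252?q?application/pdf" and "=?windows-1252?q?application?=/pdf" fail because "=" and "?" are tspecials.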

> The problem arises with the copy saved in the Sent folder of the sender

Are you saying that you send messages out this way yourself? If that's the case, go into Tools > Options > Attachments, identify and highlight/select the respective entry for PDF files, and select "Delete Action" from the menu. This should remove any wrong application type Thunderbird may have picked up.
Comment 2 Arn 2011-05-24 13:37:51 PDT
(In reply to comment #1)

> Are you saying that you send messages out this way yourself? If that's the
> case, go into Tools > Options > Attachments, identify and highlight/select
> the respective entry for PDF files, and select "Delete Action" from the
> menu. This should remove any wrong application type Thunderbird may have
> picked up.

Never had the problem with a pdf I've sent myself but I received some with this problem. It seems to me that the only attachment setting dealing with PDF in my Thunderbird is OK : application/pdf => evince (my OS is Ubuntu - my users Windows7 mostly). I will check Windows later. I may post a few samples of the mimeTypes.rdf if it seems relevant.

Does Thunderbird make any use of these attachment settings while building a new mail with an attached file? I would guess they are used only while "reading" a mail, aren't they?
Comment 3 Arn 2011-05-24 13:57:29 PDT
(In reply to comment #1)

> Thus, even if charset definitions were allowed, RFC 2047 Section 2 would
> require them to have a "=?" charset "?" encoding "?" encoded-text "?="
> syntax, implying "=?windows-1252?q?application?=/pdf;" at best.
> 
> http://tools.ietf.org/html/rfc2047#section-2

I've tried replacing, in an .eml file:
--> Content-Type: =?windows-1252?q?application/pdf;
by
--> Content-Type: =?windows-1252?q?application?=/pdf;
It doesn't work better in my Thunderbird.

--> Content-Type: application/pdf;
is still the only "good enough" format for me so far.


Someone who knows the details of how TB builds a new mail with an attachment may find this bug interesting...
Comment 4 rsx11m 2011-05-24 14:38:22 PDT
(In reply to comment #2)
> Does Thunderbird make any use of these attachment settings while building a
> new mail with an attached file ?

Yes, it uses the file-type association that is stored when you open an attachment of a specific type and check "do this from now on" in the dialog. Thus it's fairly easy to pick up a wrong association and then send out broken messages which contain attachments of the same type. Bug 503309 is supposed to prevent that.

(In reply to comment #3)
> --> Content-Type: =?windows-1252?q?application?=/pdf;
> It doesn't work better in my Thunderbird.

That's what I thought, it's unlikely that the token is anything but ASCII, but it wasn't clear to me from reading the specification.
Comment 5 Ludovic Hirlimann [:Usul] 2011-05-25 04:07:55 PDT
So is Thunderbird sending these, or is it another mailer? (This sounds like a nice INVALID bug :-))
Comment 6 rsx11m 2011-05-25 05:48:02 PDT
(In reply to comment #2)
> Never had the problem with a pdf I've sent myself but I received some with
> this problem.

So yes, the messages are received in a non-standard format, thus I agree to close this as INVALID (Thunderbird cannot be expected to handle that case).

If it was picked up from a received mail and sent out this way, it should be covered by bug 503309 assuming that a correct OS-registered file type exists.

Feel free to reopen this report if you disagree with its resolution or if you have further information that may suggest a bug in Thunderbird itself.
Comment 7 Arn 2011-05-25 07:24:16 PDT
As said in the first lines of the description, the buggy mails are created and sent with Thunderbird 3.1.10 (created from scratch; they are not just received or transferred through TB).
Comment 8 rsx11m 2011-05-25 07:48:06 PDT
> Mail with PDF attachment created and sent with Thunderbird 3.1.10

Heh, sorry about that, guess I got distracted by Ludo's comment #5...

Can you attach the "mimeTypes.rdf" file from one of those Thunderbird installations where the problem occurs? That file is in the profile folder, C:\Users\name\AppData\Roaming\Thunderbird\Profiles\xxxxxxxx.default on Win7.

This would help to see if it's bug 503309 or a different issue.
Comment 9 Arn 2011-05-25 08:51:54 PDT
Created attachment 535081 [details]
mimeTypes.rdf of a TB creating buggy PDF attached mail
Comment 10 rsx11m 2011-05-25 10:05:08 PDT
Yes, that clarifies it. There are two associations with ".pdf" files: the correct one for application/pdf and the one for the malformed mimetype:

>  <RDF:Description RDF:about="urn:mimetype:=?windows-1252?q?application/pdf"
>                   NC:fileExtensions="pdf"
>                   NC:description="Adobe Acrobat Document"
>                   NC:value="=?windows-1252?q?application/pdf"
>                   NC:editable="true">
>    <NC:handlerProp RDF:resource="urn:mimetype:handler:=?windows-1252?q?application/pdf"/>
>  </RDF:Description>

Thus, I think this should be covered by bug 503309 unless we want to make the invalid syntax being picked up in a mimetype definition a separate bug.
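
For anyone who has to check many profiles, a rough Python sketch along these lines could list the suspicious entries in a mimeTypes.rdf file. It is only an illustration: the function name is hypothetical, it scans the raw text with a regular expression instead of parsing RDF, and it assumes that the only `urn:mimetype:` resources which are not plain MIME types are the `handler:` and `externalApplication:` ones.

```python
import re

# Capture the MIME type from RDF:about="urn:mimetype:<type>", skipping the
# handler/externalApplication resources (assumed to be the only other kinds).
ABOUT_RE = re.compile(
    r'RDF:about="urn:mimetype:(?!handler:|externalApplication:)([^"]*)"'
)
# An RFC 2045 token: printable ASCII without spaces or tspecials.
TOKEN_RE = re.compile(r"[!#$%&'*+\-.^_`{|}~0-9A-Za-z]+")


def suspicious_mime_entries(rdf_text: str) -> list:
    """Return the stored MIME types that are not of the form token/token."""
    bad = []
    for value in ABOUT_RE.findall(rdf_text):
        media_type, slash, subtype = value.partition("/")
        ok = (
            slash
            and TOKEN_RE.fullmatch(media_type)
            and TOKEN_RE.fullmatch(subtype)
        )
        if not ok:
            bad.append(value)
    return bad
```

Run against the text of the attached file, this would report the `=?windows-1252?q?application/pdf` entry and leave the correct `application/pdf` one alone.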

Arn, you may want to forward the following to anybody affected to remove the wrong file-type association (from comment #1):

> go into Tools > Options > Attachments, identify and highlight/select
> the respective entry for PDF files, and select "Delete Action" from the
> menu. This should remove any wrong application type Thunderbird may have
> picked up.

However, until bug 503309 is fixed this may happen again when opening a PDF attachment from such a malformed message, even if it's just an old one...
Comment 11 rsx11m 2011-05-25 11:00:07 PDT
> Sometimes (often in my case but not always), the PDF file cannot be viewed
> directly by double-clicking on the attachment. It launches the correct
> pdf-reader but the reader displays an error message.

That part was also reported in bug 580971, where an attachment had the invalid Content-Type: =?windows-1252?q?application/pdf; entry. It's not a dupe, as the question here is how it was picked up in mimeTypes.rdf and how to prevent that from happening, but it is striking that exactly the same malformed "windows-1252" definition is reported in both cases.
Comment 12 Arn 2011-05-26 03:23:52 PDT
> Arn, you may want to forward the following to anybody affected to remove the
> wrong file-type association (from comment #1):
> 
> > go into Tools > Options > Attachments, identify and highlight/select
> > the respective entry for PDF files, and select "Delete Action" from the
> > menu. This should remove any wrong application type Thunderbird may have
> > picked up.
> 
> However, until bug 503309 is fixed this may happen again when opening a PDF
> attachment from such a malformed message, even if it's just an old one...

I've checked the settings of the users who warned me about this problem. Half a dozen of them had the buggy PDF attachment settings. Once the bad line was cleared, they were able to send mail with "correct" PDF attachments. Thanks, rsx11m!

So, if I am not mistaken, this buggy setting and behaviour can spread from one Thunderbird client to another quite easily. We don't have a vaccine, but we know a cure. That's good enough for me for now.

Should the status of this bug be changed?
Comment 13 rsx11m 2011-05-26 12:28:37 PDT
While there obviously is a bug (or two), I didn't confirm your report given that other reports exist about similar issues. Thus, your problem essentially is the combination of both bug 580971 and bug 503309.

Ludo has set some dependencies, I don't know if he also meant to confirm both unconfirmed bugs in the process. It may very well be that one solves the other.
Comment 14 Ludovic Hirlimann [:Usul] 2011-05-27 02:09:28 PDT
(In reply to comment #13)
 
> Ludo has set some dependencies, I don't know if he also meant to confirm
> both unconfirmed bugs in the process. It may very well be that one solves
> the other.
If we got the safeguarding in place, most of these issues would probably disappear. I didn't change the status to NEW because I would love to have something like:
1) create a fresh profile
2) do X, Y, Z
3) attachments have a corrupted MIME type when sent.
Comment 15 rsx11m 2011-05-27 11:51:19 PDT
Steps to reproduce (tested with today's TB 7.0 nightly on Windows 7):

1) Get bug 580971 attachment 479004 [details] for a test message, save as file
2) Create account with new profile and empty mimeTypes.rdf
3) Open test message from saved file
4) Attachment shows up as PDF type, associated with Acrobat Reader if installed
5) Double-click opens dialog suggesting to open with Acrobat Reader
6) Check the "do this for all" box and click Ok
7) Acrobat Reader reports error (bug 580971)
8) Now open a new message, add a PDF file as attachment, and send it
9) In your "Sent" folder, observe with View > Message Source the invalid file
   type has been picked up (also verify the entry in mimeTypes.rdf)

Note that step #9 does not apply if step #6 was omitted, thus the extra step checking the box is needed for the wrong association to be picked up.

So, this is a variant of bug 503309 where a syntactically wrong MIME type should not be considered for adding it to mimeTypes.rdf.

Confirming per comment #14.
Comment 16 Frederic Besnard 2011-11-03 10:20:29 PDT
I can confirm what was written above: dozens of Thunderbird installations in my company have such corrupted mimeTypes.rdf files, some containing even worse than "mimetype:=?windows-1252?q?application/pdf".

In our case it is, for example:
  <RDF:Description RDF:about="urn:mimetype:%22%22%22%22%22%22%22=?windows-1252?q?application%22%22%22%22%22%22%22/pdf"
                   NC:fileExtensions="pdf"
                   NC:description="Adobe Acrobat 7.0 Document"
                   NC:value="%22%22%22%22%22%22%22=?windows-1252?q?application%22%22%22%22%22%22%22/pdf"
                   NC:editable="true">
    <NC:handlerProp RDF:resource="urn:mimetype:handler:%22%22%22%22%22%22%22=?windows-1252?q?application%22%22%22%22%22%22%22/pdf"/>
  </RDF:Description>
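
For readers unfamiliar with the `%22` sequences: `%22` is the percent-encoding of a double quote. A small Python sketch (the function name is made up for this illustration) decodes the stored value to show what it actually contains:

```python
from urllib.parse import unquote

# The value as stored in the RDF above: seven %22 on each side of the type.
STORED_VALUE = "%22" * 7 + "=?windows-1252?q?application" + "%22" * 7 + "/pdf"


def decode_stored_type(stored: str) -> str:
    """Undo the percent-encoding used in the RDF resource name."""
    return unquote(stored)


decoded = decode_stored_type(STORED_VALUE)
```

The decoded string is the already malformed type with seven literal quote characters before "=?windows-1252?q?application" and seven more before "/pdf".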

Deleting the bad mimeTypes.rdf files solves the problem only as long as the user doesn't record a new association from a malformed incoming mail. This very annoying problem behaves like a virus. As long as an "infected" Thunderbird exists, the problem can arise again.
Comment 17 Vincent (caméléon) 2011-11-28 02:11:06 PST
For information, this problem has also been reported on French forums here: http://www.geckozone.org/forum/viewtopic.php?f=4&t=100209
Comment 18 Roland Tanglao :rolandtanglao 2011-11-28 16:34:49 PST
(In reply to rsx11m from comment #15)
> Steps to reproduce (tested with today's TB 7.0 nightly on Windows 7):
> 
> 1) Get bug 580971 attachment 479004 [details] for a test message, save as
> file
> 2) Create account with new profile and empty mimeTypes.rdf
> 3) Open test message from saved file
> 4) Attachment shows up as PDF type, associated with Acrobat Reader if
> installed
> 5) Double-click opens dialog suggesting to open with Acrobat Reader
> 6) Check the "do this for all" box and click Ok
> 7) Acrobat Reader reports error (bug 580971)
> 8) Now open a new message, add a PDF file as attachment, and send it
> 9) In your "Sent" folder, observe with View > Message Source the invalid file
>    type has been picked up (also verify the entry in mimeTypes.rdf)
> 
> Note that step #9 does not apply if step #6 was omitted, thus the extra step
> checking the box is needed for the wrong association to be picked up.
> 
> So, this is a variant of bug 503309 where a syntactically wrong MIME type
> should not be considered for adding it to mimeTypes.rdf.
> 
> Confirming per comment #14.

confirmed by me too. 
:rsx11m if you do "Save As..." and then save the PDF you can open the PDF in Adobe Reader right? That works for me and seems like a viable workaround until we get a fix!
Comment 19 rsx11m 2011-11-28 17:24:09 PST
Sure, saving the attachment doesn't involve mimeTypes.rdf and should save the file with the original file extension. From there, the operating system's default file handling kicks in and will hopefully do the right thing.
Comment 20 mohican 2012-02-03 08:20:26 PST
The bug is not specific to PDF files. It occurs whenever the Content-Type descriptor is not correct.
The bug also affects some webmail services, not only Thunderbird.

A detailed report of users' experience with this bug, plus ways to work around it, is discussed here (in French):
http://forum.ubuntu-fr.org/viewtopic.php?id=808371

One solution for opening attachments without having to save them first is to open the attached file by its extension rather than by its MIME type. Thunderbird doesn't do that, but Paolo "Kaosmos" wrote an add-on that does. It is called OpenAttachmentByExtension and can be downloaded from here:
https://nic-nac-project.org/~kaosmos/index-en.html#openattach
Comment 21 mohican 2012-02-03 08:28:00 PST
(In reply to rsx11m from comment #19)

There are two ways to save the attached file to disk:

1. menu "Save as...": this does not use mimeTypes.rdf, so the file will be saved as sent.

2. menu "Open as...": if there is no fixed rule for handling the MIME type (i.e. no entry in mimeTypes.rdf), you will get a dialog that lets you choose the application for opening or save to disk, and remember the rule.
If you save to disk from this dialog, the MIME type will be used, and if it is not correct the resulting file will not be readable. SO I ADVISE NOT TO USE THIS METHOD.
Comment 22 mohican 2012-02-03 08:41:49 PST
Of course, if you open files by extension (not by MIME type) there will sometimes be a problem too, because the same extension is sometimes used for different kinds of files.
See for example bug 293804.

Anyway, I think bug 503309 should really be fixed. At the very least, that would prevent Thunderbird users from sending corrupt emails. How about bringing it up? (As a simple user I don't know how to do more than vote for it.)
Comment 23 vincent.lucas 2012-02-07 06:54:10 PST
Created attachment 595021 [details]
Thunderbird Content/MIME Type conditionnal flow

Conditional flow of Thunderbird's unexpected behavior.
Comment 24 vincent.lucas 2012-02-07 07:07:34 PST
Hi,
I would like to add something to this bug report. It seems to me that the problem could become serious because of its potential to spread.

We have now encountered it on both Linux and Windows. The problem currently affects many users at our university. After receiving frequent incident reports, I studied the problem in order to better understand what happens.

Please find attached a diagram that briefly shows Thunderbird's unexpected behavior.
attachment 595021 [details]

Basically, the problem starts with a mail client X that sends the wrong Content-Type, which triggers a snowball effect. A crucial point is that this bug has been reported with PDF files but could occur with any type of document. Moreover, my concern is that the bug could be deliberately reproduced to infect Thunderbird users.

To prevent the contagion, one could take the following approaches:

As far as sending the wrong Content-Type is concerned:
-One possible approach would be to process mails on the server side and replace the bad Content-Type with the valid one. This is technically feasible but cannot be done, for obvious reasons.
-An alternative is to create a mimeTypes.rdf file in the Thunderbird installation directory (defaults\profile\mimeTypes.rdf) with an automatic action (NC:alwaysAsk="false").

Be careful: the user will then no longer be able to choose between “open with” and “save” when selecting a PDF file. It is recommended to use “save” (NC:saveToDisk="true"). Indeed, if the action “open with” (NC:useSystemDefault="true" + define the RDF:about="urn:mimetype:handler:application/pdf") is selected, an attachment with the bad Content-Type would not be readable (see below).


As far as reading the wrong content-type is concerned:
-Developers fix the Thunderbird "bug" (see the attachment)
-Use an add-on that opens attachments based on the file extension and thus extracts them correctly:
https://nic-nac-project.org/~kaosmos/index-en.html#openattach

As you can see, this problem can only really be fixed by a Thunderbird patch; all other approaches amount to a “bandage”.

I hope this clarifies the issue and conveys the potentially damaging impact of this bug.

Best Regards.
Comment 25 jean-marc.evrard 2012-04-19 00:06:03 PDT
Hi,

I want to mention that several universities in different regions of France are affected by this problem. At my university (Avignon), that means a small but growing percentage of employees and teachers.

As mentioned previously, the only "solution" is the "openattachmentbyextension" extension, which is not satisfactory because it will lead to another malfunction when the next version of Adobe Reader is released (with a different path to adobereader.exe).

I also want to make a few constructive comments, which are in no way criticisms, only facts that could help the foundation address some problems:

- the bug's rating reflects neither the severity of the problem nor the growing number of users affected (at this time I still see a "normal" status on this problem and a steady "20 votes", for an epidemic problem).

- some people are starting to think there is a lack of reaction or interest from the foundation regarding Thunderbird, in favor of concentrating on Firefox. This could lead some institutional users to migrate to corporate webmail solutions, despite the quality of TB.

I want to end these comments with friendly thoughts, sending a warm hello to contributors and thanks for the job done.
Comment 26 patches 2012-05-16 03:01:40 PDT
Hi all,

Here is a patch for this issue, tested with Thunderbird 12.0.1. This patch fixes only the case where the corruption starts with =? and ends with q?

Regards
Comment 27 patches 2012-05-16 03:03:44 PDT
Created attachment 624331 [details] [diff] [review]
Patch to fixe mimetype corruptiion.
Comment 28 Ludovic Hirlimann [:Usul] 2012-05-16 03:07:52 PDT
Comment on attachment 624331 [details] [diff] [review]
Patch to fixe mimetype corruptiion.

I don't know if this testable - but I always love patches with tests :-)
Comment 29 David :Bienvenu 2012-05-17 15:05:19 PDT
Comment on attachment 624331 [details] [diff] [review]
Patch to fixe mimetype corruptiion.

thx for the patch. Several issues:

It looks like there are tabs. We use 2 space indent, not tabs.

The patch mixes braces style. You should consistently use the prevailing braces style in the file, which I think would look like:

if (pos)
{
}
else
{
}

When you call Find on mSContentType, -1 is the return value when we can't find the string, not 0.

It looks to me like you're leaking mContentType because ToNewUTF8String() allocates memory. Also, since it's a local variable, its name shouldn't start with m, which is reserved for member variables in a class/struct.

I also don't see why you convert the ASCII content type to Unicode/UTF-16 and then convert it back to UTF-8. It would be simpler just to do something like this:

nsCString cleanedUpContentType(contentType);
if (StringBeginsWith(cleanedUpContentType, NS_LITERAL_CSTRING("=?")))
{
  PRInt32 pos = cleanedUpContentType.Find("?q?");
  // Cut through the end of the "?q?" marker (3 characters), not just up to it.
  if (pos > 0)
    cleanedUpContentType.Cut(0, pos + 3);
}
headerSink->HandleAttachment(cleanedUpContentType, url /* was escapedUrl */,
                             unicodeHeaderValue.get(), uriString.get(),
                             aIsExternalAttachment);
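
The same cleanup idea can be tried outside the Mozilla tree with a short Python sketch. This is only an illustration of the approach suggested here, not the code that landed; the function name is hypothetical, and matching the marker case-insensitively is an added assumption.

```python
def strip_encoded_word_prefix(content_type: str) -> str:
    """Drop a bogus '=?charset?q?' prefix from a content type, if present."""
    if content_type.startswith("=?"):
        # find() returns -1 (not 0) when the marker is absent.
        pos = content_type.lower().find("?q?")
        if pos != -1:
            # Skip the three characters of the "?q?" marker itself.
            return content_type[pos + 3:]
    return content_type
```

For example, `strip_encoded_word_prefix("=?windows-1252?q?application/pdf")` returns `"application/pdf"`, while a well-formed type is returned unchanged. Like the original patch, this handles only the "=?...?q?" form of corruption.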
Comment 30 Ploué 2012-05-30 04:41:01 PDT
Hello Mister Patch,

Is it possible to get a compiled thunderbird.exe 12.0.1 that includes patch 624331?
I can validate it with my 100 users.

Regards -- Serge ploue (serge.ploue@enseeiht.fr)
Comment 31 Cédric Bellegarde 2012-06-01 05:57:42 PDT
Any chance of having this in Thunderbird 13? This is a really annoying bug...
Comment 32 patches 2012-06-05 06:54:58 PDT
Hi all,

Sorry for the delay. We are reworking the patch to address all the remarks and will submit a new version.

(In reply to Ludovic Hirlimann [:Usul] from comment #28)
> Comment on attachment 624331 [details] [diff] [review]
> Patch to fixe mimetype corruptiion.
> 
> I don't know if this testable - but I always love patches with tests :-)

What kind of tests are you expecting? For our own testing we use a simple test that opens a mail file with a corrupted attachment and checks whether the attachment can now be opened.

Regards
Comment 33 Ploué 2012-06-05 08:24:07 PDT
Hello,

Thank you for your work.
Another test would be to forward a mail with a corrupted attachment and check that the attachment is good in the sent mail.

Sorry for my english -- Regards -- Serge
Comment 34 vincent.lucas 2012-06-06 01:00:46 PDT
Hi!
To eliminate bugs step by step, you can follow the diagram, to test all possibilities.
attachment 595021 [details]
Comment 35 WADA 2012-06-06 20:55:26 PDT
(In reply to David :Bienvenu from comment #29)
> Comment on attachment 624331 [details] [diff] [review]
> Patch to fixe mimetype corruptiion.
> +	if (StringBeginsWith(mSContentType, NS_LITERAL_STRING("=?")))
> +		PRInt32 pos = mSContentType.Find("?q?");

Is this bug actually caused by the leading "=?" and the "?q?" in the middle?
As I wrote in bug 739025 (a summary of this kind of problem), the cause looks like an unescaped "?" in the MIME type, and a similar problem is seen with an unescaped "&" in the MIME type. Furthermore, a different problem is also observed when only the major MIME type is present, i.e. the following slash and subtype are missing.
Comment 36 Marek Jagielski 2012-06-29 08:05:39 PDT
(In reply to vincent.lucas from comment #23)
> Created attachment 595021 [details]
> Thunderbird Content/MIME Type conditionnal flow
> 
> Conditional flow of Thunderbird's unexpected behavior.

Hi Vincent,
 Have you found the place in the code where the bug "Create corrupted file ..." is?
Comment 37 vincent.lucas 2012-07-04 03:58:55 PDT
(In reply to Marek Jagielski from comment #36)
> Hi Vicent,
>  Have you detected the place of in the code where the bug "Create corrupted
> file ..." is ?

I only traced the Thunderbird executable, without the source code.
It happens when the attachment is built in the temp directory.
I'll try to compile the source code with debug console output.
Comment 38 Marek Jagielski 2012-07-04 04:25:10 PDT
In nsMailboxService.cpp, in the method NS_IMETHODIMP nsMailboxService::OpenAttachment, a URL is constructed to download the attachment. Part of the query string is the content type.

There is no validation of the URL, and as a result the URL looks like this:
?(...)&type==?windows-1252?q?application/pdf&filename=(...)

Content types without reserved or forbidden characters do not cause the corrupted file to be created.

I would propose a general validation of the URL that changes any forbidden character to its "%" code, as described in RFC 1738.
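
To illustrate the idea of escaping the content type before it goes into the query string, here is a minimal Python sketch. It is not the Thunderbird code: only the `type` and `filename` parameter names come from the URL above, the helper name is made up, and Python's `parse_qs` stands in for whatever parser consumes the URL. Python's parser happens to tolerate a stray "?", so the breakage is demonstrated with "&" instead.

```python
from urllib.parse import parse_qs, quote


def build_query(content_type: str, filename: str, escape: bool) -> str:
    """Build the 'type=...&filename=...' part of an attachment URL."""
    if escape:
        # safe="" makes quote() escape "/", "?", "=", and "&" as well.
        content_type = quote(content_type, safe="")
        filename = quote(filename, safe="")
    return f"type={content_type}&filename={filename}"


# Unescaped, a "&" inside the type splits the parameter in two.
broken = parse_qs(build_query("text/a&b", "f.pdf", escape=False))
# Escaped, the original value survives the round trip.
intact = parse_qs(build_query("text/a&b", "f.pdf", escape=True))
```

Here `broken["type"]` comes back as `["text/a"]`, whereas `intact["type"]` is `["text/a&b"]`. With escaping, the malformed `=?windows-1252?q?application/pdf` becomes `%3D%3Fwindows-1252%3Fq%3Fapplication%2Fpdf` and can no longer be mistaken for URL syntax.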
Comment 39 Ludovic Hirlimann [:Usul] 2012-07-06 00:30:38 PDT
Mark, Neil: any comments on comment 38?
Comment 40 Marek Jagielski 2012-07-18 02:12:39 PDT
Created attachment 643296 [details] [diff] [review]
PATCH from version 13.0.1

This patch correctly encodes the parts of the URL for the POP and IMAP services. As a result, an attachment's content is downloaded correctly even if the type is wrong.
Comment 41 Kent James (:rkent) 2012-07-23 14:12:26 PDT
Created attachment 645064 [details] [diff] [review]
Correct contentType if invalid or generic

This patch is a more complete version of that in https://bugzilla.mozilla.org/attachment.cgi?id=624331, extending it to:

1) look up the contentType based on the file extension
2) add unit tests
3) also correct for the generic contentType application/applefile, as mentioned in bug 776246
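
The fallback described in point 1 can be sketched in Python with the standard `mimetypes` module. This is an illustration only, not the patch: the function and constant names are hypothetical, and Python's extension table stands in for whatever mapping Thunderbird consults.

```python
import mimetypes
import re

# RFC 2045 token characters: printable ASCII without spaces or tspecials.
_TOKEN = r"[!#$%&'*+\-.^_`{|}~0-9A-Za-z]+"
_VALID_TYPE_RE = re.compile(_TOKEN + "/" + _TOKEN)

# Types treated as too generic to trust (assumption: only the one named above).
GENERIC_TYPES = {"application/applefile"}


def effective_content_type(content_type: str, filename: str) -> str:
    """Keep a valid, specific type; otherwise guess from the file extension."""
    declared = content_type.split(";", 1)[0].strip().lower()
    if _VALID_TYPE_RE.fullmatch(declared) and declared not in GENERIC_TYPES:
        return declared
    guessed, _encoding = mimetypes.guess_type(filename)
    # Fall back to the declared value when the extension is unknown.
    return guessed or declared
```

With this, `effective_content_type("=?windows-1252?q?application/pdf", "report.pdf")` yields `"application/pdf"`, and a valid declared type such as `text/plain` is left alone even if the extension suggests something else.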
Comment 42 Kent James (:rkent) 2012-07-23 14:19:56 PDT
I'm a little confused about the definition of this bug relative to bug 503309 and its relatives. I am interpreting this bug according to the summary, and not according to the STR in comment 15. That is, the patch allows you to successfully open files with invalid mimetypes if the correct mimetype can be inferred from the file extension, but does nothing to fix the original corruption.

As part of this, I am proposing that we also look up the contentType based on the filename extension if the contentType is application/applefile, as reported in bug 776246. This might be controversial, so feel free to comment on this aspect.

I would like to thank the members of the tb-enterprise list for pointing out to me the significance of this bug.
Comment 43 Kent James (:rkent) 2012-07-23 19:38:56 PDT
Reviewers should also see the cautions in Bug 293804 - Attachment with unknown MIME type but known extension: TB should not presume types are identical, which we are proposing to violate here.
Comment 44 patches 2012-07-24 01:59:06 PDT
Hi Kent,

Did you look at Marek's patch (https://bugzilla.mozilla.org/attachment.cgi?id=643296)?

It actually fixes the underlying bug causing the bugs described in this report and other similar reports, such as Bug 580971 or Bug 503309, and is, IMHO, a better fix than the patch I made and you enhanced.

These bugs are caused by "?" in the Content-Type header, which is not allowed according to RFC 2045 (the BNF grammar is described in Section 5).

Marek has found that Thunderbird fails when it tries to open an attachment with such a corrupted Content-Type because it references attachments by URL, and "?" is a reserved character in URLs (see RFC 1738).

His patch changes reserved characters in the Content-Type to their %-encoded equivalents, so TB can open the correct URL and then get the correct attachment data, regardless of the Content-Type's value.

That way, TB is always able to get correct attachment data and send it to the external application.
Comment 45 Kent James (:rkent) 2012-07-24 09:49:30 PDT
Yes I looked at his patch, but I guess I misunderstood its purpose. The definition of which bug is which in this series has been confusing to me, and I thought he was addressing a different issue. But let me look at it again.

Regardless, I intend to drive this to a conclusion.
Comment 46 :Irving Reid (No longer working on Firefox) 2012-07-24 12:27:21 PDT
I'm reluctant to put either Marek or Kent's patch forward for this specific issue - I'd prefer to see the bad content-type headers trimmed as soon as possible (at message header parse time, probably) as David proposed in https://bugzilla.mozilla.org/show_bug.cgi?id=659355#c29. That way the bad content-type wouldn't spread through any other parts of TB.

Not sure if we store parsed Content-Type headers in gloda or the .msf file; if so, we'd need to rebuild or make a cleanup pass to get rid of corrupt values stored in the DB.

The application/applefile issue I think needs separate handling, hopefully not in the form of a special case; the issues are similar to those for application/octet-stream.
Comment 47 Kent James (:rkent) 2012-07-24 12:58:57 PDT
(In reply to Irving Reid (:irving) from comment #46)
> I'm reluctant to put either Marek or Kent's patch forward for this specific
> issue - I'd prefer to see the bad content-type headers trimmed as soon as
> possible (at message header parse time, probably) as David proposed in
> https://bugzilla.mozilla.org/show_bug.cgi?id=659355#c29.

Comment 29 was reviewing the patch that my latest patch was correcting, so you could view the patch I submitted as a response to comment 29, also including unit tests. So I really don't understand this comment.

On a broader scale though, let's parse the issues:

1) What errors in contentType do we fix
2) What do we replace them with
3) Where do we apply the fix
4) Prepare a patch, ideally with tests, that works within the mozilla framework.

I'm happy to do part 4), though that robs the other authors of the opportunity to learn the quirks of the mozilla platform.

The choices:
A) https://bugzilla.mozilla.org/attachment.cgi?id=624331 (patches@portaildulibre.fr)
1a) begins with "=?"
2a) delete bad characters from url of form =?...?q?application/pdf
3a) nsMimeHtmlEmitter.cpp

B) https://bugzilla.mozilla.org/attachment.cgi?id=643296 (Marek Jagielski)
1b) illegal characters that cause an invalid URL
2b) replace illegal characters with escaped equivalents
3b) instances of nsIMsgMessageService.openAttachment

C) https://bugzilla.mozilla.org/attachment.cgi?id=645064 (kent@caspia.com)
1c) content type beginning with "=" or application/applefile
2c) guessed version from file extension
3c) nsMimeHtmlEmitter.cpp

jcranmer also discussed on IRC that he might prefer that 3) be implemented even further upstream in the C part of libmime. Perhaps that is what Irving is suggesting.

My vote of course is C - though it would make sense to expand my 1c) to include any of the bad characters that Marek detects.

But I am open to other opinions. But please let's push this to a conclusion, and not let the best be the enemy of the good.
Comment 48 Joshua Cranmer [:jcranmer] 2012-07-24 13:48:25 PDT
Before I review this patch, there are some questions I want to see answered:
1. Why is this only being fixed for the HTML emitter, and not the other emitters?
2. How does this affect gloda/TB conversations? They'll see what gets passed from mimemoz2.cpp to the emitter.
3. Why is .pdf -> application/pdf being specifically special-cased? Any special casing like that really, really, really scares me.
4. Did you look into seeing what happens if you do this processing in mimei.cpp's mime_create? Where you have it now, it feels like you're just papering over a bandaid. Placing it in mimei should give us a fix that makes things "just work" everywhere.
Comment 49 Kent James (:rkent) 2012-07-24 13:58:16 PDT
(In reply to Joshua Cranmer [:jcranmer] from comment #48)
> Before I review this patch, there are some questions I want to see answered:
I'm going to give simple answers without defending them.

> 1. Why is this only being fixed for the HTML emitter, and not the other
> emitters?

I believe that I recall Bienvenu suggested that in another patch, but I don't know which one.

> 2. How does this affect gloda/TB conversations? They'll see what gets passed
> from mimemoz2.cpp to the emitter.

I don't know

> 3. Why is .pdf -> application/pdf being specifically special-cased? Any
> special casing like that really, really, really scares me.

Because that has been the specific issue that is the current real user pain, yet was not in the default extension->contentType mapping as I understood it.

What would you suggest instead, if the minimum goal is to fix the current pain point? The other patches assumed that the obviously incorrect contentType was in fact mangled according to some well-understood rules, which makes me more nervous than guessing based on extension.

> 4. Did you look into seeing what happens if you do this processing in
> mimei.cpp's mime_create? Where you have it now, it feels like you're just
> papering over a bandaid. Placing it in mimei should give us a fix that makes
> thing "just work" everywhere.

I'm happy to move this upstream if you believe it is best.

Joshua, it would be helpful if your questions provided suggested answers or objections to the generic questions and categories in my comment 47.
Comment 50 Kent James (:rkent) 2012-07-24 15:51:17 PDT
Created attachment 645553 [details] [diff] [review]
fix in mimei.cpp

Alternate fix (one line of code! plus test)

Fix parse:

1d) content type beginning with "="
2d) guessed version from file extension
3d) mimei.cpp
Comment 51 WADA 2012-07-24 16:27:15 PDT
(In reply to Kent James (:rkent) from comment #50)
> fix in mimei.cpp
> Alternate fix (one line of code! plus test)
> Fix parse:
> 1d) content type beginning with "="
>(snip)

As I wrote in bug 712595, the issue is not limited to values starting with "=".
(a) Any of "?", "&", "=" in the mime-type value needs to be escaped, because they are delimiters or separators in the internal URL used by Tb.
Further,
(b) Tb fails to access the part pointed to by the internal URL if the value is a major mime-type only, without the trailing "/".
(c) For filename, if the filename contains special chars but is not quoted by " or ', Tb fails to access the part pointed to by the internal URL. The filename value also needs to be escaped or quoted in the internal URL.
(b) & (c) are perhaps a different issue, but "URL construction & URL interpretation are not tolerant of malformed headers or not-so-happy-for-Tb headers" is the common theme.
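
A minimal sketch of point (a), written in Python rather than in Thunderbird's own code. It only illustrates generic query-string escaping: the parameter names `number`, `type` and `part` and their order are assumptions for illustration, and `urllib.parse` is not the parser Tb uses for its internal URLs.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical query parameters; the real internal URL layout may differ.
bad_type = "application&pdf"

# Unescaped: the "&" inside the type value is read as a parameter separator.
raw_query = "number=1&type=" + bad_type + "&part=1.2"
raw = parse_qs(raw_query)

# Escaped: urlencode percent-encodes "?", "&" and "=" inside values.
safe_query = urlencode({"number": "1", "type": bad_type, "part": "1.2"})
safe = parse_qs(safe_query)

print(raw["type"])   # the value is cut at the "&"
print(safe["type"])  # the full value survives the round trip
```

With escaping, the other parameters (such as the part number) are no longer affected by whatever characters the type value happens to contain.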
Comment 52 Kent James (:rkent) 2012-07-25 10:32:12 PDT
Let's at least agree that this bug is about opening emails that have some subset of syntactically incorrect content-type values. (So I've dropped my attempt at adding application/applefile issues to this).

Regarding comment 51, I interpret this request, in the nomenclature of comment 47, as:

1e) Any of "?", "&", "=" in content-type value
2e) Replace with escaped version.

I have no problem with 1e) - though you could argue I suppose that the entire list of tspecials from RFC 2045 section 5.1 be included.
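
As a rough illustration of what a check against the full tspecials list could look like, here is a small Python sketch. It is not code from any of the patches; the function names are made up for this example, and it only looks at a bare type/subtype value without parameters.

```python
# tspecials as listed in RFC 2045 section 5.1
TSPECIALS = set('()<>@,;:\\"/[]?=')


def is_token(s: str) -> bool:
    """True if s is a non-empty RFC 2045 token (no SPACE, CTLs or tspecials)."""
    return bool(s) and all(32 < ord(c) < 127 and c not in TSPECIALS for c in s)


def is_valid_media_type(value: str) -> bool:
    """Check a bare type/subtype pair; parameters (";name=...") are not handled."""
    major, sep, minor = value.partition("/")
    return sep == "/" and is_token(major) and is_token(minor)
```

Note that "&" is not one of the tspecials, so a value such as `a&b/c` passes this check. A test that also wants to reject "&" would have to add it explicitly.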

Re 2e) I still think that the correct solution is to default to guessing based on the filename extension. There is code to do that already in mimei.cpp, plus the assumption here is that the content-type is messed up in ways that we can't predict. In the current example, surely we don't want:

contentType=%3Fwindows-1252%3Fq%3Fapplication/pdf

I would really appreciate it if those who are going to have strong opinions on the final form of this would state their agreement before I do another patch that gets thrown away.
Comment 53 Kent James (:rkent) 2012-07-25 12:49:26 PDT
Created attachment 645858 [details] [diff] [review]
Rev C - truncate bad stuff if '?' detected

After much discussion in IRC with :rkent :jcranmer and :irving, we could develop a consensus on the following approach to this:

1) detect (invalid) presence of '?' in content-type
2) fix by cutting everything before and including the last '?'
3) do it in mimei.cpp

This patch implements this. If accepted, then any other issues would need to be done as separate bugs.
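
The three steps above can be sketched as follows. This is a Python illustration of the rule only, not the C++ that the attached patch adds to mimei.cpp, and the function name is hypothetical.

```python
def strip_bogus_prefix(content_type: str) -> str:
    """Drop everything up to and including the last '?' in a content type.

    A '?' is never valid in a content type, so its presence is treated as
    the sign of a wrongly applied RFC 2047 prefix such as "=?charset?q?".
    """
    if "?" in content_type:
        return content_type[content_type.rfind("?") + 1:]
    return content_type


print(strip_bogus_prefix("=?windows-1252?q?application/pdf"))
```

A value without any '?' is returned unchanged, so well-formed content types are not touched.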
Comment 54 WADA 2012-07-25 19:12:30 PDT
(In reply to Kent James (:rkent) from comment #53)
> Rev C - truncate bad stuff if '?' detected
> 1) detect (invalid) presence of '?' in content-type
> 2) fix by cutting everything before and including the last '?'
> 3) do it in mimei.cpp

Does it mean that a bogus entry in mimeTypes.rdf will be prevented when the mime-type is malformed like "=?windows-1252?q?application/pdf"?
(RFC2047 is wrongly applied. The string after the last ? is the correct mime-type.)

If so, I agree with the solution, because I don't know of any bug report for an actual case other than wrongly-applied RFC2047 encoding, because the &part=1.2 portion of the URL won't be lost, and because the right mime-type can be extracted in such a case.
Bug 712595, which includes the "&" case, is an intentional test result to find out what happens if the &part=1.2 portion of the internal URL is lost due to a bad mime-type string containing "?", "&", "=" or for other reasons.
Comment 55 Kent James (:rkent) 2012-07-25 21:24:21 PDT
This means that if content-type enters libmime (which is the main mime parser) with the value of "=?windows-1252?q?application/pdf" it will exit with the value of "application/rdf".  That should also prevent new bogus entries of this type in mimeTypes.rdf.

However, existing clients already have bogus entries, plus there are other ways that the content type can be malformed and are currently spread virally by Thunderbird. That also needs addressing (not sure which bug number to do that under yet) and will be the next project.
Comment 56 Kent James (:rkent) 2012-07-25 21:27:25 PDT
(In reply to Kent James (:rkent) from comment #55)
> ... it will exit
> with the value of "application/rdf". 

Oops I meant "application/pdf"
Comment 57 WADA 2012-07-25 21:41:51 PDT
*** Bug 580971 has been marked as a duplicate of this bug. ***
Comment 58 WADA 2012-07-25 21:42:59 PDT
*** Bug 685112 has been marked as a duplicate of this bug. ***
Comment 59 WADA 2012-07-25 21:51:10 PDT
*** Bug 738284 has been marked as a duplicate of this bug. ***
Comment 60 LIVINE Christin 2012-07-26 16:24:36 PDT
Question.
When sending with Thunderbird, why can an email with a pdf (or something else) be corrupted? Does email sending depend on mimeTypes.rdf?
Note: In my company, the email sending format in Thunderbird is UTF-8.

Suggestion. But maybe it was already discussed.
For the recipient, if a problem in the email with the pdf is detected,
- force the pdf to be opened with an external program,
- or the message could be corrected, but personally, I don't like this,
- or display a warning saying this email is corrupted and to tell the sender.
Comment 61 WADA 2012-07-26 16:59:15 PDT
(In reply to LIVINE Christin from comment #60)
> Question. (snip)

Please read Bug 503309 and the bugs listed in its dependency tree carefully.
Comment 62 Marek Jagielski 2012-07-27 04:49:40 PDT
I would like to explain my correction.

The final error is that we can't open the file from an infected mail. We can't open the file because it wasn't correctly downloaded to the drive. It wasn't correctly downloaded because of the mechanism of the internal URL. Why? Because the URL is wrongly formatted. At this point I asked about format validation of the URL. So I proposed to replace forbidden values with their escaped equivalents, without any intelligence or guessing. It is only a technical operation of safe programming. My correction does not exclude others. It is at a low level of abstraction and doesn't depend on the pdf problem of today or maybe the jpeg problem of tomorrow.
Comment 63 Kent James (:rkent) 2012-07-27 07:58:42 PDT
Marek:

Did I correctly state your proposal in comment 47 (see 1b and 2b)?

I understand your answer to "2) What do we replace them with?" to be "2b) replace illegal characters with escaped equivalents"

Although that might be the correct answer if we were dealing here with content-type values that were, in fact, correctly set originally but just misinterpreted by us, I think we are dealing with content-type values that we all agree are incorrectly written by their originators. Taking "=?windows-1252?q?application/pdf" and replacing the invalid characters with their escaped equivalents may allow the URL to be opened in Thunderbird, but it still yields a content-type that is invalid. The invalid content-type then gets stored, and passed on to others as though it were valid.

There is no perfect or obviously correct answer here, but I think we are trying to deal with the reality that we are seeing a certain class of badly formatted content-type values that we can correct and restore to their original intent through, as you correctly state, "guessing" what that original intent was. This guessing gives a superior result in this case to escaping the invalid characters, though it is less robust than your approach in the sense that there are possible invalid content-type values that we could see in the future that would also make us fail, but would not fail with your approach.

If you still feel strongly about your approach, I would encourage you to try to have an IRC conversation on #maildev with :jcranmer and :irving as they seem to be serving as the MIME authorities for mailnews at the moment.
Comment 64 Marek Jagielski 2012-07-27 08:23:20 PDT
In fact I followed the analysis in Vincent's attachment "Thunderbird Content/MIME Type conditionnal flow". I strongly agree with his splitting the problem into two bugs. I didn't make a correction of the mimeType but of the URL (in the flow diagram, it is the bug at the bottom). I apply this check even to the URL part that uses the file name.

I didn't propose a solution for the second bug (per the flow diagram). That is why I wrote that my correction doesn't exclude the others.
Comment 65 Kent James (:rkent) 2012-07-27 08:53:36 PDT
As I understand vincent's analysis from comment 24, his second bug is:

"Use of an extension based on the file extension and then correctly extract the attachment" This is,  as I understand it, my variant 2c) from comment 47 (though some of my patches have incorrectly used a guess based on mimeType.rdf, when I intended to use the hard-wired default values).

Is that your proposal? It would help a lot if you would frame this discussion in the terms that I proposed in comment 47.
Comment 66 Marek Jagielski 2012-07-27 09:10:04 PDT
I meant the analysis as problem detection, not as a solution proposal. For me, the bug is that a mail containing an attachment that is intact can't be saved to the drive, whatever the type would be, even as binary data. The reason for that is much more trivial than a lack of information about the real mime type. It is only the internal URL mechanism that crashes.
Comment 67 Marek Jagielski 2012-07-27 09:15:57 PDT
(In reply to Marek Jagielski from comment #66)
Sorry for my English, I wanted to say that the attachment can't be saved (instead of the mail).
Comment 68 Kent James (:rkent) 2012-07-27 09:26:03 PDT
(In reply to Marek Jagielski from comment #66)
> I ment the analyse as a problem detection not the solution proposal. Because
> for me it is the bug when the mail having the correct attachment inside in
> term of integrity can't be saved on the drive whatever the type would be -
> even as binary data. The reason of that is much trivial than lack of
> information about a real mime type. It is only the internal mechanism of URL
> that crashes.

I don't think that is this bug, though I admit there is some confusion about the actual definition of this bug. I am trying to fix "Cannot open PDF attachment in some mails (Content-Type: =?windows-1252?q?application/pdf)"

If you want to propose that we accept invalid content types by escaping them so that URL works, I suggest that you do that in a separate bug.

I suggest to you, *again*, that if you want to further discuss this bug, that you do so in the terms of comment 47, or propose your own carefully defined categories. I don't really understand what you are objecting to in the current proposal beyond what I have already stated in comment 65
Comment 69 :Irving Reid (No longer working on Firefox) 2012-07-27 10:13:05 PDT
Content-Type: has a very limited set of valid content in practice. In particular, = ? & are all not allowed. Marek's proposed patch does prevent Thunderbird's URL infrastructure from failing on those characters, but I'd prefer that we reject them as invalid in the parser rather than let them through and then try to deal with the subsequent issues.

However, since the vast majority of problems we have with content-type have a particular format (the =?charset...? prefix), we can do a bit better than just discarding the content-type as invalid. In all the real-world cases I've seen, truncating the prefix off the C-T results in usable data. If we truncate and the result is still unknown or invalid, then it makes sense to fall back on using the file extension to figure out what the default should be.

The end result is the same - we won't pass bad characters into the Content-Type field in the URL builder, so the attachment will be saved and displayed correctly.
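
A minimal sketch of that truncate-then-fall-back order, in Python rather than libmime's C++. The validity pattern and the small extension table are assumptions made for this illustration; they are not the rules or the default mapping that Thunderbird actually uses.

```python
import os
import re

# Deliberately conservative shape check; an assumption, not libmime's rule.
_MEDIA_TYPE = re.compile(r"[A-Za-z0-9][A-Za-z0-9.+-]*/[A-Za-z0-9][A-Za-z0-9.+-]*")

# Hypothetical hard-wired extension table, for illustration only.
_BY_EXTENSION = {".pdf": "application/pdf", ".txt": "text/plain"}


def repair_content_type(value: str, filename: str) -> str:
    # Step 1: drop a wrongly applied "=?charset?q?" prefix.
    if "?" in value:
        value = value[value.rfind("?") + 1:]
    # Step 2: keep the result if it looks like type/subtype.
    if _MEDIA_TYPE.fullmatch(value):
        return value
    # Step 3: otherwise guess from the file extension.
    ext = os.path.splitext(filename)[1].lower()
    return _BY_EXTENSION.get(ext, "application/octet-stream")


print(repair_content_type("=?windows-1252?q?application/pdf", "S006631414_8192204.pdf"))
```

Either way, the value handed on to the URL builder contains none of = ? &.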
Comment 70 WADA 2012-07-28 01:12:03 PDT
(In reply to Kent James (:rkent) from comment #68)
> I am trying to fix "Cannot open PDF attachment in some mails (Content-Type: =?windows-1252?q?application/pdf)"

How can we know that this bug is only for the "valid mime-type is wrongly rfc2047 encoded" case?
What is the reason not to include in this bug a solution for a case such as the absolutely-broken "& after last ? in mime-type", which is very easily guessed or derived from the phenomenon with the mime-type of =?windows-1252?q?application/pdf?
Would it be better to implement "escaping" and/or Irving Reid's "reject them as invalid in the parser" in a separate bug such as bug 739025, in line with your "If accepted, then any other issues would need to be done as separate bugs" in comment #53?

(In reply to Irving Reid (:irving) from comment #69)
> Marek's proposed patch does prevent Thunderbird's URL infrastructure from
> failing on those characters, but I'd prefer that we reject them as invalid
> in the parser rather than let them through and then try to deal with the subsequent issues.

Tb's failure in this bug (and some other bugs) is "Tb loses the &part=1.2 portion of the internal URL if a prohibited char is placed in the &type=<mime-type> portion etc.". If Tb loses the &part=1.2 portion, Tb currently uses part=1.1 (= the message body, the first text/plain or text/html in multipart/mixed) in such a case.
This is the phenomenon of "Cannot open PDF attachment" in this bug (and the duped bugs).

When the mime-type is malformed or broken, will "Rejecting in the parser" produce a URL like ...&type=Unknown... or a URL without &type?
If yes, there is no longer any need to add "escaping of mime-type for internal URL".

Can both "discarding chars before last ?" for the "valid mime-type is wrongly rfc2047 encoded" case and "Rejecting in the parser if broken mime-type" exist at the same time?
If that's possible, will "false valid mime-type in &type=... of URL due to substring of broken mime-type" surely be avoided?
As for "accessing a subpart from the &part=M.N portion of the URL", a "false valid mime-type in &type=... of URL" won't cause any problem, as long as a "part validity check" is not executed based on &type=... or &name=... of the URL. However, on "Open" or "Save", if the string in &type=... is shown in the Open or Save dialog, a "false valid mime-type in &type=..." will confuse and/or mislead the user.
If the "broken mime-type in message header" is shown to the user as-is in the Open/Save dialog, such needless confusion or misleading is avoided.
Comment 71 Kent James 2012-07-28 11:11:41 PDT
(In reply to WADA from comment #70)
> (In reply to Kent James (:rkent) from comment #68)
> > I am trying to fix "Cannot open PDF attachment in some mails (Content-Type: =?windows-1252?q?application/pdf)"
> 
> How can we know that this bug is for "valid mime-type is wrongly rfc2047
> encoded" case only?

This is not a question of "knowing", it is a question of definition of the current bug. I think we know pretty clearly what is happening, the question is the scope that we will fix in this particular bug.

> What is reason not to include solution in this bug for case such as
> absolutely-broken "& after last ? in mime-type" which is very easily guessed
> or derived from phenomnon with mime-type of =?windows-1252?q?application/pdf?

As I understand "& after last ? in mime-type" you are proposing that we try to fix something that looks like this:

content-type:=?windows-1252?q?&application/pdf

or is it

=?windows-1252?q?application/pdf&foo=bar

If these are common issues, then I respectfully suggest that we will get them fixed more quickly if we quit arguing over this bug, get the patch landed that we all agree will solve an important common problem, and then consider other issues in followup bugs.

> We are better to implement "escaping" and/or "reject them as invalid in the
> parser" by Irving Reid in separated bug such as bug 739025, complying with
> your "If accepted, then any other issues would need to be done as separate
> bugs" in comment #53?
>

Yes I really think it is better if these other issues be handled separately.
Comment 72 WADA 2012-07-28 17:09:50 PDT
(In reply to Kent James from comment #71)
> As I understand "& after last ? in mime-type" you are proposing that we try
> to fix something that looks like this:
> content-type:=?windows-1252?q?&application/pdf (snip)

Not that one. Something like application&pdf, or like !#$%...<>?_...&=~|ABCDEFG..., sadly generated by a mail application programmer who easily/stupidly produced the mime-type of =?windows-1252?q?application/pdf by incorrectly applying RFC2047 encoding. Such a programmer may quickly type 38 (decimal of &) for 47 (decimal of /) :-)

In such a case, even after your valid solution for this bug, Tb saves the data of the wrong sub part (message body) even when the user requests "Save" instead of "Open", as current Tb does.
Even if any kind of broken mime-type is written in the message header, I believe Tb should at least save the correct part on "Save".
This is a request to protect Tb users from malformed mail. I keep bug 739025 open for such edge cases and some other actual cases.
Comment 73 Joe Sabash [:JoeS1] 2012-07-29 08:22:04 PDT
I guess SeaMonkey has a common interest here. added cc's
Comment 74 Joe Sabash [:JoeS1] 2012-07-29 08:37:57 PDT
Withdrawing checkin-needed til we get input from SeaMonkey folks.
Comment 75 Kent James (:rkent) 2012-07-29 13:06:34 PDT
Joe: I can do my own checkins, so please let me manage that.

WADA: "This is request of a protection of Tb user from malformed mail. I keep bug 739025 open for such edge case and some other actual cases."

This is the next issue that I want to work on, to stop invalid content types from occurring in outgoing messages. Is that the best bug for that?
Comment 76 WADA 2012-07-29 17:40:04 PDT
(In reply to Kent James (:rkent) from comment #75)
> This is the next issue that I want to work on, to stop invalid content types
> from occurring in outgoing messages. Is that the best bug for that?

For this kind of bad mime-type in outgoing messages from Tb, the following are better.
 bug 503309 : receives bad mime-type, puts it in mimeTypes.rdf, then Tb sends it
 bug 738284 : this bug's problem which is produced by mail sent from Tb
              (I dup'ed to this bug because major symptom was same as this bug)
Comment 77 :Irving Reid (No longer working on Firefox) 2012-07-30 08:35:24 PDT
Wada, I understand your concern about protecting Thunderbird from other forms of corruption in the Content-Type field in incoming messages. I agree with the idea of keeping bug 739025 to track a solution to that problem.

Bug 503309 covers the case where we receive a bad Content-Type, store it, and then use it later to send other attachments.

I've been looking all over the place for the last week to see if I can identify where the bad Content-Type fields are coming from in the first place. Based on some Google searching, I can't find any mentions involving mail clients *other* than Thunderbird, so I'm operating under the assumption that it's our fault. If anyone has an example message with a bad C-T that was sent from a client other than Thunderbird, I'd love to see it.

I read through most of the mail compose code related to attachments, and I don't see a place where that code puts the Content-Type header through RFC 2047 encoding, though I have a couple of ideas for manual tests to explore that further if someone else wants to dig in.

I'm currently looking into the Exchange PST file import module, bug 645994
Comment 78 AMI 2012-07-30 09:02:52 PDT
Hello,

My company is affected by this problem. We correct it by replacing the mimeTypes.rdf file, but as soon as an outside email with a pdf attachment is received, the file is corrupted and the user again has a mimeTypes.rdf file with the line "=?windows-1252?q?application/pdf", which propagates the problem again to local profile users.

We currently use the solution proposed by Mohican (comment 20): the extension OpenAttachmentByExtension, but this solution is not stable. As indicated previously, at the next update of Adobe, we will meet the problem again.

Is there a solution? Or will this problem be solved in the last version of Thunderbird?
Comment 79 Kent James (:rkent) 2012-07-30 09:48:50 PDT
AMI: As I understand it, the reviewed patch will have the same effect as Marek's patch for viewing emails with the specific content type error of this bug, it just has a different place it is applied and a different effect on other types of invalid content-type(s) (Please correct me Marek if I am incorrect).

But that is not the end of the story, and I think that is what you are saying. There are at least two other critical issues related to this that I hope to also take on soon. Those issues, and my current understanding of the bug for each of them, are:

1) preventing bad content types from getting into mimeTypes.rdf in the first place (that is bug 503309)

2) not using bad content types that already exist in mimeTypes.rdf in outgoing emails (that is currently bug 776246 though I suspect there is a better earlier bug).

I think that the issue you are discussing is bug 503309. This is an important issue to me, and I intend to drive this forward to ensure that it is landed by Thunderbird 17 (which by the way is NOT the "last version of Thunderbird").

As a slightly OT aside, a major concern of mine as I view the overall Thunderbird project is that bugs like this are not reliably given the priority they deserve. This bug is full of expressions of concern by "New to Bugzilla" posters, some of whom even spent their time to develop and submit patches, yet it has been a very slow and uncertain process to get core developers interested in this (I am a volunteer by the way, and not a paid core developer though I'm experienced enough to have checkin privileges).

This is not the correct forum to discuss this issue (and I am probably violating BMO convention by bringing it up at all here), but I would love to hear your frustrations and possible solutions to this privately to me at kent@caspia.com
Comment 80 Kent James (:rkent) 2012-08-01 11:45:15 PDT
Created attachment 648027 [details] [diff] [review]
unbitrotted patch landed

Checked in http://hg.mozilla.org/comm-central/rev/7675a02ee3aa
Comment 81 WADA 2012-08-01 15:22:24 PDT
(In reply to Irving Reid (:irving) from comment #77)
> Based on some Google searching, I can't find any mentions involving mail clients
> *other* than Thunderbird, so I'm operating under the assumption that it's our fault.

As I wrote in bug 580971 comment #11, User-Agent in bug reports at B.M.O was always Thunderbird. But, as I wrote in bug 580971 comment #14, at least one Web site sends HTTP header of "Content-Type: =?iso-8859-1?q?application/pdf" for PDF file. 
So, I believe the phenomenon we are looking at is the following.
(1-a) At a Web site, =?iso-8859-1?q?application/pdf is put in .htaccess,
      and Web Mail or a Web application sends =?iso-8859-1?q?application/pdf
      to the mail recipient.
(1-b) At a Web site, a stupid Web application applies RFC2047 encoding to the mime-type,
      puts it in the Content-Type: header, and sends it to the mail recipient.
(2) Tb has the capability to deliver the corrupted mime-type to many people
    pretty easily, silently, like a virus program :-)
(1-a)/(1-b) is never Tb's fault. The fault in Tb is in (2) only.

By the way, please read Bug 685112 for "what is shown by Tb if the mime-type is like =?iso-8859-1?q?application/pdf" (the opener of Bug 685112 is the first reporter of it). You can see "what Tb does when ?, &, = etc. are not escaped in the internal URL".
Comment 82 WADA 2012-08-02 18:29:48 PDT
(In addition to comment #81)

As I wrote in bug 580971 comment #15, Firefox can deliver =?iso-8859-1?q?application/pdf to a Web site or a Web mail once Firefox has recognized this kind of broken mime-type.

Because Firefox can download a PDF file with "Content-Type: =?iso-8859-1?q?application/pdf" and can save a PDF attachment with "Content-Type: =?iso-8859-1?q?application/pdf" in mail via Web Mail, and because "silent creation of such an entry in mimeTypes.rdf" was initially Firefox's feature and was ported to Thunderbird, Firefox can send an HTTP header of "Content-Type: =?iso-8859-1?q?application/pdf" upon PDF file upload and can force a message header of "Content-Type: =?iso-8859-1?q?application/pdf" in a Web mail.
Comment 83 Joe Sabash [:JoeS1] 2012-08-02 19:07:11 PDT
Tested with a good pdf and the following bad c-t
Content-Type:=?windows-1252?q?&application/pdf
 name="S006631414_8192204.pdf"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename="S006631414_8192204.pdf"
Mozilla/5.0 (Windows NT 5.1; rv:17.0) Gecko/17.0 Thunderbird/17.0a1
20120802030533
Comment 84 Joe Sabash [:JoeS1] 2012-08-02 19:59:20 PDT
(In reply to WADA from comment #82)
> (In addition to comment #81)
> 
> As I wrote in bug 580971 comment #15, Firefox can deliver
> =?iso-8859-1?q?application/pdf to a Web site or a Web mail once Firefox
> recognized this kind of broken mime-type.
> 
> Because Firefox can Download PDF file of "Content-Type:
> =?iso-8859-1?q?application/pdf" and can save PDF attachment of
> "Content-Type: =?iso-8859-1?q?application/pdf" in mail via Web Mail, and
> because "silent creation of such entry in mimeTypes.rdf" was initially
> Firefox's feature and was ported to Thunderbird, Firefox can send HTTP
> header of "Content-Type: =?iso-8859-1?q?application/pdf" upon PDF file
> upload and can force Message header of "Content-Type:
> =?iso-8859-1?q?application/pdf" in a Web mail.

This should be completely invisible to TB because of where Kent placed this patch.
The only content-type that tb sees is application/pdf
The garbage is stripped off.
Comment 85 martin.bodin 2012-10-30 07:14:22 PDT
> Thus, even if charset definitions were allowed, RFC 2047 Section 2 would
> require them to have a "=?" charset "?" encoding "?" encoded-text "?="
> syntax, implying "=?windows-1252?q?application?=/pdf;" at best.

Hi,
I’m not very used to Thunderbird’s internal source code, but from what I’ve read of your patch, you just remove up to the last question mark ‘?’.  But what if a mime type (ill-typed, undoubtedly, but “more” correct than “=?windows-1252?q?application/pdf”) was declared as rsx11m said, that is “=?windows-1252?q?application?=/pdf”?
Removing up to the last ‘?’ gives “=/pdf”, which is still not correct.

Of course, both of those mime types are incorrect, but I think it would be more consistent to first eliminate “=?”/“?=” pairs and then use this trick of removing up to the last ‘?’.
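
One possible reading of this two-step idea, sketched in Python. This is an assumption about what "eliminating the pairs" would mean (here: dropping the closing “?=” marker when the value opens with “=?”); it is not what the landed patch does, and the function name is made up for this example.

```python
def strip_encoded_word_wrapper(content_type: str) -> str:
    """Remove a closing "?=" marker first, then cut up to the last '?'."""
    if content_type.startswith("=?"):
        close = content_type.rfind("?=")
        if close > 1:
            # Drop the two characters of the closing "?=" marker.
            content_type = content_type[:close] + content_type[close + 2:]
    if "?" in content_type:
        # Same rule as before: keep only what follows the last '?'.
        content_type = content_type[content_type.rfind("?") + 1:]
    return content_type


print(strip_encoded_word_wrapper("=?windows-1252?q?application?=/pdf"))
print(strip_encoded_word_wrapper("=?windows-1252?q?application/pdf"))
```

Under this reading both forms would come out as application/pdf, where cutting at the last ‘?’ alone leaves “=/pdf” for the first one.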

As I’m new to Thunderbird’s code, I’m of course not sure how best to proceed…

Best,
Martin.
Comment 86 Joshua Cranmer [:jcranmer] 2012-11-01 08:09:37 PDT
(In reply to martin.bodin from comment #85)
> > Thus, even if charset definitions were allowed, RFC 2047 Section 2 would
> > require them to have a "=?" charset "?" encoding "?" encoded-text "?="
> > syntax, implying "=?windows-1252?q?application?=/pdf;" at best.
> 
> Hi,
> I’m not very used to Thunderbird’s internal source code, but from what I’ve
> read of your patch, you just remove up to the last question mark ‘?’.  But
> what if an (ill typed, undoubtely, but “more” correct than the
> “=?windows-1252?q?application/pdf”’s one) mime type was declared as rsx11m
> said, that is “=?windows-1252?q?application?=/pdf”.
> Removing up to the last ‘?’ gives “=/pdf”, which is still not correct.

The Content-Type you mention has not been observed to exist in the wild. When fixing this bug, an explicit decision was made to limit the fix to only the kinds of Content-Types observed to exist in the wild and not preemptively attempt to fix other possible malformed ones.
Comment 87 Guenael SANCHEZ 2012-11-30 00:06:15 PST
Hello,

I can confirm the bug is fixed in TB17, thanks for your work !

Any ideas if this patch will be integrated in ESR version ? It would be great, since we use (like many others) TB ESR in our company, and the "I cannot open my PDF" problem is really annoying !

Best,
GS
Comment 88 Vincent (caméléon) 2012-11-30 00:16:45 PST
(In reply to Guenael SANCHEZ from comment #87)
> Any ideas if this patch will be integrated in ESR version ? 

Well, Thunderbird 17 is the new ESR release, so I suspect it won't be back ported to TB10 ESR...
Comment 89 Guenael SANCHEZ 2012-11-30 00:42:00 PST
(In reply to Vincent (caméléon) from comment #88)
> (In reply to Guenael SANCHEZ from comment #87)
> > Any ideas if this patch will be integrated in ESR version ? 
> 
> Well, Thunderbird 17 is the new ESR release

True! I hadn't noticed that! Just tested it against buggy emails, and everything was good!
Comment 90 Jerome92 2013-06-25 01:49:22 PDT
Hi,
I'm using TB 17.0.6 on Windows 7 64bits.
I still have the pdf issue.
When I save the pdf and then try to open it, I'm getting this message in my PDF application:
"Error PDF structure 40: invalid reference table (xref).
Note: the format of source is not PDF"
Besides, I noticed this issue is linked to another one: my contact sent me 2 pdfs, and TB only retrieved one, that brings this error message.
Comment 91 WADA 2013-06-25 02:45:40 PDT
(In reply to Jerome92 from comment #90)
> I still have the pdf issue.

As written in the bug summary, and as clearly written in the comments about the patch for this bug, this bug is for a malformed mime-type where "rfc2047 encoding is incorrectly/stupidly applied", for example, =?windows-1252?q?application/pdf.
This bug is not a "bug for generic pdf related problems".

Is your problem actually caused by a mime-type like =?windows-1252?q?application/pdf in the Content-Type: header in the message source?
If not, it's not a problem covered by this bug.
Please read all bugs pointed to in this bug carefully, and search bugzilla.mozilla.org carefully for already opened bugs for your problem.
