Closed Bug 142517 Opened 22 years ago Closed 18 years ago

corrupted pdf-attachments

Product/Component: MailNews Core :: Attachments
Type: defect
Priority: Not set
Severity: critical
Tracking: Not tracked
Status: RESOLVED WORKSFORME
Reporter: sts
Assigned: mscott
Keywords: dataloss
Attachments: 4 files

The mail client doesn't deliver PDF attachments correctly. When received, the
attachment can be opened, but not viewed. The error reported by Acrobat Reader
v. 5.0 is: unknown token '4Vjj\
Please always include build ID in bug-reports.

How are you trying to open the attachment?
Is it corrupted in the sense that it can not be saved and opened from disk either?
Could this be a dupe of bug 133638?
Depends on: 133638
gabriel: I don't think it has anything to do with that bug. That one is a
Linux MIME type bug.

Also: I just tested sending a PDF attachment to myself, and the file opened
without problems both when I double-clicked it and when I used "Open" in the
context menu. (It opens IN mailnews, but that's another bug.) 2002050504, XP, Acrobat 5.0.1
Reporter: In addition, please also indicate whether you use the Acrobat plugin,
or are using it as a helper application.

(Click link to bug near top of bugmail and use form for "Additional Comments" on
web in order to reply)
I tried to open the attachment with Acrobat Reader 5.0 as well as with Acrobat
Exchange 3.01. The file was created with Acrobat Exchange. Sent with MS Outlook,
the file could be opened without any problems. 
*** Bug 144682 has been marked as a duplicate of this bug. ***
*** Bug 148667 has been marked as a duplicate of this bug. ***
Confirming as NEW based on dups.
Severity: normal → major
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: dataloss
Sending PDF attachments with Mozilla builds
Mozilla 1.0
Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.0.0) Gecko/20020530
and
Mozilla 1.1a
Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.1a) Gecko/20020611

on my PC (Windows 2000, fully updated from the Windows site)
results in corrupted PDFs.

Sending PDFs with the OS/2 Warp build 1.0 rc3 causes no trouble.

Trying to read the PDFs with Acrobat Reader 5.05 gives the error
'There was an error opening this document. The file is damaged and could not be
repaired'.
Using Ghostview 7.04 for Windows gives:
GSview 4.2 2002-02-07
Failed to load e:\ghostview704\gs7.04\bin\gsdll32.dll, error 126
The specified module could not be found.

AFPL Ghostscript 7.04 (2002-01-31)
Copyright (C) 2001 artofcode LLC, Benicia, CA.  All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Scanning PDF file
Error: /syntaxerror in readxref
Operand stack:

Execution stack:
   %interp_exit   .runexec2   --nostringval--   --nostringval--  
--nostringval--   2   %stopped_push   --nostringval--   --nostringval--   false
  1   %stopped_push   1   3   %oparray_pop   1   3   %oparray_pop   .runexec2  
--nostringval--   --nostringval--   --nostringval--   2   %stopped_push  
--nostringval--   --nostringval--   --nostringval--   --nostringval--  
--nostringval--   --nostringval--
Dictionary stack:
   --dict:1015/1123(ro)(G)--   --dict:0/20(G)--   --dict:75/200(L)--  
--dict:97/127(ro)(G)--   --dict:220/230(ro)(G)--   --dict:14/15(L)--
Current allocation mode is local

Please also add Platform = Mac OS 9.
An incoming PDF attachment and a .jar file were typed as StuffIt Text documents.
Communicator 4.78 handled the same message and attachments OK.
Attached file Original File
Hi,

I can reproduce this on my Linux system, Mozilla/5.0 (X11; U; Linux i686; en-US;
rv:1.0.0) Gecko/20020529. There is a slight difference between the file sizes:
-rw-r--r--    1 rnc	 man	     55114 Aug 13 15:10 Untitled1.pdf
-rw-r--r--    1 rnc	 man	     55113 Aug 13 15:13 Untitled1_Sent.pdf
I have attached both PDFs.
Opening this file gives
 There was an error opening this document. The root object is missing or
invalid.
from Acrobat Reader 5.0
and
Error: /syntaxerror in readxref
Operand stack:

Execution stack:
   %interp_exit   .runexec2   --nostringval--	--nostringval--  
--nostringval--   2   %stopped_push   --nostringval--	--nostringval--  
--nostringval--   false   1   %stopped_push   1   3   %oparray_pop   1	 3  
%oparray_pop   --nostringval--	 --nostringval--   --nostringval--  
--nostringval--   --nostringval--   --nostringval--
Dictionary stack:
   --dict:1050/1123(ro)(G)--   --dict:0/20(G)--   --dict:72/200(L)--  
--dict:72/200(L)--   --dict:97/127(ro)(G)--   --dict:223/230(ro)(G)--  
--dict:14/15(L)--
Current allocation mode is local
ESP Ghostscript 7.05.4: Unrecoverable error, exit code 1
On second thought, I'm not sure gabriel@pixle.demon.co.uk wasn't on the right
track with this being a dup of bug
http://bugzilla.mozilla.org/show_bug.cgi?id=133638

I had PDF set as MIME type 'pdf' (yes, doh!); after changing it to
'application/pdf', everything works fine (in Mozilla 1.1).

Reading the help it says a mime type should be two words separated by a slash -
couldn't a simple parse be done on the text field for error checking?
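
The error check suggested here could look roughly like the sketch below. It is only an illustration in Python, not Mozilla's preferences code: the function name `looks_like_mime_type` is hypothetical, and the character class is a conservative approximation of the RFC 2045 "token" rule (it omits a few rarely used characters that the RFC allows).

```python
import re

# Conservative approximation of an RFC 2045 "token": printable ASCII without
# spaces, control characters, or separators such as "/" and ";".
_TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
_MIME_TYPE = re.compile(_TOKEN + "/" + _TOKEN)


def looks_like_mime_type(value):
    """Return True if value has the shape "type/subtype" (hypothetical helper)."""
    return _MIME_TYPE.fullmatch(value.strip()) is not None


for candidate in ("application/pdf", "pdf", "*.octet-stream"):
    print(candidate, "->", looks_like_mime_type(candidate))
```

A check of this shape would reject both the bare `pdf` entry described above and any value that lacks the slash entirely.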
QA Contact: trix → yulian
QA Contact: yulian → stephend
By the definitions on <http://bugzilla.mozilla.org/bug_status.html#severity> and
<http://bugzilla.mozilla.org/enter_bug.cgi?format=guided>, crashing and dataloss
bugs are of critical or possibly higher severity.  Only changing open bugs to
minimize unnecessary spam.  Keywords to trigger this would be crash, topcrash,
topcrash+, zt4newcrash, dataloss.
Severity: major → critical
Report #13 is correct. PDF files get corrupted if you have "pdf" in
Preferences -> Helper Applications instead of "application/pdf". Having both
"application/pdf" and "pdf" (in that order) still causes the same problem.
After removing "pdf", the problem is gone.

I had this "bug" in 1.21 on Linux (Mandrake 9.0).
Still present in Mozilla 1.4. It can be bypassed by removing the octet/stream
plugin in Preferences > Plugins.
I can only reproduce it if step 15 is followed; otherwise, sending PDFs works fine.
I experienced the same problem (I couldn't open PDF attachments in mail anymore)
and solved it by removing all Helper Applications entries. I was using

Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.6) Gecko/20040113
MultiZilla/1.6.0.1g

Unfortunately I don't know which of the entries (I had about six) was the source
of the problem.
*** Bug 210056 has been marked as a duplicate of this bug. ***
*** Bug 247343 has been marked as a duplicate of this bug. ***
*** Bug 189248 has been marked as a duplicate of this bug. ***
*** Bug 217165 has been marked as a duplicate of this bug. ***
Somehow, improper entries are ending up in mimetypes.rdf.  Does anyone know how
this could happen?

I've had a few users experience the problem, one repeatedly, but none of them
remember doing anything to change the setting.

Also, I think we should change this to a bug about mimetypes.rdf, not pdf files.
 I've seen the same problem with other file types, such as WordPerfect.


More detailed notes:

 * In the cases I've seen, it's caused by an improper entry in mimetypes.rdf. 
When we fix mimetypes.rdf, it fixes the problem.

 * The problem occurs when the file is attached/encoded in the e-mail.  You can
see it in the source of the Sent folder.

 * The source of the e-mail will clearly show which attachments are corrupted. 
Here's a corrupted attachment (truncated):

-------------------------

Content-Type: *.octet-stream;
 name="rem.PDF"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline;
 filename="rem.PDF"

%PDF-1.2
1 0 obj
<<
/Type /Info
/Producer (MkPDF 1.02)
/CreationDate (D:20010917143029)
stream
=FF=FF=FF=FF=FF=FF=FF=FF=FF=FF=FF=FF=FF=FF=FF=FE[=3DI=B1\=B2=BC=043=07>=10=
=8A=1F=EDr4=1F=FF=84=1F=DF=C2}=7FO=BF}=FF=D2O=FF=AF=FF=90=B6=FE=DF=EF=FF=D2=
=FF=FE=DF=FE=BF=FF=FF=DD=11G=7F=FF=DF=FF=FE=BF=FB=FF=BF=FF_=FF=FB=F2=DCl=10=

-------------------------


Same attachment, not corrupted (truncated):

-------------------------

Content-Type: application/pdf;
 name="rem.PDF"
Content-Transfer-Encoding: base64
Content-Disposition: inline;
 filename="rem.PDF"

JVBERi0xLjIKMSAwIG9iago8PAovVHlwZSAvSW5mbwovUHJvZHVjZXIgKE1rUERGIDEuMDIp
Ci9DcmVhdGlvbkRhdGUgKEQ6MjAwMTA5MTcxNDMwMjkpCi9DcmVhdG9yIChUb3NoaWJhIE1G
UCBDb250cm9sbGVyKQo+PgplbmRvYmoKMiAwIG9iago8PAovVHlwZSAvQ2F0YWxvZwovUGFn
ZXMgMyAwIFIKPj4KZW5kb2JqCjMgMCBvYmoKPDwKL1R5cGUgL1BhZ2VzCi9LaWRzIFsgNCAw

-------------------------


* One user has experienced the problem multiple times, and on one occasion seems
to have undone the malfunctioning setting. The user claims to be unaware of how
he does it.
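
The two header blocks quoted above differ in ways that can be checked mechanically. The sketch below, using Python's standard `email` package, flags an attachment part whose Content-Type is not of the form `type/subtype`, or whose file name ends in `.pdf` while the transfer encoding is not base64. The function name `attachment_problems` and both rules are assumptions made for illustration; they are not part of Mozilla.

```python
from email import message_from_string


def attachment_problems(part_source):
    """List suspicious header values in one MIME part (hypothetical helper)."""
    part = message_from_string(part_source)
    problems = []

    # Take the raw header value so a malformed type is seen as written.
    raw_type = (part.get("Content-Type") or "").split(";", 1)[0].strip()
    if raw_type.count("/") != 1 or raw_type.startswith("*"):
        problems.append("malformed Content-Type: " + repr(raw_type))

    encoding = (part.get("Content-Transfer-Encoding") or "").strip().lower()
    filename = part.get_filename() or ""
    if filename.lower().endswith(".pdf") and encoding != "base64":
        problems.append("PDF sent with encoding " + repr(encoding))
    return problems


corrupted = (
    'Content-Type: *.octet-stream;\n name="rem.PDF"\n'
    "Content-Transfer-Encoding: quoted-printable\n"
    'Content-Disposition: inline;\n filename="rem.PDF"\n'
    "\n%PDF-1.2\n"
)
intact = (
    'Content-Type: application/pdf;\n name="rem.PDF"\n'
    "Content-Transfer-Encoding: base64\n"
    'Content-Disposition: inline;\n filename="rem.PDF"\n'
    "\nJVBERi0xLjIK\n"
)
print(attachment_problems(corrupted))
print(attachment_problems(intact))
```

Running something like this over the parts of a message in the Sent folder would point at the attachments that were encoded with the wrong settings.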
*** Bug 230984 has been marked as a duplicate of this bug. ***
Product: MailNews → Core
I have this same or very similar problem with Mozilla 1.8a4.  When attempting to
open a pdf file received via email I get an error with acrobat reader 6.02:
   An unrecognized token ',token_value>' was found.

When I saved the file and tried to open it again I got the same error.  When I
tried Acrobat 5.0 I got the same error, but first 5.0 reported:
   There was an error processing a page.  Too few operands.

I tried opening pdf files received by email before I upgraded to 1.8a4 and
they also failed.  These had been opened successfully with v1.7.3.

Further, pdf files I had downloaded from the internet or ones I had created
myself had no problems or errors with either the reader or acrobat.

I solved my issue by re-installing Mozilla 1.7.3.


Dave Jeziorski
see also bug #276890
Assignee: mscott → mscott
OS: Windows 2000 → All
Hardware: PC → All
I've found a way to duplicate the bug, but in Mozilla Suite 1.3. I
don't have a more recent version of Mozilla Suite installed
anywhere, and testing in Firefox 1.0 didn't duplicate the bug
(which probably wouldn't affect Thunderbird anyway).

To duplicate in Moz Suite 1.3 on Win2K:

1)  Create a Netscape webmail account (very quick)

2)  Send a PDF file to it.

3)  In Netscape webmail, click the attachment, then click Save to
    Disk, then click Save in the Windows Save dialog box.

4)  Send an e-mail from Mozilla (not from webmail) with an
    attached PDF.  The attachment will be corrupted.

When you click Save in the Windows Save dialog in step 3,
mimetypes.rdf gets modified.  I'll upload before and after
examples of the file.  Replacing mimetypes.rdf with a good copy
(or just deleting it and letting Mozilla recreate it) fixes the
problem.

If you look at the source of step 4's message in the Sent folder
you'll see the content-type is wrong and encoding incorrect (per
comment #23, above).


I also tested Yahoo webmail. Yahoo modifies mimetypes.rdf at the
same point, but outgoing attachments aren't corrupted.

Finally, I tested it with a WordPerfect attachment in Netscape
(since that's my client's problem), and I duplicated the problem.


Other relevant bugs:  Bug 67940, bug 59631, bug 276890
I created this file by deleting the previous mimetypes.rdf and restarting Moz
1.3.  I copied it before I clicked Save in the Windows Save dialog (see
previous comment).
I copied this file immediately after clicking Save in the Windows Save dialog. 
Again, see above comment for details.
From comment 27:
> Finally, I tested it with a WordPerfect attachment in Netscape

I meant, I tested it in Netscape webmail, using Moz 1.3.
The corruption of the PDF file attached to comment 11 results from missing
handling of mixed newlines in text file detection.

If the MIME type does not enforce base64, nsMsgAttachmentHandler::PickEncoding()
selects quoted-printable or 8bit if less than 10% of the characters are
considered unprintable.

But PDF files contain a mix of CR, LF and CRLF in their binary sections, all of
which are converted to CRLF in the mail. Saving the attachment later converts
these to the newline of the target system, which is LF (Linux) in comment 12.

This issue is addressed in bug 269390 which has a patch.
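
The effect described above can be simulated in a few lines of Python. This is a sketch of the round trip only (normalise every line break to CRLF for the mail, then to LF on a Linux target); the helper names and the sample bytes are made up for illustration, and it is not the code path Mozilla uses.

```python
import re


def to_crlf(data):
    """Treat CRLF, lone CR and lone LF all as line breaks and emit CRLF."""
    return re.sub(rb"\r\n|\r|\n", b"\r\n", data)


def to_native_lf(data):
    """Convert CRLF line breaks to the LF convention of a Linux target."""
    return data.replace(b"\r\n", b"\n")


# Made-up "binary" payload containing one CRLF and one lone CR between LFs.
original = b"%PDF-1.2\nstream\n\x00\x01\r\n\x02\r\x03\n\x04\nendstream\n"
round_tripped = to_native_lf(to_crlf(original))

print(len(original), len(round_tripped))
print(round_tripped == original)
```

Each CRLF in the binary data loses one byte and each lone CR is silently changed to LF, which is consistent with the one-byte size difference between the two files attached earlier.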
This bug is possibly fixed by the patch from bug 269390 comment 7 (attachment 200327).
Steffen Schliesing, Nick Cross: can you confirm that the bug has been fixed?  

Since the difference in the files provided by Nick Cross involves a conversion of two bytes (0x0D 0x0A) to one (0x0A), it seems likely that the patch from bug 269390 (which is in SeaMonkey 1.0 and TB 1.5) has taken care of that problem.

Note that another fix for corrupted PDF files has been implemented more recently: bug 308121 has been fixed by way of bug 317009. This patch is in nightly builds for both the 1.8 branch and the trunk, and will be in the x.x.0.2 releases of those platforms.


The problems detailed by guanxi starting at comment 23 are not addressed by either of those bugs. The details provided by comment 27, plus the examples of corrupted data in comment 23, imply that Netscape Webmail started the problem by incorrectly sending a PDF with some bogus MIME type ("*.octet-stream"?), which Mozilla then accepted as the default MIME type for a PDF. Annoying, but fixable (in the suite) by editing the settings in the Preferences for "Helper Applications."
I noticed it already counts the number of low unprintable characters in m_ctl_count, but that count never gets used elsewhere. Since none of those are international characters, is it pretty safe to assume that if any of these characters exist, then the file is binary enough to warrant base64 encoding? A better check would also look to see if it uses more than one type of CR/LF/CRLF ending, but it's an easy one-line semi-fix to just add an "if(m_ctl_count>0) ...".
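
To make the proposed heuristic concrete, here is a sketch in Python rather than in the C++ of nsMsgAttachmentHandler::PickEncoding(). Everything in it is an assumption for illustration: the function name `pick_encoding`, the definition of a "low unprintable" byte (below 0x20, excluding tab, LF and CR), and the way the 10% threshold mentioned earlier is applied to high-bit bytes. It does not reproduce the real implementation.

```python
def pick_encoding(data, unprintable_threshold=0.10):
    """Choose a transfer encoding for data (hypothetical sketch)."""
    if not data:
        return "7bit"

    # Assumed meaning of the control-character count: bytes below 0x20
    # other than tab, LF and CR.
    ctl_count = sum(1 for b in data if b < 0x20 and b not in (0x09, 0x0A, 0x0D))
    high_count = sum(1 for b in data if b >= 0x80)

    # The suggested one-line check: any control byte means "binary".
    if ctl_count > 0:
        return "base64"
    if high_count / len(data) >= unprintable_threshold:
        return "base64"
    return "quoted-printable" if high_count else "7bit"


print(pick_encoding(b"hello\r\n"))
print(pick_encoding(b"caf\xe9 au lait, s'il vous pla\xeet, merci beaucoup"))
print(pick_encoding(b"%PDF-1.2\nstream\n\x00\x01 text text text\n"))
```

With this rule, text containing a few international characters still gets quoted-printable, while a mostly printable file with even one control byte falls back to base64.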
No response from reporter(s)  => WFM
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → WORKSFORME
Product: Core → MailNews Core