Closed Bug 1414459 Opened 7 years ago Closed 1 year ago

tar.gz email attachments are mangled when downloading from office365.com

Categories

(Core :: Networking, defect, P3)

52 Branch
defect

Tracking

()

RESOLVED FIXED

People

(Reporter: RossBoylan, Unassigned)

Details

(Whiteboard: [necko-triaged])

Attachments

(1 file)

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0
Build ID: 20171005074949

Steps to reproduce:

Sent an email that had a .tar.gz file (produced by R) as an attachment to myself or another user @ucsf.edu. The mail is hosted at office365.com. Use Firefox to log in to Outlook on the web (OWA) and find the message. The file shows as the correct size, 211KB. Click on the attachment and select download.

Possibly related bugs: 610679 (refers specifically to double compression), 714805, 233407, 902503, 1269101.

Actual results:

The resulting downloaded file is 325KB, and is corrupt as far as R or tar is concerned. Inspecting the file with unix file (in cygwin) says it is a gz file with lowest compression/highest speed. gunzip produces a file named xxx.tar, but file shows this as a gz file with maximum compression. I'm not sure if it's identical to the original attachment, but it is about the right length. If renamed to end in .tar.gz, the resulting file is usable.

So the file that ends up on my disk is my original file with a second layer of gzip compression on top of it. The second layer of "compression", apart from making the file uninterpretable, makes it ~50% larger.

Expected results:

The file in the attachment, my original file, should have ended up on my disk. The correct behavior happens with either MS IE or Chrome.

I have tried this on multiple physical systems, multiple versions of Windows (7 and 2012 R2), and multiple versions of FF (32 bit ESR or 64 bit current), and probably multiple versions of the other browsers too. All systems are running 64 bit Windows. I think I've seen the same problem with FF on Debian GNU/Linux 64 bit, but am not 100% sure. And, since the original problem happened when I sent the file to someone else, it's been tested under a different user (who used FF on Win7). A campus tech support person experienced the same problem with the same file, so at least 3 people have experienced the same problem.
I do not know what protocols are being used when I hit download for the file attachment.
On the protocols used, I did the download again with the console open. It shows

GET https://outlook.office.com/owa/service.svc/s/GetFileAttachment

and the response headers include

Cache-Control: private
Content-Disposition: attachment; filename*=UTF-8''LazarSim_0.9-1.tar.gz
Content-Encoding: gzip
Content-Type: application/gzip; authoritative=true; ...
Strict-Transport-Security: max-age=31536000; includeSubDomains
Vary: Accept-Encoding
X-BEServer: BY2PZ05MB1957
X-BackEnd-Begin: 2017-11-04T01:30:48.056
X-BackEnd-End: 2017-11-04T01:30:48.259
X-BackEndHttpStatus: 200
X-CalculatedBETarget: BYZPR05MB1957.namprd05.prod.outlook.com
X-Content-Type-Options: nosniff
X-DiagInfo: BZ2PR05MB1957
X-FEServer: BZ2PR02CA0123
X-Firefox-Spdy: h2
X-Frame-Options: SAMEORIGIN
X-OWA-Version: 15.20.218.7
X-UA-Compatible: IE=EmulateIE7

I've omitted other info that looked either irrelevant or possibly security-exposing.

P.S. It would be nice if I could copy and paste from the log without having to manually restore the formatting.
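The symptom described above (the saved file is the original .tar.gz wrapped in one extra gzip layer) can be checked and undone with a few lines of Python. This is only a sketch: the function name peel_extra_gzip_layer is made up for illustration, and it assumes that a gzip stream whose decompressed content is itself a gzip stream carries an unwanted outer layer. That assumption is wrong for a file that was deliberately gzipped twice.

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def peel_extra_gzip_layer(data: bytes) -> bytes:
    """Remove one gzip layer, but only if what is underneath is still gzip.

    Returns the input unchanged when it is not gzip, cannot be decompressed,
    or has only a single gzip layer.
    """
    if not data.startswith(GZIP_MAGIC):
        return data
    try:
        inner = gzip.decompress(data)
    except (OSError, EOFError, zlib.error):
        return data
    return inner if inner.startswith(GZIP_MAGIC) else data
```

To repair a mangled download, read the file as bytes, pass it through the function, and write the result back under the .tar.gz name.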
Component: Untriaged → Networking
Product: Firefox → Core
Hi Ross, I don't have an Office 365 hosted mail account, so I used outlook.live.com instead and sent an email from Gmail. The mail included an attachment, a tarball created by the command |tar -czvf foo.tar.gz bar.txt|. I then downloaded the file and checked its SHA256 digest: the downloaded file and the original are identical. We'd very much like to fix this issue, so more information would be very helpful. Would you: - provide the log by following the instructions at [1] - see if you can reproduce this on a public system, so we can look into it more easily. Thanks. [1] https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging
Flags: needinfo?(RossBoylan)
Status: UNCONFIRMED → RESOLVED
Closed: 7 years ago
Resolution: --- → INCOMPLETE
Flags: needinfo?(RossBoylan)
PLEASE REOPEN. Still seeing this in FF 60.0.2. Sorry, when I saw the bug was resolved I thought it meant it was fixed. I've attached the requested log file. As far as I can tell from the interactive debugging tools, the attachment is being requested with a GET and the file type is x-gzip. The reported download size is the inflated size of 377KB rather than the original size of 247KB. Works fine in Chrome. Should I compress the log? I don't see instructions to do so, so I have not.
See my previous comment. The file was too big to upload, so I did end up compressing it (using 7-zip).

I'm also getting this. Simple steps to reproduce are download
https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-1.8/hdf5-1.8.12/src/hdf5-1.8.12.tar.gz
via firefox and via wget and compare. The first ten lines of hexdump -C of each are

via FF:

00000000  1f 8b 08 00 00 00 00 00  00 03 6c bc 63 b0 b0 3d  |..........l.c..=|
00000010  cc 2e ba 6c db b6 6d af  67 d9 b6 6d db b6 6d db  |...l..m.g..m..m.|
00000020  b6 6d db b6 ed f3 7e 7b  9f 3d 67 ff 38 9d 3b 93  |.m....~{.=g.8.;.|
00000030  49 d2 de 33 e9 b4 69 ae  99 b4 f8 51 10 00 f5 24  |I..3..i....Q...$|
00000040  b1 8a 40 c0 d7 3c 3f 76  5b ed 89 3f 2d bf d4 e1  |..@..<?v[..?-...|
00000050  6d f0 eb cf 58 d4 0d c8  6e 5d bc 3c fb 9c ad 49  |m...X...n].<...I|
00000060  ee 6b 11 41 eb 36 75 51  c0 de 8d ff b4 b6 c7 0b  |.k.A.6uQ........|
00000070  63 93 0f dd fd da fb 7e  4c c5 25 4b 7b b3 7b dd  |c......~L.%K{.{.|
00000080  6c 1e 6c 15 80 bb 5a 4c  43 99 87 0e ec f9 28 b8  |l.l...ZLC.....(.|
00000090  02 fa 2e 5b b8 67 b5 ca  96 78 3b 1c f0 cf 23 43  |...[.g...x;...#C|

and via wget

00000000  1f 8b 08 00 b0 24 8e 52  02 03 ec 3c fd 6f db b8  |.....$.R...<.o..|
00000010  92 fd b5 fe 2b 88 b7 0f  d8 f4 1a 2b b1 13 77 bb  |....+......+..w.|
00000020  3d 3c e0 39 b6 93 78 d7  89 83 d8 6d af 8b 03 7c  |=<.9..x....m...||
00000030  b2 44 5b dc c8 a2 8e 94  e2 78 7f b8 bf fd 66 48  |.D[......x....fH|
00000040  4a a6 be 9c be 77 6d f1  6e 51 03 bb b5 66 86 33  |J....wm.nQ...f.3|
00000050  c3 e1 70 3e 28 3a 81 bf  ea b5 3b ce 5b a7 d3 3d  |..p>(:....;.[..=|
00000060  b9 71 1f e8 8a 85 f4 c5  17 fe 9c c2 e7 cd f9 39  |.q.............9|
00000070  fc db eb 9e ab 7f 4f 3b  9d b3 33 05 87 0f c0 de  |......O;..3.....|
00000080  bc e8 74 bb e7 67 e7 dd  ce f9 4f 80 ef 9c 75 7b  |..t..g....O...u{|
00000090  dd 17 e4 f4 c5 37 f8 a4  32 71 05 21 2f 68 cc 53  |.....7..2q.!/h.S|

My test was done on Firefox Developer Edition 68.0b11. I found this bug via
https://superuser.com/questions/162170/files-downloaded-with-firefox-are-often-corrupt
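The two hexdumps differ from the tenth byte onward, and the first ten bytes are the fixed gzip header defined in RFC 1952, so they can be decoded directly. The sketch below parses those bytes from the first line of each dump; the names GzipHeader and parse_gzip_header are illustrative only.

```python
import struct
from typing import NamedTuple


class GzipHeader(NamedTuple):
    method: int  # CM: 8 means deflate
    flags: int   # FLG bit field
    mtime: int   # MTIME: Unix timestamp, 0 means "no timestamp available"
    xfl: int     # XFL: for deflate, 2 = maximum compression, 4 = fastest
    os: int      # OS: 3 means Unix


def parse_gzip_header(first_bytes: bytes) -> GzipHeader:
    """Decode the fixed 10-byte gzip header (RFC 1952)."""
    if len(first_bytes) < 10 or first_bytes[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    # Little-endian: CM (1 byte), FLG (1), MTIME (4), XFL (1), OS (1)
    return GzipHeader(*struct.unpack("<BBIBB", first_bytes[2:10]))


# First ten bytes of each hexdump above.
via_firefox = parse_gzip_header(bytes.fromhex("1f8b0800000000000003"))
via_wget = parse_gzip_header(bytes.fromhex("1f8b0800b0248e520203"))
print(via_firefox)
print(via_wget)
```

The wget copy carries a real timestamp and the maximum-compression flag, while the Firefox copy has a zero timestamp and no compression-level flag. That is consistent with the Firefox file starting with a different, outer gzip stream rather than with the original archive, though the header alone does not prove where that outer stream was produced.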

(In reply to Jon from comment #5)

I can exactly reproduce this by downloading the given URL, thank you Jon!

Status: RESOLVED → REOPENED
Ever confirmed: true
Resolution: INCOMPLETE → ---

Glad to help :)

Bug 426273 introduced a hack for badly configured servers that return Content-Encoding: gzip together with Content-Type: application/x-gzip when serving .gz files. Unfortunately, in this case the headers are correct: the server really does return the tar.gz gzipped a second time. ClearBogusContentEncodingIfNeeded() nevertheless clears the Content-Encoding header, so the outer gzip layer is never decoded and we end up saving a doubly gzipped tar.gz to disk.
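The effect of that workaround can be illustrated with a small simulation. This is a simplified model written for this comment thread, not the actual Gecko code: the function body_saved_to_disk, the GZIP_TYPES set, and the apply_workaround switch are all assumptions made for the sketch.

```python
import gzip

# Content types treated as "already gzip" in this sketch (an assumption).
GZIP_TYPES = {"application/gzip", "application/x-gzip", "application/x-gunzip"}


def body_saved_to_disk(body: bytes, headers: dict, apply_workaround: bool) -> bytes:
    """Model what a client writes to disk for a given response."""
    encoding = headers.get("Content-Encoding", "").lower()
    ctype = headers.get("Content-Type", "").split(";")[0].strip().lower()
    # Workaround: assume the server mislabelled a plain .gz file and
    # ignore the Content-Encoding header.
    if apply_workaround and encoding == "gzip" and ctype in GZIP_TYPES:
        encoding = ""
    return gzip.decompress(body) if encoding == "gzip" else body


# A server that really does gzip the .tar.gz a second time, with correct headers.
original = gzip.compress(b"tar archive bytes" * 50)  # stands in for the .tar.gz
headers = {"Content-Type": "application/gzip", "Content-Encoding": "gzip"}
wire_body = gzip.compress(original)

with_hack = body_saved_to_disk(wire_body, headers, apply_workaround=True)
without_hack = body_saved_to_disk(wire_body, headers, apply_workaround=False)
```

With the workaround enabled the saved bytes are the doubly compressed wire body; with it disabled the saved bytes equal the original file, which matches what the reporters saw from wget, Chrome, and IE.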

Priority: -- → P3
Whiteboard: [necko-triaged]
Severity: normal → S3

I think this is already fixed.

Status: REOPENED → RESOLVED
Closed: 7 years ago → 1 year ago
Resolution: --- → FIXED