Open Bug 844038 Opened 12 years ago Updated 3 years ago

Filename detection fails on Linux when the Content-Disposition filename is decoded as UTF-8

Categories

(Firefox :: File Handling, defect)

x86_64
Linux
defect

Tracking

()

UNCONFIRMED

People

(Reporter: muzuiget, Unassigned)

Details

Attachments

(1 file)

I am using Linux. Sometimes when I click a link to download a file, Firefox displays the wrong filename. The web page renders correctly, and when I hover the mouse over the link, the status bar also shows the correct filename. The filename is wrong in the download dialog and on the file system after saving.

But when I download the same file on Windows XP, Firefox for Windows does not have this problem; the filename is correct. After some searching, I found that Firefox for Mac also has this problem. Chrome behaves the same as Firefox: fine on Windows, but wrong on Linux.

---

After deeper testing I found the cause: the server returns the HTTP header Content-Disposition. I can always reproduce this problem with the server hfs

http://www.rejetto.com/hfs/

hfs is a popular HTTP file sharing server on Windows. It returns a UTF-8 encoded index HTML page, but unfortunately, when a file is downloaded, the filename in Content-Disposition is not UTF-8 encoded but uses the system default encoding. In my case, Simplified Chinese Windows XP, the default encoding is GB18030 (or GB2312, GBK; they are compatible).

I run XP in a virtual machine, then share a file with hfs:

* The IP is 192.168.0.101.
* The correct filename is "测试.7z"; the word means "test" in English.
* "测试" encoded in GBK is '\xb2\xe2\xca\xd4', tested in the Python shell with `u'测试'.encode('GBK')`.

On my host system, Fedora Linux, the shell terminal uses UTF-8 to display text. I run:

$ curl -i 'http://192.168.0.101/测试.7z'
HTTP/1.1 200 OK
Content-Type: application/octet-stream
Content-Length: 0
Accept-Ranges: bytes
Server: HFS 2.2f
Content-Disposition: attachment; filename="????.7z";
Last-Modified: Fri, 22 Feb 2013 09:18:33 GMT

Note the question marks in "????.7z": the terminal cannot display the name. I used xxd to see the filename string in hex:

0000090: 656e 743b 2066 696c 656e 616d 653d 22b2  ent; filename=".
00000a0: e2ca d42e 377a 223b 0d0a 4c61 7374 2d4d  ....7z";..Last-M

The bytes "b2e2cad4" show that hfs actually encodes the name in GBK, whereas in this format it should be UTF-8.

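The mismatch between the bytes in the header and a UTF-8 decoder can be reproduced in a few lines of Python 3. This is only a sketch of the symptom: the byte string is copied from the xxd dump above, and nothing here reflects how Firefox itself decodes the header.

```python
# Filename bytes as sent by hfs in Content-Disposition (taken from the xxd dump).
raw = b"\xb2\xe2\xca\xd4.7z"

# Decoding with the encoding the server actually used recovers the name.
name = raw.decode("gbk")
print(name == "测试.7z")

# Strict UTF-8 decoding rejects the same bytes.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
print(utf8_ok)

# Lenient UTF-8 decoding substitutes U+FFFD for the undecodable bytes,
# so only the ASCII suffix ".7z" survives.
lossy = raw.decode("utf-8", errors="replace")
print(lossy.endswith(".7z"))
```
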
I uploaded the curl output dump as an attachment. Use the nc command to run a fake HTTP server:

$ curl -i 'http://192.168.0.101/测试.7z' > httpdump
$ nc -l 8000 < httpdump

So you can test it easily.

---

After some searching, I found that the Content-Disposition format is messy:

http://greenbytes.de/tech/tc2231/

It looks related to this test case:

http://greenbytes.de/tech/tc2231/#attwithutf8fnplain

But why does Firefox for Windows not have this problem in this case? So I still think it is a bug.

Maybe Firefox uses the system default encoding as a fallback. My Linux default encoding (env LANG) is en_US.UTF-8, which is still UTF-8, so decoding still fails. But XP is zh_CN.GB18030, so Firefox can decode the name correctly there.

So I ran Firefox on Linux with

$ LANG=zh_CN.GB18030 firefox

and tested again. This time the download dialog displays the filename correctly, and it is also right in the download panel, but on the file system it is still "????.7z", and the Nautilus file manager warns about "invalid encoding".

I ran another Windows XP in a virtual machine and changed the system language/region in Control Panel to "China Taiwan", whose encoding is "zh_TW.BIG5", then downloaded the file with Firefox again. This time Firefox also displays a wrong name.

So it looks like Firefox actually tries to decode the filename with the system default encoding. But UTF-8 is the recommended encoding on Linux/Mac.

Of course, server developers should follow the web standard and use UTF-8, but many websites still serve only local-region users and do not care about Linux/Mac/UTF-8. It is annoying. I suggest Firefox provide a fallback encodings setting in about:config, something like "download.filename.fallback_encodings", which Linux/Mac users can edit when they need to.

Chrome also fails in this case, but its fallback seems to be extracting a filename from the URL; in some cases it just displays "download".
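A rough Python 3 sketch of the fallback idea suggested above, for illustration only. The helper `decode_filename` and its `fallback_encodings` parameter are hypothetical, as is the preference name in the suggestion; this is not an existing Firefox setting or API.

```python
def decode_filename(raw, fallback_encodings=()):
    """Hypothetical helper: try UTF-8 first, then each fallback encoding in order."""
    for encoding in ("utf-8", *fallback_encodings):
        try:
            return raw.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            continue
    # Nothing decoded cleanly: keep the ASCII part, mark other bytes as U+FFFD.
    return raw.decode("ascii", errors="replace")

raw = b"\xb2\xe2\xca\xd4.7z"  # bytes from the Content-Disposition header

with_gbk = decode_filename(raw, ["gb18030"])   # user configured the right fallback
with_big5 = decode_filename(raw, ["big5"])     # wrong fallback, like the zh_TW test
no_fallback = decode_filename(raw)             # UTF-8 only, like en_US.UTF-8

print(with_gbk == "测试.7z")
print(with_big5 == "测试.7z")
print(no_fallback.endswith(".7z"))
```

The Big5 case mirrors the "China Taiwan" test: a wrong fallback encoding does not recover the original name, so any such list would only help when the user knows which legacy encoding the server uses.
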
Component: General → File Handling
Product: Firefox → Core
Sending non-US-ASCII characters in filename parameters isn't portable. What needs to be fixed is the server, which should use the encoding defined in RFC 5987; after that it will work in *all* current browsers.
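For illustration, here is a server-side Python 3 sketch of the RFC 5987 form referred to in this comment: the filename is sent as a `filename*` parameter carrying percent-encoded UTF-8 bytes. The helper name `content_disposition` is hypothetical and is not part of hfs or any particular server.

```python
from urllib.parse import quote

def content_disposition(filename):
    """Hypothetical helper: build an attachment header using RFC 5987 filename*."""
    # Percent-encode the UTF-8 bytes of the name; safe="" leaves only
    # letters, digits and "_.-~" unescaped.
    encoded = quote(filename, safe="", encoding="utf-8")
    return "Content-Disposition: attachment; filename*=UTF-8''" + encoded

header = content_disposition("测试.7z")
print(header)
# Content-Disposition: attachment; filename*=UTF-8''%E6%B5%8B%E8%AF%95.7z
```
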
+1. Every Chinese Firefox user runs into this problem whenever they download something. I hope the Firefox team fixes it as soon as possible. Thanks.
Product: Core → Firefox
Version: 19 Branch → unspecified
Severity: normal → S3