Open Bug 1318970 Opened 8 years ago Updated 2 years ago

Some location addresses don't save to browser.upload.lastDir in content-prefs.sqlite correctly

Categories

(Toolkit :: General, defect)

defect

Tracking

()

People

(Reporter: fireattack, Unassigned)

Details

Attachments

(3 files)

User Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.52 Safari/537.36 Steps to reproduce: 1. Create two folders called "test123" and "新建文件夹" respectively, and put them somewhere (e.g. E:\test123 and E:\新建文件夹). 2. Put some image files in both folders. 3. Go http://imgur.com/ and upload an image from folder "E:\test123" by using New Post->Browse (make sure you used Firefox's "Browse.." dialog) 4. Try to upload another image by "Browse.." dialog. You can see the the last used address "E:\test123" is correctly used. 5. (Optional) Browse the content-prefs.sqlite file. You can see the last used location is saved in "prefs" table, "value" column. 5. Now, upload a file via "Browse.." from "E:\新建文件夹". 6. When done, try to upload another one to trigger the "Browse.." dialog. Actual results: The default location in "Browse.." is not your last used one, "E:\新建文件夹" but "E:". If you check content-prefs.sqlite, it has the same problem. Expected results: It should be "E:\新建文件夹". I can't reproduce it with other Chinese characters, I have no idea why "新建文件夹" (lit. “new folder”, the default name when you create a folder in Windows) is special.
Summary: Some location address doesn't save to content-prefs.sqlite correctly → Some location addresses don't save to content-prefs.sqlite correctly
browser.download.lastDir doesn't seem to have this problem.
Summary: Some location addresses don't save to content-prefs.sqlite correctly → Some location addresses don't save to browser.upload.lastDir in content-prefs.sqlite correctly
Can't reproduce with "新建文件夹2", either.
Has STR: --- → yes
Component: Preferences → HTML: Form Submission
Product: Toolkit → Core
It stores Chinese characters as an unfamiliar encoding, for example when storing the "E:\测试文件", it uses the E5A8B4E5ACADE798AFE98F82E59BA6E6ACA2.
Hi :adw, this looks related with nsIContentPrefService2::Set(). Do you have ideas about what's happening here? Thank you.
Component: HTML: Form Submission → General
Flags: needinfo?(adw)
Product: Core → Toolkit
(In reply to YF (Yang) from comment #4) > It stores Chinese characters as an unfamiliar encoding, for example when > storing the "E:\测试文件", it uses the E5A8B4E5ACADE798AFE98F82E59BA6E6ACA2. in python3 shell: >>> import binascii >>> b = binascii.a2b_hex('E5A8B4E5ACADE798AFE98F82E59BA6E6ACA2') >>> b b'\xe5\xa8\xb4\xe5\xac\xad\xe7\x98\xaf\xe9\x8f\x82\xe5\x9b\xa6\xe6\xac\xa2' >>> b.decode('utf-8') '娴嬭瘯鏂囦欢' >>> b.decode('utf-8').encode('gb18030').decode('utf-8') '测试文件' >>> u = '测试文件' >>> u.encode('utf-8').decode('gb18030') '娴嬭瘯鏂囦欢' >>> u.encode('utf-8').decode('gb18030').encode('utf-8') b'\xe5\xa8\xb4\xe5\xac\xad\xe7\x98\xaf\xe9\x8f\x82\xe5\x9b\xa6\xe6\xac\xa2' >>> binascii.b2a_hex(u.encode('utf-8').decode('gb18030').encode('utf-8')) b'e5a8b4e5acade798afe98f82e59ba6e6aca2'
I'm not sure why YF was testing "测试文件" to begin with, that string doesn't cause any problem. After more test, I found that every folder end up with "文件夹" would trigger this bug.
Ah never mind, thanks for uFFFD's steps. So I guess any string end up with odd number of bytes in UTF-8(typically odd number of Chinese character, such as "测试文", "文件夹" or even Kana such as "ばかね") would cause this problem, then. But the string needs to be CJK-only for some reason. E.g. 文件夹test, despite has 13 bytes in UTF-8, doesn't have this problem.
Attached file observer.xpi
an observer for nsIContentPrefService and nsIContentPrefService2 that will log all preference value changes and deletions to browser console
here is a simplified STR: 1. load observer.xpi from comment 10 via "Load Temporary Add-on" in about:debugging page 2. open test.html from comment 9 3. click "Browse..." to bring up "File Upload" dialog 4. select any file under a path that contains at least one character which has a unicode code value > 0x7F 5. check logs in browser console to see how browser.upload.lastDir is messed up this bug affects all windows, may affect linux, and is unlikely to affect macOS
this bug is caused by trying to convert an UTF8 encoded bytestring to unicode string using "The system default Windows ANSI code page", which "can be different on different computers" (e.g. gbk default on simplified chinese ver. windows, big5 default on traditional chinese ver. windows, shift_jis default on japanese ver. windows), and it "can be changed on the same computer" it was introduced by bug 832923 as of today, the most up to date revision on DXR is 8103c612b79c i will quote related source code of this revision below: ------------------------------------------------------------------------------ https://dxr.mozilla.org/mozilla-central/rev/8103c612b79c2587ea4ca1b0a9f9f82db4b185b8/dom/html/HTMLInputElement.cpp#1095-1135 nsresult UploadLastDir::StoreLastUsedDirectory(nsIDocument* aDoc, nsIFile* aDir) { ... // Find the parent of aFile, and store it nsString unicodePath; aDir->GetPath(unicodePath); if (unicodePath.IsEmpty()) // nothing to do return NS_OK; RefPtr<nsVariantCC> prefValue = new nsVariantCC(); prefValue->SetAsAString(unicodePath); ... return contentPrefService->Set(spec, CPS_PREF_NAME, prefValue, loadContext, nullptr); } this is where nsIContentPrefService2 gets involved in unicodePath is expected to be an unicode string however, aDir is already broken at this point thus unicodePath is broken too ------------------------------------------------------------------------------ https://dxr.mozilla.org/mozilla-central/rev/8103c612b79c2587ea4ca1b0a9f9f82db4b185b8/dom/html/HTMLInputElement.cpp#616-622 NS_IMETHODIMP HTMLInputElement::nsFilePickerShownCallback::Done(int16_t aResult) { ... // Collect new selected filenames nsTArray<OwningFileOrDirectory> newFilesOrDirectories; ... // Store the last used directory using the content pref service: nsCOMPtr<nsIFile> lastUsedDir = LastUsedDirectory(newFilesOrDirectories[0]); if (lastUsedDir) { HTMLInputElement::gUploadLastDir->StoreLastUsedDirectory( mInput->OwnerDoc(), lastUsedDir); } ... } this is the only place UploadLastDir::StoreLastUsedDirectory being called the path of every element in newFilesOrDirectories is an unicode string ------------------------------------------------------------------------------ https://dxr.mozilla.org/mozilla-central/rev/8103c612b79c2587ea4ca1b0a9f9f82db4b185b8/dom/html/HTMLInputElement.cpp#476-510 static already_AddRefed<nsIFile> LastUsedDirectory(const OwningFileOrDirectory& aData) { if (aData.IsFile()) { nsAutoString path; ErrorResult error; aData.GetAsFile()->GetMozFullPathInternal(path, error); if (error.Failed() || path.IsEmpty()) { error.SuppressException(); return nullptr; } nsCOMPtr<nsIFile> localFile; nsresult rv = NS_NewNativeLocalFile(NS_ConvertUTF16toUTF8(path), true, getter_AddRefs(localFile)); if (NS_WARN_IF(NS_FAILED(rv))) { return nullptr; } nsCOMPtr<nsIFile> parentFile; rv = localFile->GetParent(getter_AddRefs(parentFile)); if (NS_WARN_IF(NS_FAILED(rv))) { return nullptr; } return parentFile.forget(); } ... } LastUsedDirectory is expected to return the parent directory of aData's path so it convert aData's path from unicode (same as UTF16 for most chars <= U+FFFF) to UTF8, pass it to NS_NewNativeLocalFile to create a temporary nsIFile localFile, and then return localFile's parent directory ------------------------------------------------------------------------------ https://dxr.mozilla.org/mozilla-central/rev/8103c612b79c2587ea4ca1b0a9f9f82db4b185b8/xpcom/io/nsLocalFileWin.cpp#3654-3665 nsresult NS_NewNativeLocalFile(const nsACString& aPath, bool aFollowLinks, nsIFile** aResult) { nsAutoString buf; nsresult rv = NS_CopyNativeToUnicode(aPath, buf); if (NS_FAILED(rv)) { *aResult = nullptr; return rv; } return NS_NewLocalFile(buf, aFollowLinks, aResult); } the implement of NS_NewNativeLocalFile for windows will convert aPath back to unicode first remember that aPath is an UTF8 encoded bytestring passed from LastUsedDirectory ------------------------------------------------------------------------------ https://dxr.mozilla.org/mozilla-central/rev/8103c612b79c2587ea4ca1b0a9f9f82db4b185b8/xpcom/io/nsNativeCharsetUtils.cpp#910-940 nsresult NS_CopyNativeToUnicode(const nsACString& aInput, nsAString& aOutput) { uint32_t inputLen = aInput.Length(); nsACString::const_iterator iter; aInput.BeginReading(iter); const char* buf = iter.get(); // determine length of result uint32_t resultLen = 0; int n = ::MultiByteToWideChar(CP_ACP, 0, buf, inputLen, nullptr, 0); if (n > 0) { resultLen += n; } // allocate sufficient space if (!aOutput.SetLength(resultLen, fallible)) { return NS_ERROR_OUT_OF_MEMORY; } if (resultLen > 0) { nsAString::iterator out_iter; aOutput.BeginWriting(out_iter); char16_t* result = out_iter.get(); ::MultiByteToWideChar(CP_ACP, 0, buf, inputLen, wwc(result), resultLen); } return NS_OK; } the implement of NS_CopyNativeToUnicode for windows calls winapi MultiByteToWideChar to convert a multibyte encoding encoded bytestring to unicode string but it's hard coded to use CP_ACP only, which means "The system default Windows ANSI code page" since NS_NewNativeLocalFile passes an UTF8 encoded bytestring to NS_CopyNativeToUnicode trying to decode it using a non-UTF8 encoding will certainly break the result see the MSDN page of MultiByteToWideChar for more details: https://msdn.microsoft.com/en-us/library/windows/desktop/dd319072(v=vs.85).aspx ------------------------------------------------------------------------------
(In reply to fireattack from comment #8) > So I guess any string end up with odd number of bytes in UTF-8(typically odd > number of Chinese character, such as "测试文", "文件夹" or even Kana such as > "ばかね") would cause this problem, then. > > But the string needs to be CJK-only for some reason. E.g. 文件夹test, despite > has 13 bytes in UTF-8, doesn't have this problem. the reason why browser.upload.lastDir seems to work sometimes is that firefox uses GetOpenFileNameW to create open file dialog on windows GetOpenFileNameW has some fallback mechanisms to choose another initial directory when the given initial one (OPENFILENAME.lpstrInitialDir) does not exist or is invalid see https://msdn.microsoft.com/en-us/library/windows/desktop/ms646839(v=vs.85).aspx for details for example, on a simplified chinese version windows if you select "C:\新建文件夹\test" it becomes "C:\鏂板缓鏂囦欢澶筡test" after encoding/decoding described in comment 12 its parent directory is "C:", which does exist and is valid so GetOpenFileNameW will open "C:" next time if you select "C:\新建文件\test" it becomes "C:\鏂板缓鏂囦欢\test" after encoding/decoding its parent directory is "C:\鏂板缓鏂囦欢", which is valid but maybe not exist so GetOpenFileNameW will open "C:\鏂板缓鏂囦欢" if it does exist else open most recently used path, current directory, personal files directory of the current user, or Desktop folder instead if you select "C:\新建文件夹2\test" it becomes "C:\鏂板缓鏂囦欢澶?\test" after encoding/decoding its parent directory is "C:\鏂板缓鏂囦欢澶?", which is invalid since "?" is not allowed in path name so GetOpenFileNameW will open most recently used path, current directory, personal files directory of the current user, or Desktop folder it may seem to work in the last two cases but it does not actually
Flags: needinfo?(adw)
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: