Non-ascii directory name is garbled in directory index

RESOLVED FIXED

Status

()

Core
Networking: File
RESOLVED FIXED
12 years ago
12 years ago

People

(Reporter: emk, Assigned: emk)

Tracking

({fixed1.8.1, intl, jp-critical})

Trunk
x86
Windows XP
fixed1.8.1, intl, jp-critical
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 9 obsolete attachments)

(Assignee)

Description

12 years ago
Reproducable: Always
Steps to reproduce:
1. Open the file:///C:/(Any directory containing non-ascii chars)/
Actual result:
Non-ascii chars are garbled.
Expected result:
Non-ascii chars should not be garbled.

This is a regression of bug 278161.
(Assignee)

Comment 1

12 years ago
Created attachment 218163 [details] [diff] [review]
Patch
Assignee: nobody → VYV03354
Status: NEW → ASSIGNED
Attachment #218163 - Flags: review?(jshin1987)

Comment 2

12 years ago
Comment on attachment 218163 [details] [diff] [review]
Patch

I think it's better to get rid of the following
http://lxr.mozilla.org/seamonkey/source/netwerk/streamconv/converters/nsIndexedToHTML.cpp#229

229         // reset parser's charset to platform's default if this is file url
230         nsCOMPtr<nsIPlatformCharset> platformCharset(do_GetService(NS_PLATFORMCHARSET_CONTRACTID, &rv));
231         NS_ENSURE_SUCCESS(rv, rv);
232         nsCAutoString charset;
233         rv = platformCharset->GetCharset(kPlatformCharsetSel_FileName, charset);
234         NS_ENSURE_SUCCESS(rv, rv);
235         rv = mParser->SetEncoding(charset.get());
236         NS_ENSURE_SUCCESS(rv, rv);

 
>@@ -496,17 +501,17 @@ 
>     nsXPIDLCString encoding;
>     rv = mParser->GetEncoding(getter_Copies(encoding));
>     if (NS_FAILED(rv)) return rv;
> 
>     nsXPIDLString unEscapeSpec;
>     rv = mTextToSubURI->UnEscapeAndConvert(encoding, loc,
>                                            getter_Copies(unEscapeSpec));

Why don't you need the same change here as in the first chunk? Is the index in the 'native' encoding?
Attachment #218163 - Flags: review?(jshin1987) → review-
(Assignee)

Comment 3

12 years ago
(In reply to comment #2)
> (From update of attachment 218163 [details] [diff] [review] [edit])
> I think it's better to get rid of the following
I tried, but it didn't work because

> Why don't you need the same change here as in the first chunk? Is the index in
> the 'native' encoding? 
index used the native encoding. See netwerk/base/src/nsDirectoryIndexStream.cpp.
We can't use Unicode index for all platforms until bug 99382 is fixed. Probably we could use GetLeafName only on Windows because the implementation of Windows is thread safe. Is it acceptable?
Depends on: 99382

Comment 4

12 years ago
(In reply to comment #3)

Thanks for the explanation.
 
> > Why don't you need the same change here as in the first chunk? Is the index in
> > the 'native' encoding? 
> index used the native encoding. See
> netwerk/base/src/nsDirectoryIndexStream.cpp.
> We can't use Unicode index for all platforms until bug 99382 is fixed. 

bug 229032 comment #46 and onward have some more discussion..

> Probably
> we could use GetLeafName only on Windows because the implementation of Windows
> is thread safe. Is it acceptable?

Perhaps, that's acceptable and even necessary.  OS X uses UTF-8 so that probably we're  better off using GetNativeLeafName. The same is true of 'modern' Linux (where the default is to use a UTF-8 locale).  After bug 99382 is fixed, I guess we have to do |if NS_IsNativeUTF8() .... else ....|
I'd hope that GetLeafName is threadsafe on all platforms...
(Assignee)

Comment 6

12 years ago
Hm, all GetLeafName implementations look already thread-safe.
Bug 99382 was too old to believe the comments blindly.
No longer depends on: 99382
(Assignee)

Comment 7

12 years ago
Created attachment 218453 [details] [diff] [review]
Using GetLeafName to make Unicode index

Although GetLeafName is usable for all platforms, it's still preferable to use GetNativeLeafName when platform charset is UTF-8.
GetLeafName is implemented as GetNativeLeafName + conversion on all platforms (except Windows). It will cause an extra conversion.
So I used NS_IsNativeUTF8 as jshin suggested.
Attachment #218163 - Attachment is obsolete: true
Attachment #218453 - Flags: review?(jshin1987)
(Assignee)

Comment 8

12 years ago
Created attachment 218455 [details] [diff] [review]
diff -w for review purpose only
(Assignee)

Comment 9

12 years ago
Created attachment 218605 [details] [diff] [review]
Removed ugly macros in expecting that bug 334167 will be fixed
Attachment #218453 - Attachment is obsolete: true
Attachment #218605 - Flags: review?(jshin1987)
Attachment #218453 - Flags: review?(jshin1987)
(Assignee)

Comment 10

12 years ago
Created attachment 218606 [details] [diff] [review]
diff -w
Attachment #218455 - Attachment is obsolete: true

Comment 11

12 years ago
Comment on attachment 218606 [details] [diff] [review]
diff -w

>+#if !defined(XP_BEOS) && !defined(XP_MACOSX)
>+    if (!NS_IsNativeUTF8()) {


>+#endif
>+#ifndef XP_WIN
>     nsCAutoString name1, name2;
>     aElement1->GetNativeLeafName(name1);
>     aElement2->GetNativeLeafName(name2);
>     
>     return Compare(name1, name2);
> #endif

I'm afraid you tried so hard to save the codesize that   it's kinda hard to read the code. Either add a detailed comment or just use 

if (NS_IsNativeUTF8()) 
   GetNativeLeafName ...
else
   GetLeafName ....

Saving < 100 bytes wouldn't be that much important, IMO.



>+            if (!NS_IsNativeUTF8()) {
>+#if !defined(XP_BEOS) && !defined(XP_MACOSX)
         }
>+#endif
>+            } else {
>+#ifndef XP_WIN


The same is the case here.
Keywords: intl, jp-critical
(Assignee)

Comment 12

12 years ago
Created attachment 219016 [details] [diff] [review]
Removed #ifdefs

Compilers (at least VC8) were enough smart to eliminate the dead code.
So we could satisfy both saving codesize and readability by trusting complier :)
I'm not sure about GCC, but maybe readablity is more important.
Attachment #218605 - Attachment is obsolete: true
Attachment #219016 - Flags: review?(jshin1987)
Attachment #218605 - Flags: review?(jshin1987)
(Assignee)

Comment 13

12 years ago
Created attachment 219017 [details] [diff] [review]
diff -w
Attachment #218606 - Attachment is obsolete: true

Comment 14

12 years ago
Comment on attachment 219017 [details] [diff] [review]
diff -w

>Index: netwerk/base/src/nsDirectoryIndexStream.cpp

>     // Note - we should be the collation to do sorting. Why don't we?

  s/be the collation/do the collation/ : it's not your bug but you're reindenting the line so that you'll get the blame. 


>@@ -122,23 +120,22 @@ static int PR_CALLBACK compare(nsIFile* 
>     /*PRInt32 res;
>     // CompareString takes an nsString...
>     nsString str1(name1);
>     nsString str2(name2);
> 
>     nsICollation* coll = (nsICollation*)aData;
>     coll->CompareString(nsICollation::kCollationStrengthDefault, str1, str2, &res);
>     return res;*/

Again, it's not you but you're touching the lines nonetheless to align the indentation.
For the commented-out code block, we'd better use |#if 0 .... #endif|. While reviewing the previous patch, at first I didn't realize that the block was commented out.


>+            if (!NS_IsNativeUTF8()) {
>+                nsAutoString leafname;
>+                rv = current->GetLeafName(leafname);
>             if (NS_FAILED(rv)) return rv;
>             if (!leafname.IsEmpty()) {

>+                    char* escaped = nsEscape(NS_ConvertUTF16toUTF8(leafname).get(), url_Path);
>+                    if (escaped) {
>+                        mBuf += escaped;
>                 mBuf.Append(' ');
>+                        nsCRT::free(escaped);
>             }
>-#else
>+                }
>+            } else {
>             nsCAutoString leafname;
>             rv = current->GetNativeLeafName(leafname);
>             if (NS_FAILED(rv)) return rv;
>             if (!leafname.IsEmpty()) {
>                 char* escaped = nsEscape(leafname.get(), url_Path);
>                 if (escaped) {
>                     mBuf += escaped;
>                     mBuf.Append(' ');
>                     nsCRT::free(escaped);

nsEscape uses nsMemory::Alloc so that you should use nsMemory::Free here. 

BTW, |if (!leafname.IsEmpty()) {....}| is common whether |NS_IsNativeUTF8| is true or false so that it can be refactored at the expense of a slight perf. loss on Windows. You may want to refactor it.

Comment 15

12 years ago
Comment on attachment 219016 [details] [diff] [review]
Removed #ifdefs

r=jshin with the above nits addressed
Attachment #219016 - Flags: review?(jshin1987) → review+
(Assignee)

Comment 16

12 years ago
Created attachment 219183 [details] [diff] [review]
resolved the nits

I do not want to sacrifice the perf, but I could still refactor some common codes anyway.
Attachment #219016 - Attachment is obsolete: true
Attachment #219183 - Flags: superreview?(darin)
Attachment #219183 - Flags: review+
(Assignee)

Comment 17

12 years ago
Created attachment 219184 [details] [diff] [review]
diff -w
Attachment #219017 - Attachment is obsolete: true

Comment 18

12 years ago
Comment on attachment 219183 [details] [diff] [review]
resolved the nits

>Index: netwerk/base/src/nsDirectoryIndexStream.cpp

>+#if 0
>+        PRInt32 res;
>+        // CompareString takes an nsString...
>+        nsString str1(name1);
>+        nsString str2(name2);
>+
>+        nsICollation* coll = (nsICollation*)aData;
>+        coll->CompareString(nsICollation::kCollationStrengthDefault, str1, str2, &res);
>+        return res;
>+#endif

Maybe this should just be removed.


>+            PRInt64 fileSize = LL_Zero();
>+            current->GetFileSize( &fileSize );

It's not necessary to use the LL_ macros and functions any longer.
You can just use "= 0" and do direct int64 arithmetic.


>+            PRInt64 tmpTime = LL_Zero();
>+            PRInt64 fileInfoModifyTime = LL_Zero();
>+            current->GetLastModifiedTime( &tmpTime );
>+            // Why does nsIFile give this back in milliseconds?
>+            LL_MUL(fileInfoModifyTime, tmpTime, PR_USEC_PER_MSEC);

Do this instead:

              PRInt64 fileInfoModifyTime = 0;
              current->GetLastModifiedTime( &fileInfoModifyTime );
              fileInfoModifyTime *= PR_USEC_PER_MSEC;


sr=darin
Attachment #219183 - Flags: superreview?(darin) → superreview+
(Assignee)

Comment 19

12 years ago
Created attachment 220069 [details] [diff] [review]
patch for check in
Attachment #219183 - Attachment is obsolete: true
Attachment #219184 - Attachment is obsolete: true
checked-in.
Status: ASSIGNED → RESOLVED
Last Resolved: 12 years ago
Resolution: --- → FIXED

Comment 21

12 years ago
Comment on attachment 220069 [details] [diff] [review]
patch for check in

Asking for a1.8 in advance

After a little baking in the trunk, this should be fixed on 1.8 branch as well. (bug 278161 was just fixed in the branch so that there will be a regression in the branch)
Attachment #220069 - Flags: approval-branch-1.8.1?(darin)

Updated

12 years ago
Attachment #220069 - Flags: approval-branch-1.8.1?(darin) → approval-branch-1.8.1+
Comment on attachment 220069 [details] [diff] [review]
patch for check in

this patch cannot apply to 1.8 branch.
(Assignee)

Comment 23

12 years ago
Created attachment 220527 [details] [diff] [review]
1.8 branch patch

No logic change.
carrying over r+sr+a1.8.1.
Attachment #220527 - Flags: superreview+
Attachment #220527 - Flags: review+
Attachment #220527 - Flags: approval-branch-1.8.1+
checked-in to 1.8 branch, thanks.
Keywords: fixed1.8.1
You need to log in before you can comment on or make changes to this bug.