
Assignee

•

24 years ago

Attached patch: backing up my work. About half of the string params are complete.

Comment 33

•

24 years ago

- nsAutoString fileURLUnicode;
- fileURLUnicode.AssignWithConversion(docURLSpec);
- res = webShell->SetURL(fileURLUnicode.get());
+ res = webShell->SetURL(docURLSpec.ToNewUnicode());

FYI, this (kind of) change will leak the newly created unicode buffer. That's the reason they're doing this whole thing with nsAutoString and AssignWithConversion. What you could do is something like:

+ res = webShell->SetURL(NS_ConvertASCIItoUCS2(docURLSpec).get());

This will use the exact same conversion, though I think they should be using NS_ConvertUTF8toUCS2 here, but that's a different bug.

Also, the member functions ToNewUnicode, ToNewCString and ToNewUTF8String are deprecated. Instead you should be using the global functions in nsReadableUtils.h. Thus:

*aResult = myString.ToNewUnicode();

becomes (don't forget to #include nsReadableUtils.h):

*aResult = ToNewUnicode(myString);

I have a patch which switches all existing uses over from the one to the other form. I hope to check that in soonish (need to get r= and sr= first of course).
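The leak described above can be sketched with standard C++ only. Everything in this block is a hypothetical stand-in, not the Mozilla API: `ToNewWide` plays the role of a ToNewUnicode-style function that hands ownership of a heap buffer to the caller, `WideFromAscii` plays the role of an NS_ConvertASCIItoUCS2-style temporary that owns its buffer, and `SetURL` is a consumer that only reads its argument.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a ToNewUnicode-style call: returns a heap
// buffer that the CALLER must release with delete[].
char16_t* ToNewWide(const std::string& ascii) {
    char16_t* buf = new char16_t[ascii.size() + 1];
    for (std::size_t i = 0; i < ascii.size(); ++i)
        buf[i] = static_cast<unsigned char>(ascii[i]);
    buf[ascii.size()] = u'\0';
    return buf;
}

// Hypothetical stand-in for an NS_ConvertASCIItoUCS2-style temporary:
// the object owns the converted buffer and frees it in its destructor.
class WideFromAscii {
public:
    explicit WideFromAscii(const std::string& ascii) {
        for (unsigned char c : ascii) data_.push_back(c);
    }
    const char16_t* get() const { return data_.c_str(); }
private:
    std::u16string data_;
};

// Hypothetical consumer that only reads the string; here it just
// returns the length so the effect is observable.
std::size_t SetURL(const char16_t* url) {
    assert(url != nullptr);
    return std::char_traits<char16_t>::length(url);
}
```

With these definitions, `SetURL(ToNewWide(spec))` leaks because nobody ever calls `delete[]` on the returned buffer, while `SetURL(WideFromAscii(spec).get())` is fine: the temporary lives until the end of the full expression and then cleans up after itself.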

Assignee

Comment 34

•

24 years ago

Jag, thanks for catching those. I'll fix them now. Yes, i only used ToNewUnicode() in one place. Is the ultimate plan to get rid of char*'s altogether and move everything over to nsAString? Thanks --pete

Comment 35

•

24 years ago

> Is the ultimate plan to get rid of char*'s altogether and move everything over
> to nsAString?

Yes. With the exception of persistentDescriptor. See my comment from 08/22. A persistentDescriptor shouldn't be mistaken for or used like a path. It may look like one on Unix & Windows but on the Mac, it's a base64 encoded file alias. It's supposed to be treated as opaque data which is pointed to by a char*. Also, there are many of these written into existing registries and prefs so this can't be changed. Not to nag about this again but since I pointed it out last, it's still a wide string in the latest patch.

Assignee

Comment 36

•

24 years ago

Conrad, i was actually referring to the entire mozilla codebase, not just nsIFile in particular. It seems a lot of the surrounding implementations i see would be better suited to using nsAutoString. Regarding persistentDescriptor, are you saying it needs to be a wide string and not a regular string like it is now? I haven't changed the interface for persistentDescriptor. http://lxr.mozilla.org/seamonkey/source/xpcom/io/nsILocalFile.idl#107 --pete

•

24 years ago

The good news is i have almost all of the param changes finished. It compiles cleanly. The bad news is i can only get xpcshell to run. I am having a problem. I believe somewhere there is a call to free or alloc that is passing the wrong pointer, a la:

in free(): warning: junk pointer, too high to make sense

So i have pretty much been in regression mode. If anyone can spot an error in the huge patch i attached that would be great. I have been up and down it, and anywhere i removed a call to free, the char* was replaced w/ an nsAutoString. So, anyway, progress is halted while i track this bug down. Thanks --pete

Comment 42

•

24 years ago

Index: editor/base/nsEditorShell.cpp

+ nsAutoString fileURL;
+ fileURL.Assign(inFileURL);

Code like that you can short circuit to:

nsAutoString fileURL(inFileURL);

But, since your only use of the nsAutoString there is to be a string container around the const PRUnichar* passed in, you really want nsDependentString.

nsXPIDLString leafName;
- docFile->GetUnicodeLeafName(getter_Copies(leafName));
+ docFile->GetLeafName(leafName);

So, this might compile (shame on us) but currently won't work (it will at some point in the future). /me goes to stand in a corner.
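The difference between a copying container and a dependent (non-owning) wrapper can be illustrated with standard C++17 types. This is only an analogy: `std::u16string` stands in for the copying nsAutoString and `std::u16string_view` for nsDependentString, and the helper names `CopyOf` and `ViewOf` are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Copies the characters into its own storage, which is wasted work when
// the caller only needs to read the incoming buffer.
std::u16string CopyOf(const char16_t* raw) {
    assert(raw != nullptr);
    return std::u16string(raw);
}

// Wraps the caller's buffer without copying. The view is only valid for
// as long as `raw` itself stays alive and unmodified.
std::u16string_view ViewOf(const char16_t* raw) {
    assert(raw != nullptr);
    return std::u16string_view(raw);
}
```

Both give string-style access to the same characters; the view simply points at the original buffer, so it costs no allocation but must not outlive it.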

Assignee

Comment 43

•

24 years ago

Here's one that will issue the error.

myXPIDLCString.Adopt((char*)NS_ConvertUCS2toUTF8(myAuto).get());

needs to be:

myXPIDLCString.Adopt(ToNewCString(myAuto));

--pete

Comment 44

•

24 years ago

Yep, that would do the trick :-) Nice catch. Whenever you need to cast with strings, ask yourself what's wrong. Could be the interface you're using forces you to use a |char*| even though they really meant a |const char*|, could be you're using strings differently than you should.

Assignee

•

24 years ago

Here we go. Making an assignment like this is bad:

leafName = (char*)NS_ConvertUCS2toUTF8(autoleaf).get();

I changed it to this:

leafName = ToNewCString(autoLeafName);

We are good. No more junk pointer errors from free(). I have a bunch of these i need to fix. --pete
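Why the cast version ends in a junk pointer can be sketched as follows. The class and function names are hypothetical stand-ins (an Adopt-style owner and a ToNewCString-style allocator), written with the standard library only; the assumption is that the adopting string releases its buffer with free().

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical owner in the spirit of an Adopt() call: it takes ownership
// of a malloc'ed buffer and frees it when replaced or destroyed.
class AdoptingCString {
public:
    AdoptingCString() = default;
    AdoptingCString(const AdoptingCString&) = delete;
    AdoptingCString& operator=(const AdoptingCString&) = delete;
    ~AdoptingCString() { std::free(buf_); }

    void Adopt(char* mallocedBuf) {
        // Adopting the buffer we already own would free it out from under us.
        assert(mallocedBuf == nullptr || mallocedBuf != buf_);
        std::free(buf_);
        buf_ = mallocedBuf;
    }
    const char* get() const { return buf_ ? buf_ : ""; }

private:
    char* buf_ = nullptr;
};

// Hypothetical stand-in for a ToNewCString-style call: returns a fresh
// malloc'ed copy that the caller (or an adopter) must free.
char* NewCStringCopy(const std::string& s) {
    char* out = static_cast<char*>(std::malloc(s.size() + 1));
    if (!out) return nullptr;
    std::memcpy(out, s.c_str(), s.size() + 1);
    return out;
}
```

Passing `NewCStringCopy(...)` to `Adopt` is safe because the owner frees exactly what malloc returned. Passing a cast `get()` pointer from a temporary converter is not: the temporary frees (or never heap-allocated) that storage itself, so the owner later calls free() on memory it never owned.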

Comment 49

•

24 years ago

Doh! I should've pointed you at our current string guide/FAQ, it lists most if not all of these. http://lxr.mozilla.org/seamonkey/source/string/doc/string-guide.html

Assignee

Comment 50

•

24 years ago

Jag, i knew about the guide. I just need to read it. ;-) --pete

Comment 51

•

24 years ago

> Fix all accessor code that is implemented poorly. Your example is a good one.
> Let it bake and check it in.

Most of these cases need to be fixed before this gets checked in. If we fail to open the registry file on a system where the path contains a non-ASCII char, or lose people's mail, that's a showstopper. Many occurrences of NS_ConvertUCS2toUTF8 in the patch are likely to cause this. This is a huge amount of work and you've already done a lot. I don't mean to heap more on you in order to get this done. I'd say file bugs against others and make them block this one. One stopper is a conversion like this:

+ nsAutoString pathBuf;
+ aFile->GetPath(pathBuf);
res = NS_NewFileSpec(getter_AddRefs(tempSpec));
if (NS_FAILED(res)) return res;
- res = tempSpec->SetNativePath(pathBuf);
+ res = tempSpec->SetNativePath(NS_ConvertUCS2toUTF8(pathBuf).get());

Ideally, nsFileSpec would die and we wouldn't have this problem. There is a bug on that and it could block this one. In the short term, there is a workaround: nsFileSpec has an assignment operator which takes an nsString&. If pathBuf was just assigned to an nsFileSpec, it goes through Unicode conversion to the FS charset.

Assignee

•

24 years ago

jag, in this latest patch, i have about 19 errors from malloc. mozilla-bin in free(): warning: junk pointer, too high to make sense. Can you give it a once over and see if you can catch any. I'm still lookin. I also got rid of any calls i made to AssignWithConversion. Thanks --pete

Judson Valeski

Comment 61

•

24 years ago

these mods are far reaching enough that they warrant a branch IMO. once you're ready, people can pull that branch on various platforms and beat on it. can you head toward a branch?

Assignee

•

24 years ago

Attached file nsLocalFileUnixDefs

Updated

•

24 years ago

Blocks: 100676

Assignee

Comment 74

•

24 years ago

pushing back

Target Milestone: mozilla0.9.5 → mozilla0.9.7

Assignee

•

24 years ago

NS_NewLocalFile(const char* path, PRBool followLinks, nsILocalFile* *result);
openANSIFileDesc(in string mode);
NS_GetSpecialDirectory(const char* specialDirName, nsIFile* *result)
attribute string persistentDescriptor;

These are the methods i have left to convert. persistentDescriptor remains as string. What about the others? ok to convert over to AString? Thanks --pete

•

24 years ago

Conrad, i think what we are shooting for in this bug is landing it in stages. 1- API change 2- removing FileSpec 3- removing uconv from io Sound good? --pete

Comment 85

•

24 years ago

I was thinking it would go in the reverse order from that. If we change all methods on nsIFile to be Unicode-only, we are forcing all consumers to use methods which go through Unicode -> FS charset conversion which has problems. Getting rid of that conversion is, to me, the reason for making this API change, but that requires change to the back-end impls of nsLocalFile.

Assignee

•

24 years ago

Heh, looks like mysql can't deal w/ it either. Oh well. Anyway, the point i'm trying to make is we can write this out to the file system now; it's up to the system's locale to interpret it. For unix, using UTF-8 should be the reasonable solution in this case. --pete

Comment 92

•

24 years ago

ack, that just seems lame. Let's say I do a File->Save Page As... and save it as my name, but my name is in Japanese. You're proposing we ignore the system locale (right?) and save it in UTF8. Now I go to open it with emacs, and look for my file. Instead of seeing my name, I see some garbled string. that's not cool.

Assignee

Comment 93

•

24 years ago

Here is the problem as i see it. A local system may not support unicode conversions natively. This is the case on many unix systems pre glib 2.2. We have a mozilla string that has been converted to UCS2. Our string's original locale charset mapping is now gone, replaced by unicode. We are now ready to convert this string back and pass it off to a libc call. We have no native converters. If i use wcstombs, it expects that the first conversion was made based on the system's locale, whereas mozilla converted using ISO 10646 (i believe). The resulting conversion back to a multibyte string will be incongruous. The systems that support a native unicode conversion should be fine. The problem is that a large number of systems out there don't. At least w/ UTF-8 there is a shot that if the system is ISO_8859-1, the chars will map correctly. If your locale is Japanese, you have the glyphs and are running in UTF-8, you'll see your file. I just see a great deal of platform specific work to accommodate the few unix systems that have good unicode support implemented. Comments from any unix hackers out there? --pete

Mike Shaver (:shaver emeritus)

Comment 94

•

24 years ago

adding shaver for unix filesystem advice

•

24 years ago

Looking over the original goals, why don't we revise the objective:

- NS_FILE_CONTRACTID should go away (IMPLEMENTED)
- methods w/ [const] can remove those usages, as an "in" parameter specification implies "const" (IMPLEMENTED)
- remove all foo*FollowLinks and use boolean flag followLinks (IMPLEMENTED)
- fully document semantics of the methods. make assumptions clear (TODO)
- spawn should go away. (IMPLEMENTED)
- nsILocalFile derives from nsIFile (IMPLEMENTED)
- Write intensive test suite, debug, fix and stabilize (STARTED)

I have most of this implemented on unix already. I will obviously have to redo everything to revert back to char * params. But these objectives make more sense to me. --pete

Comment 99

•

24 years ago

> Can someone tell me the big win here using double byte chars by default?

It avoids bad conversions between Unicode and the current system char set on systems where the file system *is* Unicode. When we want to store the path to a file in, say, the profile registry, we use Unicode paths. When the Unicode path is later read from the registry and then passed to InitWithUnicodePath, the Unicode path is converted into a char* path by code in nsLocalFileCommon. If the system char set when InitWithUnicodePath is called is different from when the path was retrieved by GetUnicodePath, that conversion will be wrong. If the API is Unicode-only and the implementation uses Unicode FS routines, we never have to do any conversions. If that's not possible on Unix like it is on Windows (NT, 2K) and Mac (OS9, OSX) those implementations can be changed and weaned off of the conversion code in nsLocalFileCommon.
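A small sketch of why the round trip depends on the system char set: the same Unicode path yields different bytes under different encodings, and some encodings cannot represent it at all. The two encoders below are hypothetical helpers written for illustration (BMP code units only, no surrogate handling); they are not the conversion code in nsLocalFileCommon.

```cpp
#include <cassert>
#include <string>

// Encode UTF-16 code units as ISO-8859-1. Fails for anything above U+00FF,
// which is what happens to e.g. Japanese file names under a Latin-1 locale.
bool EncodeLatin1(const std::u16string& in, std::string& out) {
    out.clear();
    for (char16_t c : in) {
        if (c > 0xFF) return false;
        out.push_back(static_cast<char>(c));
    }
    return true;
}

// Encode UTF-16 code units as UTF-8 (sketch: surrogate pairs not handled).
std::string EncodeUtf8(const std::u16string& in) {
    std::string out;
    for (char16_t c : in) {
        assert(c < 0xD800 || c > 0xDFFF);  // BMP, non-surrogate input only
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else if (c < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xE0 | (c >> 12)));
            out.push_back(static_cast<char>(0x80 | ((c >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

A path stored as Unicode and later converted under a different charset than the one the file was created with produces a byte string that no longer names the file, which is the failure the Unicode-only API is meant to avoid.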

Assignee

Comment 100

•

24 years ago

Agree, that makes sense. On unix we can do it for glib 2.2 systems as well. But that implies we are still keeping the unicode API's. Then we are just getting rid of the uconv dependencies where possible. Ok, sorry for the misunderstanding. --pete

Assignee

Comment 101

•

24 years ago

How about a solution like this for unix systems.

// Unicode interface Wrapper
NS_IMETHODIMP
nsLocalFile::InitWithUnicodePath(const PRUnichar *filePath)
{
   wchar_t wpath[BUFSIZ];
   char path[BUFSIZ];
   size_t len = 0;
   while (filePath[len] != L'\0')
     wpath[len] = filePath[len++];
   wpath[len] = L'\0';
   wcstombs(path, wpath, BUFSIZ);
   return InitWithPath(path);
}

Can anyone see any problems w/ this? It is just converting the PRUnichar to the system wchar. Then wcstombs can take care of the system's current locale. Thanks --pete

Assignee

Comment 102

•

24 years ago

Turning it into a macro, we can ifdef the hooks for each system in nsLocalFileCommon, or should we just implement them in each platform nsLocalFile*.cpp file?

#define SET_UNIX_WS( func , arg ) \
{ \
   wchar_t wpath[BUFSIZ]; \
   char path[BUFSIZ]; \
   size_t len = 0; \
   while (arg[len] != L'\0') \
     wpath[len] = arg[len++]; \
   wpath[len] = L'\0'; \
   wcstombs(path, wpath, BUFSIZ); \
   return (func)(path); \
}

Brendan Eich [:brendan]

Comment 103

•

24 years ago

pete: don't do len++ in the right-hand-side of an assignment to foo[len] if you want the index to be the pre-incremented value. (Also, nit: 3-space indent, then 2?!). Can we tune for the case where sizeof(wchar_t) == sizeof(PRUnichar) and just cast? /be

Comment 104

•

24 years ago

> or should we just implement them in each platform nsLocalFile*.cpp file? Yes. On other platforms, the file system takes unicode directly so we don't need to do this kind of conversion. Separating out the unicode wrapper routines is bug 103384.

Assignee

Comment 105

•

24 years ago

> if you want the index to be the pre-incremented value.

In this case Brendan we don't want the pre increment, we are starting at 0.

> (Also, nit: 3-space indent, then 2?!).

I thought "when in Rome" . . .

> Can we tune for the case where sizeof(wchar_t) == sizeof(PRUnichar)
> and just cast?

A more complete macro:

#define SET_WS( func , arg ) \
{ \
   char path[BUFSIZ]; \
   if (sizeof(wchar_t) == sizeof(PRUnichar)) { \
     if (wcstombs(path, (wchar_t *)arg, BUFSIZ) < 0) \
       return NS_ERROR_FAILURE; \
   } else { \
     wchar_t wpath[BUFSIZ]; \
     size_t len = 0; \
     while (arg[len] != L'\0') \
       wpath[len] = arg[len++]; \
     wpath[len] = L'\0'; \
     if (wcstombs(path, wpath, BUFSIZ) < 0 ) \
       return NS_ERROR_FAILURE; \
   } \
   return (func)(path); \
}
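For comparison, here is a hedged sketch of the same conversion as a plain function, with three changes relative to the macro: the index is incremented in its own step rather than inside the assignment, the buffers are sized from the input instead of a fixed BUFSIZ, and the wcstombs failure value is tested as `(size_t)-1` (the return type is unsigned, so a `< 0` test can never be true). `WideToNative` is a hypothetical name, `char16_t` stands in for PRUnichar, and surrogate pairs are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: converts a NUL-terminated 16-bit string to the
// current locale's multibyte encoding. Returns false if some character
// cannot be represented in that locale.
bool WideToNative(const char16_t* src, std::string& out) {
    assert(src != nullptr);

    // Widen one element at a time; this works whatever sizeof(wchar_t) is.
    std::wstring wide;
    for (std::size_t i = 0; src[i] != u'\0'; ++i)
        wide.push_back(static_cast<wchar_t>(src[i]));

    // Worst case: every wide character needs MB_CUR_MAX bytes, plus the NUL.
    std::vector<char> buf(wide.size() * MB_CUR_MAX + 1);
    std::size_t written = std::wcstombs(buf.data(), wide.c_str(), buf.size());
    if (written == static_cast<std::size_t>(-1))
        return false;  // unrepresentable character in this locale

    out.assign(buf.data(), written);
    return true;
}
```

The result still depends on the process locale (set via `setlocale`), so this only shows the mechanics of the call, not a fix for the charset mismatch discussed in the thread.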

Comment 106

•

24 years ago

This won't work in the case where we force wchar_t to be short (-fshort-wchar, see http://lxr.mozilla.org/seamonkey/source/configure.in#1968).

Brendan Eich [:brendan]

Comment 107

•

24 years ago

Intentional error returns are: NS_ERROR_FILE_INVALID_PATH NS_ERROR_FILE_ALREADY_EXISTS If you see any other errors in the test output, then there is a problem. --pete

Comment 128

•

24 years ago

Hey conrad - check out bug 110371 - I found with nsFileSpec that there are actually very few consumers of the two routines that required conversion, and they were all in mail. The same may be true of the nsFileSpec routines in nsLocalFileUnicode. once I land bug 110371, I'll take a look at these too.

Assignee

Updated

•

24 years ago

Blocks: 99160

Assignee

Comment 129

•

24 years ago

pushing back

Target Milestone: mozilla0.9.7 → mozilla0.9.9

Comment 130

•

24 years ago

Is this really going to land for .99. Or can we continue to use and support the current nsIFile?

Comment 131

•

24 years ago

Comment on attachment 54447 (proposed implementation for windows)

So in general I like what's going on here. Why don't we #ifdef stuff out and get this landed on a platform-by-platform basis? sr=alecf on the windows stuff

Attachment #54447 - Flags: superreview+

Comment 132

•

24 years ago

Comment on attachment 54446 (added some logic to test for null pointers, got rid of damn three spaces)

this part looks good but where is nsLocalFileUnixDefs.h? (or is it in a patch above? I didn't feel like checking them all)

Attachment #54446 - Flags: superreview+

Assignee

Comment 133

•

24 years ago

We can't land this stuff at this point. This is too big a change and IMHO way too risky. At the time i posted the patch back in October, it was humming nicely and pretty well tested on unix. I have since landed nsIFile stuff that will break these patches. I only did the unix stuff. The windows and mac need *lots* of love. I don't have access to a win or mac to develop on, which sucks. ;-( Not sure how the wide char conversions will fly on all flavors of unix either. If i had full time to devote to this, doing it wouldn't even be an issue. But w/ my current rate of one small patch per week, there is no way i can make it fly. If you guys want to go for it, i can handle the unix end of things. I will need to get a new patch in order as this one will certainly fail. --pete

Comment 134

•

24 years ago

ok, I'll see what I can do to at least get windows landed.. then we can proceed with linux and mac.

Comment 135

•

24 years ago

I'm putting together the Mac equivalent of attachment 54447.

Assignee

•

24 years ago

The reason it was in this bug, is because using AStrings in nsIFile methods is an API change. ;-) --pete

Comment 140

•

24 years ago

this is true. I guess I was using that bug as part tracking bug, part minor-cleanup-patch-container :)

Comment 141

•

24 years ago

pete how much of this is left to go? Does this bug cover removal of the UNICODE function in favor of a soon-to-be-here utf8 string (which is the only change that I want to see)?

Assignee

Comment 142

•

24 years ago

We still need to land bug 100676 on unix. I am waiting for testing and review. Since everything in this bug is being done incrementally in separate bugs, perhaps we should break out removing the unicode methods and using AStrings as params into its own bug. --pete

Comment 143

•

24 years ago

Here's what I'd like to see happen:

1. Combine all the char*/PRUnichar* pairs of methods to one method which takes an nsAString (or the soon-to-be utf8 string doug mentioned)
2. Drop the ...FollowingLinks methods and make sure the implementations actually observe the followLinks attribute.
3. On nsILocalFile, move reveal() and launch() to some service. (Maybe)
4. On the directoryEntries attribute, specify whether the entries that are returned by the enumerator have symlinks resolved. This is independent of the followLinks attribute of the parent. The items returned should not just inherit this attribute from their parent.
5. Add a relativeDescriptor attribute to nsILocalFile. This is needed to get XP, relative paths in prefs and such instead of the absolute native paths being used now.

Anything else? I'd like to get nsIFile.idl and nsILocalFile.idl nailed down soon as a roadmap for other bugs.

Comment 144

•

•

24 years ago

Because the component mgr is doing that with native paths. The point of what I'm after is that it's XP, i.e. a prefs file written on Unix is readable on a Mac.
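One way to picture a cross-platform relative descriptor is a path stored relative to some base directory and written with a fixed separator. The sketch below uses C++17 `std::filesystem` purely as an illustration; `RelativeDescriptor` and `ResolveDescriptor` are hypothetical names, and the actual descriptor format is not specified anywhere in this bug.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: express `file` relative to `base`, using '/' as the
// separator regardless of platform. Purely lexical; the disk is not touched.
std::string RelativeDescriptor(const fs::path& file, const fs::path& base) {
    assert(!base.empty());
    return file.lexically_relative(base).generic_string();
}

// Hypothetical inverse: rebuild a path from a stored descriptor and
// whatever the base directory is on the current machine.
fs::path ResolveDescriptor(const std::string& descriptor, const fs::path& base) {
    assert(!base.empty());
    return (base / descriptor).lexically_normal();
}
```

Because only the relative part is persisted, a prefs file written under one profile root can be resolved against a different root on another machine.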

Assignee

Comment 155

•

24 years ago

I can work on (2), unix/windows. If we aren't going to set followLinks when we initWithPath, then we need to make it a setter. Right now it is readonly. FollowLinks has been broken for a very long time; i wonder what kind of surprises we'll find when it actually works. eg: when followLinks is fixed and is set to true, when you call GetPath and the file is a symlink, you will get the link's ultimate target, which is *not* the behavior we have now. As shaver would say, "I'll eat my hat" if this doesn't cause some regressions somewhere. Just a heads up. --pete
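The behavior change can be modeled without touching a real file system. In this sketch the `links` map is a hypothetical in-memory table from symlink path to target, and `GetPath` is a free function written for illustration, not the nsIFile method.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical model: with followLinks off, the path is returned as given;
// with it on, links are chased until a non-link is reached.
std::string GetPath(const std::string& path,
                    const std::map<std::string, std::string>& links,
                    bool followLinks) {
    assert(!path.empty());
    if (!followLinks) return path;

    std::string current = path;
    // Bound the number of hops so a link cycle cannot loop forever.
    for (int hops = 0; hops < 32; ++hops) {
        auto it = links.find(current);
        if (it == links.end()) return current;  // reached the ultimate target
        current = it->second;
    }
    return path;  // too many hops (likely a cycle): fall back to the input
}
```

Callers that today rely on getting the link's own path back would see the target instead once the flag is honored, which is the kind of regression the comment above warns about.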

Assignee

Comment 156

•

•

24 years ago

I don't buy that argument. The problem with it is that we do offer a persistent descriptor which can be used, and that should preserve any and all "unicode" state and do the right stuff in your above example. We will **not** provide a unicode only interface. Most users just want char*'s. If we have to make any change, we will remove the unicode cruft and make these APIs pass UTF8 encoded strings.

> If we can't get rid of the char*/PRUnichar* duality mess that we currently have on nsIFile, the API is not worthy of being frozen.

Worthiness has nothing to do with it. Look at the windows API, where there are wide and narrow APIs. Dualities exist. My thoughts are that we either do nothing or we remove UNICODE and convert to UTF8. I am leaning toward the former.

Assignee

Comment 160

•

24 years ago

> My thoughts are that we either do nothing

Doing nothing means that we take what we have now, fix it, stress test it, then bake it for 5 weeks at 400 degrees.

> (1) Boot your machine in JA locale and create some files with multi-byte chars.
> (2) Boot again in US locale.
> (3) Try and use the files.

Agree this is certainly a problem, but in reality, how many people will actually be doing this? I think it needs to be clearly spelled out as a known 1.0 local file issue. I personally believe that we will gain so much more by taking what we have and using this time to make it rock solid. --pete

Comment 161

•

Updated

•

24 years ago

Target Milestone: mozilla1.0 → Future

Asa Dotzler [:asa]

Comment 169

•

24 years ago

mozilla1.0- in favor of 99160

Keywords: mozilla1.0+ → mozilla1.0-

saari (gone)

Updated

•

24 years ago

Keywords: topembed → embed, topembed-