Closed Bug 54740 Opened 24 years ago Closed 18 years ago

[MLK] foo(bar.ToNewCString()) leaks

Categories

(Core Graveyard :: Tracking, defect, P3)

defect

Tracking

(Not tracked)

RESOLVED INVALID
mozilla1.1alpha

People

(Reporter: jag+mozbugs, Assigned: jag+mozilla)

References

()

Details

(Keywords: memory-leak)

Attachments

(4 files)

printf("%s\n", foo.ToNewCString()); // this leaks PL_strcpy(bar, foo.ToNewCString()); // this leaks The URL should find most of these in lxr. Help me fix this :-)
I think these can be cleaned up by changing |foo.ToNewCString()| to |NS_ConvertUCS2toUTF8(foo)| (or perhaps, if compilers are in a bad mood, |NS_ConvertUCS2toUTF8(foo).GetBuffer()|).
er, not |GetBuffer|, though. If you had to say something ... it would be |.get()|, which surprisingly doesn't exist. I'll add this soon.
scc, do you know when |get| will be checked in? Is there a bug for it?
looks good (as I said on IRC) r=scc (as the module owner); you'll probably need someone _other_ than me to give you a super-review ... perhaps waterson?
I'm frightened that .get() returns a "char*" instead of a "const char*". Why?
huh? It looks like it returns a |const char*| to me. Did I read the patch wrong?
Oops, you're right. It does return a const char*. a=waterson
Checked in the second patch, and I believe scc checked in the first patch a while back. Will start converting soon.
Keywords: rtm
Is this really happening for the RTM?? Seems like the activity is on the trunk. If this is intended for the RTM, can someone describe the size and frequency of the leaks?
Whiteboard: [need info]
Working on this, will attach a patch in a bit.
So far the only non-debug places affected, it seems, are extensions/cookie/nsCookie.cpp#1984, a number of PL_strcpy lines in profile/Acct/nsAccount.cpp, and a number of PL_strdup lines in profile/Acctidl/nsAccountServices.cpp, but those last two don't seem to be getting built.
Accepting this. Shhh, bugzilla.
Status: NEW → ASSIGNED
Well, the one problem I could find with this patch is that |permission_Add| consumes the string ... so |ToNewCString| is still right for that case. Other than that, this looks OK. I hesitate to make you attach a whole new patch, when all you have to do is just back out that one file: "nsCookie.cpp". If you do that, then sr=scc
Oops. That one must've slipped through, I thought I verified any callers which weren't PL_* or printf... Yeah, I'll undo that change, nice catch. Yay for review :-) So next up, approval!
Keywords: approval, patch
nominating for mozilla0.9.
Keywords: mozilla0.9
Moving to Tracking component per Asa's suggestion.
Component: Browser-General → Tracking
Removing rtm nomination
Keywords: rtm
Whiteboard: [need info]
r=akkana for the Citer stuff. Note, these printfs aren't normally turned on (it's just there in case someone needs to turn it on temporarily for debugging), but fixing leaks is a good idea anyway.
So today's worries are: - Won't this result in printf-ing garbage sometimes because of UTF8 (>=0x80)? - Won't the wrapper object go out of scope before printf gets to its content? scc?
It seems attachment 17598 [details] [diff] [review] should have the following changes: * nsCookie.cpp changes removed (as scc said above) * obj->NS_ConvertUCS2toUTF8(foo) changed to NS_ConvertUCS2toUTF8(obj->foo) in: + widget/src/mac/nsTextHelper.cpp + widget/src/os2/nsFontRetrieverService.cpp The NS_Convert... object shouldn't go out of scope until after the function call. We rely on that pattern elsewhere, and it's the whole point of using NS_ConvertUCS2toUTF8. nsString::ToNewCString() works through CopyChars2to1 (at least I think so, based on going through the source code), defined at http://lxr.mozilla.org/seamonkey/source/xpcom/ds/bufferRoutines.h#287 , but NS_ConvertUCS2toUTF8 really does conversion to UTF8. Therefore there is the possibility about problems printing characters not intended for printing that didn't exist before. I'm not sure what should be done about this. Perhaps we need NS_ConvertUCS2toASCII that does a lossy conversion similar to the one done by CopyChars2to1 (but maybe even without asserting). However, CopyChars2to1 did assert when it encountered loss in its conversion, so maybe this isn't a problem.
Changes made, nice catch there. dbaron suggested the name NS_LossyConvertUCS2toASCII in #mozilla, I like that even better.
I double-checked in the debugger, and |nsString::ToNewCString| does call |CopyChars2To1|: #0 CopyChars2To1 (aDest=0x806f548 "", anDestOffset=0, aSource=0xbfffe570 "s", anOffset=0, aCount=48) at /builds/seamonkey/mozilla/xpcom/ds/bufferRoutines.h:288 #1 0x400e20f1 in nsStr::StrAppend (aDest=@0xbfffe49c, aSource=@0xbfffe418, anOffset=0, aCount=48) at /builds/seamonkey/mozilla/xpcom/ds/nsStr.cpp:185 #2 0x400e4799 in nsCString::AssignWithConversion (this=0xbfffe498, aString=0xbfffe570, aCount=48) at /builds/seamonkey/mozilla/xpcom/ds/nsString.cpp:796 #3 0x400e4803 in nsCString::AssignWithConversion (this=0xbfffe498, aString=@0xbfffe558) at /builds/seamonkey/mozilla/xpcom/ds/nsString.cpp:801 #4 0x400e7478 in nsString::ToNewCString (this=0xbfffe558) at /builds/seamonkey/mozilla/xpcom/ds/nsString2.cpp:614
So a pattern I'm often seeing is: { nsCAutoString temp; temp.AssignWithConversion(foo); printf("bla %s bla\n", temp.get()); } (or (const char*)temp or NS_STATIC_CAST(const char*, temp), and not only for printf). This can't just be replaced with printf("bla %s bla\n", NS_ConvertUCS2toUTF8(foo).get()); since the conversion is different. (Lossy)ConvertUCS2toASCII could be of help here too. Another thing I'm seeing is that NS_ConvertUCS2toUTF8 is used a lot just to go from (const) PRUnichar* to (const) char*, without there really being any indication a UTF8 string was expected to be passed in. </topic-drift>
jag: the only portable way to use %s is to give it ASCII, even if that means chopping the high 9 bits. Does NS_ConvertUCS2toASCII chop high 8, or high 9? In any case, people who are making ASCII *or* UTF-8 strings just to satisfy const char * in-param types, without the encoding being an explicit part of the interface's contract, are creating dataloss, or rather corruption, bugs. File 'em if you can. If most of these are for debugging message formatting, then no worries, although someone reassure me that NS_ConvertUCS2toASCII chops to 7 bits. /be
NS_ConvertUCS2toASCII doesn't exist. nsCString::AssignWithConversion does chop to 8 bits. So to copy that do we want NS_LossyConvertUCS2toISO8859-1 or NS_LossyConverteUCS2toASCII?
I wouldn't add these lossy converters at all, but their lack seems to have left people abusing NS_ConvertUCS2toUTF8 -- maybe that's the lesser evil? We need to audit all such calls and make sure they're supplying data to interfaces spec'd and impl'd to accept UTF-8, or they're debugging crutches (which I suppose we can tolerate, even if %s would barf on 8'th-bit-set chars). /be
Or they're using Assign- and AppendWithConversion, which replaces unicode chars U+0100 - U+FFFF with a '.'. More data loss. Do we have something like PRUTF8 so we can indicate in a method's signature we really mean UTF8, not char?
Assignee: jag → jaggernaut
Status: ASSIGNED → NEW
-> mozilla1.0
Keywords: correctness
Target Milestone: --- → mozilla1.0
Blocks: 92580
No longer blocks: 92580
Well, NS_LossyConvertUCS2toASCII (well, really ALMOST-ISO-LATIN-1) exists now.
-> 1.1
Target Milestone: mozilla1.0 → mozilla1.1
foo.ToNewCString() doesn't exist anymore.
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → INVALID
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: