Closed Bug 80722 Opened 23 years ago Closed 23 years ago

crash creatng new profile

Categories

(Core Graveyard :: Profile: BackEnd, defect)

PowerPC
Mac System 9.x
defect
Not set
blocker

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: tracy, Assigned: sfraser_bugs)

References

Details

(Keywords: crash, smoketest)

Attachments

(5 files)

seen on mac commercial build 200-10-14-04-trunk

-open profile manager
-select manage profiles
-select create new profile
-click next in the window that appears

crash everytime.

stack trace to follow
Keywords: smoketest
Style in the stack, I'm looking...
Keywords: crash
Since I created a number of dupes this morning I'll comment on this one.  This 
does not appear to be related to 80746.  The only events showing up in this are 
a 'click' event and a 'load' event followed by the crash in style.  The event 
problem in 80746 stemmed from 'change' events.  However, anyone looking to test 
this theory can update mozilla/layout/html/forms/src to pick up my patch.
This might be my fault.  Try the patches in 80775 and 80748 and see if that helps.
Joe, can you test it on Mac? For me it doesn't fix the problem but my stack trace is different, maybe I have another 
problem!
the patches joe pasted are mac classic only. 

Tracy, were you using modern or classic when you crash on a new profile for the mac?
modern skin.
If this is crashing on commercial, then my patches won't help at all since the
commercial profile manager always uses Modern
I am crashing using a Mozilla debug build. The crash occurs so early the the new profile doesn't have time to be 
created!
Can we try this with no existing profiles? I was told that profiles created with 
a bad build (circa last Thursday) could cause this...
I tried after cleaning out old profiles and Mozilla registry etc.

I crash as soon as I press [next] button on first page of Create Profile Wizard
I just tried this with both a opt seamonkey build Friday and one a debug one 
that I build yeterday. All worked in creating a new profile.

Then...I deleted any Netscape or Mozilla prefs in the prefs folder AND deleted 
my Mozilla folder under Documents (all my profiles and registry). Started up 
Debug and I get all these ASSERTIONS in XBL binding (file XBLService.cpp line 
979) but it does let me through but we seem unstable.
for the record, I just landed fixes to all the missing mac classic XBL bindings,
so that should cure your assertion woes.
I'm seeing heap corruption while in the Profile Wizard (double free). Someone 
should run Purify over this.
This looks to me like XPCDOM/xpconnect badness. cc some folks.
Last few xpconnect method calls before the double free:

>>>>>>>> 300 calls on XPCWrappedNatives made.  (442)
nsIXPCComponents::classes 2 ( js->c ) 
nsIJSCID::createInstance 1 ( js->c ) 
nsILocalFile::initWithUnicodePath 0 ( js->c ) 
nsIDOMHTMLInputElement::value 1 ( js->c ) 
nsILocalFile::appendUnicode 0 ( js->c ) 
nsIDOMWindow::document 0 ( js->c ) 
nsIDOMXULDocument::getElementById 0 ( js->c ) 
nsIDOMXULElement::hasChildNodes 0 ( js->c ) 
nsIDOMWindow::document 0 ( js->c ) 
<--------double free occurs after this call.
The logging is printed *after* the call is complete. This is probably crashing 
in the cleanup on the call *after* the last one listed.
My NT build just finished. I'll try to investigate further.
Maybe unrelated:

Isn't this bad?

    PRBool IsPtrData()      const {return (PRBool) (flags & PTR_IS_DATA);}
    PRBool IsValAllocated() const {return (PRBool) (flags & VAL_IS_ALLOCD);}
    PRBool IsValInterface() const {return (PRBool) (flags & VAL_IS_IFACE);}
    PRBool IsValArray()     const {return (PRBool) (flags & VAL_IS_ARRAY);}
    PRBool IsValDOMString() const {return (PRBool) (flags & VAL_IS_DOMSTR);}

You'll return PRBools here that have bits set other than bit 0, which is nasty.
The cast to a PRBool won't cause any bit-shifting, methinks.
So the call that's doing bad things goes to

NS_IMETHODIMP  
nsLocalFile::GetUnicodePath(PRUnichar **_retval)
{
   GET_UCS(GetPath, _retval);
}

and the nsFSStringConversion::FSToNewUCS() call fails. However, because of a bug, 
it fails to set the out param to null. Patch coming for that...
We're hitting that failure code because a CreateInstance of a 
kUnicodeDecodeHelperCID fails.
...and the reason for that is some carelessness by kandrot. patch coming.
Looks like GET_UCS leaks tmp if nsFSStringConversion::FSToNewUCS fails.  Should
use an nsXPIDLCString tmp, eh?

/be
I have this nailed.
Assignee: ccarlen → sfraser
For what it's worth, kandrot didn't make those changes, he only checked them in
for a consultant that doesn't have cvs access, kandrot is not the one to blamed
for this.
sfraser: why do out params have to be nulled (*aOut = nsnull) in cases where a
failure code (res failed NS_SUCCEEDED(res) testing just above) is being
returned?

/be
r/sr=jband on both of smfr's patches
be: the out param was being left uninitialized, though an error was being 
returned. Maybe this exposes an xpconnect bug?
simon please go ahead and check your fix in now. I'll open the tree once you
check in. Oh wait I see pinkerton just filed a new blocker so I won't be opening
the tree. But please check this in. Thanks for stepping up and working on this. 
brendan: I think we generally agreed that the caller is responsible for cleanup 
of non-null'd 'out' params in failure cases. xpconnect assumes this. We wanted 
to not make callees have 10 million cleanup paths. No?
smfr: the out param was free'd but not null'd. right?
I checked the fixes in.

I also verified that xpconnect does indeed try to PR_Free the out param (which 
happened to contain the address of an already-freed pointer), even 
when an error is returned.

I'm gonna mark this fixed to open the tree. We need new bugs on the PRBool/flags  
issue, and maybe one on xpconnect freeing uninitialized out params.
Fixed for real.
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
r=jband's patches for the PRBool/flags badness.
> Looks like GET_UCS leaks tmp if nsFSStringConversion::FSToNewUCS fails.  Should
> use an nsXPIDLCString tmp, eh?

Right. Want another bug?
jband: the call_ee_ is responsible for cleaning up (freeing memory) allocated
optimistically to out params, before the callee fails.  The caller can assume
nothing about out param results of pointer type if NS_FAILED(rv).  I'm confused.

sfraser: those *aOut = nsnull set storage already initialized to point at newly
malloc'd memory that was then freed just above the (post-patch) *aOut = nsnull;
statements.  So the out params were initialized, but no memory was leaked, and
the dangling pointers returned to callers should not have mattered, because of
the NS_FAILED return value.  Or so the rules of XPCOM as I understand them say!

/be
Ok, Simon verified that XPConnect is freeing out params after failure.  I don't
think that is the right thing to do (someone quote Box at me again).  Jband?

/be
brendan: I dug out this:
news://news.mozilla.org/3727F198.98D96434%40netscape.com
I know we discussed this more than once. xpconnect cleans up params 
(in,out,and inout) regardless of call success.
sr=jst for jband's patches.
jband: the posting you cite (thanks for finding that) ends with you agreeing
with me, so *why* does XPConnect free out params in the caller irrespective of
call failure?

I thought I was victorious!

QI is special, but we are not talking about it.  We're dealing with string out
params.  XPConnect is risking a double free and crash, where if it did what I
think is the right thing, the worst that might happen, *due to a very real bug
in the callee's impl*, would be a leak.

Want a new bug?

/be
brendan: That posting was part of a thread and not the only time this was 
discussed. We can discuss this in the newsgroups again if you want. But I'll 
argue that the current behavior in xpconnect is correct. If the callee didn't 
set the out pointer there'd be nothing to clean up. If the callee is then going 
to free that out pointer it should null the pointer too to indicate that state 
unambiguously. I was pointing out that *other* generic layers like xpconnect 
make the same assumption xpconnect is making. I think things get worse, not 
better, if you change this. The bug here was code that was *thying* to follow 
this rule but was simply written incorrectly.
Sorry about being "careless".  I volunteered to help out Ron Guilmette, who does
not have CVS access. After his code had been r= and sr= 'ed, it was given to me
to checked in for them, since they had too many other things on their plate.

I will try not to do this again.
jband: I will raise this issue in m.xpcom; I thought we had decided to leave the
burden of cleanup on the callee in case of callee failure, which is ambiguous in
unsafe ways for callers who want to clean up too.  I repeat that the failure
mode is worse if callers try (crash) than if they don't (leak).  Plus the code
bloat if callers are numerous w.r.t callees (they generally are; besides, most
of our hand-crafted callers *do not* clean up out params that are possibly set
or dangling -- only the generic callers such as xpconnect and xpcom/proxy do --
so things are inconsistent besides being unsafe).

Ok, I'm done rehearsing arguments.  Off to m.xpcom.

r/sr=brendan@mozilla.org on jband's xpconnect boolean tidying.

/be
*** Bug 80752 has been marked as a duplicate of this bug. ***
verified build 2001051504
Status: RESOLVED → VERIFIED
I take it back, and will not waste time in m.xpcom.  nsXPIDL[C]String's
getter_Copies means that out params must be null or else point to a copy whose
ownership has been transferred to the caller, whether or not the method call
bearing the getter_Copies actual argument expression fails.

The same situation exists for interface pointers where the caller passes a
getter_AddRefs(comPtr).

My brain must have been stuck in the pre-nsXPIDLString past.  Sorry about that.

/be
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: