crash creatng new profile

VERIFIED FIXED

Status

Core Graveyard
Profile: BackEnd
--
blocker
VERIFIED FIXED
17 years ago
2 years ago

People

(Reporter: tracy, Assigned: Simon Fraser)

Tracking

({crash, smoketest})

Trunk
PowerPC
Mac System 9.x
crash, smoketest

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(5 attachments)

(Reporter)

Description

17 years ago
seen on mac commercial build 200-10-14-04-trunk

-open profile manager
-select manage profiles
-select create new profile
-click next in the window that appears

crash everytime.

stack trace to follow
(Reporter)

Comment 1

17 years ago
Created attachment 34364 [details] [diff] [review]
stack trace of create profile crash
(Reporter)

Updated

17 years ago
Keywords: smoketest

Comment 2

17 years ago
Style in the stack, I'm looking...

Updated

17 years ago
Keywords: crash

Comment 3

17 years ago
Since I created a number of dupes this morning I'll comment on this one.  This 
does not appear to be related to 80746.  The only events showing up in this are 
a 'click' event and a 'load' event followed by the crash in style.  The event 
problem in 80746 stemmed from 'change' events.  However, anyone looking to test 
this theory can update mozilla/layout/html/forms/src to pick up my patch.

Comment 4

17 years ago
This might be my fault.  Try the patches in 80775 and 80748 and see if that helps.
Joe, can you test it on Mac? For me it doesn't fix the problem but my stack trace is different, maybe I have another 
problem!

Comment 6

17 years ago
the patches joe pasted are mac classic only. 

Tracy, were you using modern or classic when you crash on a new profile for the mac?
(Reporter)

Comment 7

17 years ago
modern skin.

Comment 8

17 years ago
If this is crashing on commercial, then my patches won't help at all since the
commercial profile manager always uses Modern
I am crashing using a Mozilla debug build. The crash occurs so early the the new profile doesn't have time to be 
created!

Comment 10

17 years ago
Can we try this with no existing profiles? I was told that profiles created with 
a bad build (circa last Thursday) could cause this...

Comment 11

17 years ago
I tried after cleaning out old profiles and Mozilla registry etc.

I crash as soon as I press [next] button on first page of Create Profile Wizard

Comment 12

17 years ago
I just tried this with both a opt seamonkey build Friday and one a debug one 
that I build yeterday. All worked in creating a new profile.

Then...I deleted any Netscape or Mozilla prefs in the prefs folder AND deleted 
my Mozilla folder under Documents (all my profiles and registry). Started up 
Debug and I get all these ASSERTIONS in XBL binding (file XBLService.cpp line 
979) but it does let me through but we seem unstable.

Comment 13

17 years ago
for the record, I just landed fixes to all the missing mac classic XBL bindings,
so that should cure your assertion woes.
(Assignee)

Comment 14

17 years ago
I'm seeing heap corruption while in the Profile Wizard (double free). Someone 
should run Purify over this.
(Assignee)

Comment 15

17 years ago
Created attachment 34427 [details]
stack for the double free
(Assignee)

Comment 16

17 years ago
This looks to me like XPCDOM/xpconnect badness. cc some folks.
(Assignee)

Comment 17

17 years ago
Last few xpconnect method calls before the double free:

>>>>>>>> 300 calls on XPCWrappedNatives made.  (442)
nsIXPCComponents::classes 2 ( js->c ) 
nsIJSCID::createInstance 1 ( js->c ) 
nsILocalFile::initWithUnicodePath 0 ( js->c ) 
nsIDOMHTMLInputElement::value 1 ( js->c ) 
nsILocalFile::appendUnicode 0 ( js->c ) 
nsIDOMWindow::document 0 ( js->c ) 
nsIDOMXULDocument::getElementById 0 ( js->c ) 
nsIDOMXULElement::hasChildNodes 0 ( js->c ) 
nsIDOMWindow::document 0 ( js->c ) 
<--------double free occurs after this call.

Comment 18

17 years ago
The logging is printed *after* the call is complete. This is probably crashing 
in the cleanup on the call *after* the last one listed.

Comment 19

17 years ago
My NT build just finished. I'll try to investigate further.
(Assignee)

Comment 20

17 years ago
Maybe unrelated:

Isn't this bad?

    PRBool IsPtrData()      const {return (PRBool) (flags & PTR_IS_DATA);}
    PRBool IsValAllocated() const {return (PRBool) (flags & VAL_IS_ALLOCD);}
    PRBool IsValInterface() const {return (PRBool) (flags & VAL_IS_IFACE);}
    PRBool IsValArray()     const {return (PRBool) (flags & VAL_IS_ARRAY);}
    PRBool IsValDOMString() const {return (PRBool) (flags & VAL_IS_DOMSTR);}

You'll return PRBools here that have bits set other than bit 0, which is nasty.
The cast to a PRBool won't cause any bit-shifting, methinks.
(Assignee)

Comment 21

17 years ago
So the call that's doing bad things goes to

NS_IMETHODIMP  
nsLocalFile::GetUnicodePath(PRUnichar **_retval)
{
   GET_UCS(GetPath, _retval);
}

and the nsFSStringConversion::FSToNewUCS() call fails. However, because of a bug, 
it fails to set the out param to null. Patch coming for that...
(Assignee)

Comment 22

17 years ago
Created attachment 34437 [details] [diff] [review]
nsLocalFileCommon.cp fixes for setting out params to null
(Assignee)

Comment 23

17 years ago
We're hitting that failure code because a CreateInstance of a 
kUnicodeDecodeHelperCID fails.
(Assignee)

Comment 24

17 years ago
...and the reason for that is some carelessness by kandrot. patch coming.
(Assignee)

Comment 25

17 years ago
Created attachment 34442 [details] [diff] [review]
Fix encode/decode mangling by kandrot
Looks like GET_UCS leaks tmp if nsFSStringConversion::FSToNewUCS fails.  Should
use an nsXPIDLCString tmp, eh?

/be
(Assignee)

Comment 27

17 years ago
I have this nailed.
Assignee: ccarlen → sfraser

Comment 28

17 years ago
Created attachment 34448 [details] [diff] [review]
fix the non{0,1} PRBool issues (though they are not causing problems really)
For what it's worth, kandrot didn't make those changes, he only checked them in
for a consultant that doesn't have cvs access, kandrot is not the one to blamed
for this.
sfraser: why do out params have to be nulled (*aOut = nsnull) in cases where a
failure code (res failed NS_SUCCEEDED(res) testing just above) is being
returned?

/be

Comment 31

17 years ago
r/sr=jband on both of smfr's patches
(Assignee)

Comment 32

17 years ago
be: the out param was being left uninitialized, though an error was being 
returned. Maybe this exposes an xpconnect bug?

Comment 33

17 years ago
simon please go ahead and check your fix in now. I'll open the tree once you
check in. Oh wait I see pinkerton just filed a new blocker so I won't be opening
the tree. But please check this in. Thanks for stepping up and working on this. 

Comment 34

17 years ago
brendan: I think we generally agreed that the caller is responsible for cleanup 
of non-null'd 'out' params in failure cases. xpconnect assumes this. We wanted 
to not make callees have 10 million cleanup paths. No?

Comment 35

17 years ago
smfr: the out param was free'd but not null'd. right?
(Assignee)

Comment 36

17 years ago
I checked the fixes in.

I also verified that xpconnect does indeed try to PR_Free the out param (which 
happened to contain the address of an already-freed pointer), even 
when an error is returned.

I'm gonna mark this fixed to open the tree. We need new bugs on the PRBool/flags  
issue, and maybe one on xpconnect freeing uninitialized out params.
(Assignee)

Comment 37

17 years ago
Fixed for real.
Status: NEW → RESOLVED
Last Resolved: 17 years ago
Resolution: --- → FIXED
(Assignee)

Comment 38

17 years ago
r=jband's patches for the PRBool/flags badness.
(Assignee)

Comment 39

17 years ago
> Looks like GET_UCS leaks tmp if nsFSStringConversion::FSToNewUCS fails.  Should
> use an nsXPIDLCString tmp, eh?

Right. Want another bug?
jband: the call_ee_ is responsible for cleaning up (freeing memory) allocated
optimistically to out params, before the callee fails.  The caller can assume
nothing about out param results of pointer type if NS_FAILED(rv).  I'm confused.

sfraser: those *aOut = nsnull set storage already initialized to point at newly
malloc'd memory that was then freed just above the (post-patch) *aOut = nsnull;
statements.  So the out params were initialized, but no memory was leaked, and
the dangling pointers returned to callers should not have mattered, because of
the NS_FAILED return value.  Or so the rules of XPCOM as I understand them say!

/be
Ok, Simon verified that XPConnect is freeing out params after failure.  I don't
think that is the right thing to do (someone quote Box at me again).  Jband?

/be

Comment 42

17 years ago
brendan: I dug out this:
news://news.mozilla.org/3727F198.98D96434%40netscape.com
I know we discussed this more than once. xpconnect cleans up params 
(in,out,and inout) regardless of call success.
sr=jst for jband's patches.
jband: the posting you cite (thanks for finding that) ends with you agreeing
with me, so *why* does XPConnect free out params in the caller irrespective of
call failure?

I thought I was victorious!

QI is special, but we are not talking about it.  We're dealing with string out
params.  XPConnect is risking a double free and crash, where if it did what I
think is the right thing, the worst that might happen, *due to a very real bug
in the callee's impl*, would be a leak.

Want a new bug?

/be

Comment 45

17 years ago
brendan: That posting was part of a thread and not the only time this was 
discussed. We can discuss this in the newsgroups again if you want. But I'll 
argue that the current behavior in xpconnect is correct. If the callee didn't 
set the out pointer there'd be nothing to clean up. If the callee is then going 
to free that out pointer it should null the pointer too to indicate that state 
unambiguously. I was pointing out that *other* generic layers like xpconnect 
make the same assumption xpconnect is making. I think things get worse, not 
better, if you change this. The bug here was code that was *thying* to follow 
this rule but was simply written incorrectly.

Comment 46

17 years ago
Sorry about being "careless".  I volunteered to help out Ron Guilmette, who does
not have CVS access. After his code had been r= and sr= 'ed, it was given to me
to checked in for them, since they had too many other things on their plate.

I will try not to do this again.
jband: I will raise this issue in m.xpcom; I thought we had decided to leave the
burden of cleanup on the callee in case of callee failure, which is ambiguous in
unsafe ways for callers who want to clean up too.  I repeat that the failure
mode is worse if callers try (crash) than if they don't (leak).  Plus the code
bloat if callers are numerous w.r.t callees (they generally are; besides, most
of our hand-crafted callers *do not* clean up out params that are possibly set
or dangling -- only the generic callers such as xpconnect and xpcom/proxy do --
so things are inconsistent besides being unsafe).

Ok, I'm done rehearsing arguments.  Off to m.xpcom.

r/sr=brendan@mozilla.org on jband's xpconnect boolean tidying.

/be
*** Bug 80752 has been marked as a duplicate of this bug. ***

Comment 49

17 years ago
verified build 2001051504
Status: RESOLVED → VERIFIED
I take it back, and will not waste time in m.xpcom.  nsXPIDL[C]String's
getter_Copies means that out params must be null or else point to a copy whose
ownership has been transferred to the caller, whether or not the method call
bearing the getter_Copies actual argument expression fails.

The same situation exists for interface pointers where the caller passes a
getter_AddRefs(comPtr).

My brain must have been stuck in the pre-nsXPIDLString past.  Sorry about that.

/be
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.