Last Comment Bug 529371 - crash [@ nsWeakFrame::Init(nsIFrame*) ] [@ nsWeakFrame::InitExternal(nsIFrame*) ] checking for extension updates
: crash [@ nsWeakFrame::Init(nsIFrame*) ] [@ nsWeakFrame::InitExternal(nsIFra...
Status: VERIFIED FIXED
: fixed1.9.0.19, regression, verified1.9.2
Product: Core
Classification: Components
Component: Disability Access APIs (show other bugs)
: 1.9.2 Branch
: x86 Linux
: -- normal with 1 vote (vote)
: ---
Assigned To: Robert O'Callahan (:roc) (Exited; email my personal email if necessary)
:
Mentors:
http://crash-stats.mozilla.com/report...
: 529397 529834 (view as bug list)
Depends on: 527466 529442 529834
Blocks: 504144 525579 529397
  Show dependency treegraph
 
Reported: 2009-11-17 13:26 PST by aja+bugzilla
Modified: 2014-07-21 10:39 PDT (History)
13 users (show)
mbeltzner: blocking1.9.2+
dveditz: blocking1.9.0.19+
dveditz: wanted1.9.0.x+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
beta4-fixed
.8-fixed


Attachments
patch 1 (3.53 KB, patch)
2009-11-17 16:42 PST, David Bolter [:davidb]
dbaron: review-
Details | Diff | Review
fix (1.14 KB, patch)
2009-11-20 03:09 PST, Robert O'Callahan (:roc) (Exited; email my personal email if necessary)
bzbarsky: review+
Details | Diff | Review
1.9.1 patch (1.29 KB, patch)
2010-01-26 12:27 PST, Robert O'Callahan (:roc) (Exited; email my personal email if necessary)
dveditz: approval1.9.1.8+
mbeltzner: approval1.9.0.19+
Details | Diff | Review

Description aja+bugzilla 2009-11-17 13:26:56 PST
1.9.2 nitely crashes during check for extension updates
Comment 2 David Bolter [:davidb] 2009-11-17 14:29:44 PST
Marco, can you recreate?
Comment 3 David Bolter [:davidb] 2009-11-17 16:42:36 PST
Created attachment 412976 [details] [diff] [review]
patch 1

David Baron, I was suspicious of UpdateFrame's use of mState.frame (which could be nulled in GetKids). Thoughts on this patch?
Comment 4 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2009-11-17 16:51:27 PST
Comment on attachment 412976 [details] [diff] [review]
patch 1

The only reason you can check mState.frame as a boolean is that it has an operator nsIFrame*.  That operator does the exact same thing as GetFrame(), so you're just returning GetFrame() ? GetFrame() : nsnull, which is the same as GetFrame().
Comment 5 David Bolter [:davidb] 2009-11-17 16:55:25 PST
Wow that's cool.
Comment 6 David Bolter [:davidb] 2009-11-17 17:08:02 PST
OK I assume mState.frame = nsnull matches [nsWeakFrame& operator=(nsIFrame* aFrame)] (and returns an nsWeakShell with a null frame)
Comment 7 David Bolter [:davidb] 2009-11-17 17:21:26 PST
Roc, any ideas/hunches?
http://crash-stats.mozilla.com/report/index/6457f9c6-e61a-40d6-bcbe-30fa52091117
http://crash-stats.mozilla.com/report/index/bp-97367631-cfe7-42cd-96fd-140552091117
(dbaron points out these might be the same crash given that linux central might use a different compiler)
Comment 8 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2009-11-17 17:57:10 PST
I don't know. If it shows up on Windows, a minidump would probably be helpful.
Comment 9 David Bolter [:davidb] 2009-11-17 18:05:51 PST
Roc, on windows seems to appear as bug 529442.
Comment 10 aja+bugzilla 2009-11-18 13:52:22 PST
Linux version of tryserver build mentioned here
https://bugzilla.mozilla.org/show_bug.cgi?id=529442#c30
still crashes.
Comment 11 chris hofmann 2009-11-18 19:57:36 PST
the 

the nsWeakFrame::InitExternal has only the one crash report but nsWeakFrame::Init  is higher volume.  it was in single digits for aug, sept, oct, but began growing in nov. and spiked yesterday.

[single digits before 10/29 and release of 3.6b1]
11   total crashes for nsWeakFrame::Init on 20091029-crashdata.csv
15   total crashes for nsWeakFrame::Init on 20091101-crashdata.csv
13   total crashes for nsWeakFrame::Init on 20091102-crashdata.csv
16   total crashes for nsWeakFrame::Init on 20091103-crashdata.csv
13   total crashes for nsWeakFrame::Init on 20091104-crashdata.csv
14   total crashes for nsWeakFrame::Init on 20091105-crashdata.csv
23   total crashes for nsWeakFrame::Init on 20091106-crashdata.csv
15   total crashes for nsWeakFrame::Init on 20091107-crashdata.csv
19   total crashes for nsWeakFrame::Init on 20091108-crashdata.csv
18   total crashes for nsWeakFrame::Init on 20091109-crashdata.csv
27   total crashes for nsWeakFrame::Init on 20091110-crashdata.csv
26   total crashes for nsWeakFrame::Init on 20091111-crashdata.csv
12   total crashes for nsWeakFrame::Init on 20091112-crashdata.csv
15   total crashes for nsWeakFrame::Init on 20091113-crashdata.csv
13   total crashes for nsWeakFrame::Init on 20091114-crashdata.csv
17   total crashes for nsWeakFrame::Init on 20091115-crashdata.csv
15   total crashes for nsWeakFrame::Init on 20091116-crashdata.csv
48   total crashes for nsWeakFrame::Init on 20091117-crashdata.csv

distribution of all versions where the nsWeakFrame::Init crash was found on 20091117-crashdata.csv
  22 Firefox 3.7a1pre
  11 Firefox 3.6b4pre
  10 Firefox 3.5.5
   3 Firefox 3.6b2
   1 Firefox 3.1b3
   1 Firefox 3.0.15
Comment 12 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2009-11-18 20:00:43 PST
That's expected. The fix for bug 525579 moved a crash into there. Hopefully the fix for bug 529442 will remove it altogether.
Comment 13 David Bolter [:davidb] 2009-11-19 07:09:59 PST
Should we worry preventing a nsWeakFrame(aPoisonedFrame) crash (in Init*)?
Comment 15 David Bolter [:davidb] 2009-11-19 07:20:07 PST
Is nsWeakFrame::GetFrame guaranteed to return nsnull for a dead frame?
Comment 16 David Bolter [:davidb] 2009-11-19 07:51:07 PST
A fresh windows stack from MarcoZ:
http://crash-stats.mozilla.com/report/index/bp-6e723c79-9a9d-4b3d-93e5-728442091119
Comment 17 alexander :surkov 2009-11-19 07:53:20 PST
I wonder how curFrame->GetNextSibling() can return bad frame?
Comment 18 David Bolter [:davidb] 2009-11-19 07:57:34 PST
I spun off security bug 529834 as a dependency.
Comment 19 David Bolter [:davidb] 2009-11-19 07:58:36 PST
(In reply to comment #17)
> I wonder how curFrame->GetNextSibling() can return bad frame?

Yes, I guess it is possible that a frame doesn't know if its sibling is dead.
Comment 20 David Bolter [:davidb] 2009-11-19 07:59:11 PST
Possibly we'll need sibling deallocation to be atomic?
Comment 21 aja+bugzilla 2009-11-19 08:13:42 PST
(In reply to comment #14)
> Fresh Linux crash stacks from aja:
> http://crash-stats.mozilla.com/report/index/13e5ce2e-2108-47cb-b5ad-72ce22091119
> http://crash-stats.mozilla.com/report/index/2b361537-dc10-47ad-a316-c4e862091119
another....without any of the initial extension update (for new firebug release) dialog stuff the other two had:
http://crash-stats.mozilla.com/report/index/a96616e6-43ae-4c73-a997-51f1e2091119
Comment 22 Mike Beltzner [:beltzner, not reading bugmail] 2009-11-19 20:41:37 PST
Bolter: who's gonna own this?
Comment 23 David Bolter [:davidb] 2009-11-19 20:49:57 PST
*** Bug 529397 has been marked as a duplicate of this bug. ***
Comment 24 David Bolter [:davidb] 2009-11-19 21:14:16 PST
(In reply to comment #22)
> Bolter: who's gonna own this?

I'll stick my neck out for now.
Comment 25 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2009-11-20 02:46:40 PST
OK, it looks like this really is a bad frame tree. I see that the frame tree (for the Addons window, I think) contains an nsMenuPopupFrame whose mNextSibling is pointing to a poisoned frame.
Comment 26 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2009-11-20 02:58:09 PST
OK, this is all about nsPopupSetFrame. It keeps its child nsMenuPopupFrames in a special data structure that's not a regular frame list. When frames are added to or removed from this data structure, their mNextSibling/mPrevSibling pointers aren't updated.
Comment 27 alexander :surkov 2009-11-20 03:00:44 PST
Great work, Robert! Can you fix it? :)
Comment 28 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2009-11-20 03:09:34 PST
Created attachment 413578 [details] [diff] [review]
fix

Arguably accessibility shouldn't be looking at the children of the nsPopupSetFrame. But it's definitely unwise for nsPopupSetFrame to let its frame children have bogus prev/next sibling pointers.

I don't know why nsPopupSetFrame doesn't just keep its popup children in a regular nsFrameList. Something to ask Neil when he gets back.
Comment 29 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2009-11-20 03:10:07 PST
Neil, see comment #28...
Comment 30 Marco Zehe (:MarcoZ) 2009-11-20 03:36:34 PST
(In reply to comment #28)
> Arguably accessibility shouldn't be looking at the children of the
> nsPopupSetFrame.

If you have a better suggestion how we can get to the same info, we're all ears!
Comment 31 alexander :surkov 2009-11-20 04:10:07 PST
(In reply to comment #30)
> (In reply to comment #28)
> > Arguably accessibility shouldn't be looking at the children of the
> > nsPopupSetFrame.
> 
> If you have a better suggestion how we can get to the same info, we're all
> ears!

We just need to traverse the DOM in this case like I did in bug 527466.
Comment 32 aja+bugzilla 2009-11-20 07:27:06 PST
Tryserver build referenced at https://bugzilla.mozilla.org/show_bug.cgi?id=527466#c11 seems to fix this on m-c trunk linux.
Comment 33 David Bolter [:davidb] 2009-11-20 07:43:29 PST
Thanks aja, I think we want to take that patch and certainly Roc's.
Comment 34 Boris Zbarsky [:bz] 2009-11-20 08:46:51 PST
Comment on attachment 413578 [details] [diff] [review]
fix

r+.

Want to call this a fix for bug 504144?
Comment 35 David Bolter [:davidb] 2009-11-20 11:54:12 PST
Pushed Roc's fix on his behalf to central:
http://hg.mozilla.org/mozilla-central/rev/4b8962aad902
Comment 36 David Bolter [:davidb] 2009-11-20 12:44:45 PST
Landed on 192: http://hg.mozilla.org/releases/mozilla-1.9.2/rev/490f5eb3f100

Thanks Roc. Closing as fixed 192
Comment 37 David Bolter [:davidb] 2009-11-20 12:49:37 PST
*** Bug 529834 has been marked as a duplicate of this bug. ***
Comment 38 Marco Zehe (:MarcoZ) 2009-11-22 00:32:05 PST
Confirmed fixed on Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.3a1pre) Gecko/20091121 Minefield/3.7a1pre (.NET CLR 3.5.30729). Also the delay I was seeing in the try-server builds before this patch landed is gone again, the Add-Ons manager comes up instantly. Leaving on "RESOLVED" until Aja confirms the fix on Linux.
Comment 39 Marco Zehe (:MarcoZ) 2009-11-22 00:37:18 PST
Verified fixed on Windows in Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2b4pre) Gecko/20091121 Namoroka/3.6b4pre (.NET CLR 3.5.30729). Also I don't see the delay any longer that I saw with try-server builds of Namoroka after the fix of bug 529442, but before this fix got in. All is well from my end now, but I'll leave the verified1.9.2 keyword out until Aja can confirm that the problem is also fixed for him on Linux.
Comment 40 aja+bugzilla 2009-11-22 17:55:08 PST
verified1.9.2 with linux nightly build Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2b4pre) Gecko/20091122 Namoroka/3.6b4pre ID:20091122033700
Comment 41 Daniel Veditz [:dveditz] 2009-11-22 19:35:44 PST
Is this a good fix to take on older branches?
Comment 42 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2009-11-23 02:45:17 PST
Yes.
Comment 43 aja+bugzilla 2009-11-24 22:01:30 PST
(In reply to comment #40)
> verified1.9.2 with linux nightly build Mozilla/5.0 (X11; U; Linux i686; en-US;
> rv:1.9.2b4pre) Gecko/20091122 Namoroka/3.6b4pre ID:20091122033700

FWIW, working okay on linux 3.6b4 build1, too.
Comment 44 Marco Zehe (:MarcoZ) 2009-11-24 22:48:58 PST
Setting to verified based on recent comments. Thanks everyone!
Comment 45 Neil Deakin 2009-12-03 12:39:45 PST
(In reply to comment #28)

> I don't know why nsPopupSetFrame doesn't just keep its popup children in a
> regular nsFrameList. Something to ask Neil when he gets back.

Probably to avoid having to override all the various layout methods, and for the same reasons that positioned elements use a separate list.
Comment 46 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2009-12-03 13:29:04 PST
I mean, keep the frames out of the main child list (i.e. GetChildList(nsnull) returns an empty list), in nsGkAtoms::popupList like now ... but instead of storing the frames in a custom linked list structure, just use an nsFrameList.
Comment 47 Neil Deakin 2009-12-07 12:50:14 PST
No, I see no reason for the special framelist.
Comment 48 Daniel Veditz [:dveditz] 2010-01-12 15:18:17 PST
Don't know why I marked this one fixed on the 1.9.1 branch, perhaps confusion with the blocking flag. The patch in the bug definitely has not been checked into the 1.9.1 branch (not sure if it's correct for the branch as-is) and crash-stats still shows crashes as nsWeakFrame::Init(nsIFrame*) in 3.5.x
Comment 49 Daniel Veditz [:dveditz] 2010-01-26 10:59:44 PST
Roc: please get someone to work up a 1.9.1 version of this patch (does this one apply correctly? seems simple enough to), we'd like to take this.
Comment 50 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2010-01-26 12:27:02 PST
Created attachment 423598 [details] [diff] [review]
1.9.1 patch
Comment 51 Daniel Veditz [:dveditz] 2010-02-01 10:54:28 PST
Comment on attachment 423598 [details] [diff] [review]
1.9.1 patch

Approved for 1.9.1.8, a=dveditz for release-drivers
Comment 52 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2010-02-02 13:08:59 PST
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/6877694882ab
Comment 53 Mike Beltzner [:beltzner, not reading bugmail] 2010-03-03 22:00:54 PST
Roc: any chance of a 1.9.0.19 patch?
Comment 54 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2010-03-04 12:48:48 PST
I'm pretty sure the 1.9.1 patch will apply to 1.9.0 as-is.
Comment 55 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2010-03-04 13:50:47 PST
It does apply cleanly.
Comment 56 Mike Beltzner [:beltzner, not reading bugmail] 2010-03-05 13:37:01 PST
Comment on attachment 423598 [details] [diff] [review]
1.9.1 patch

a=beltzner for 1.9.0.19
Comment 58 Markus Popp 2014-07-21 10:39:21 PDT
This crash (despite said to be fixed for a very long time) is referenced in this crash report of July 16, 2014:

https://crash-stats.mozilla.com/report/index/9b59be92-91ee-49f3-9f30-d26a62140716

Note You need to log in before you can comment on or make changes to this bug.