Closed Bug 529371 Opened 10 years ago Closed 10 years ago

crash [@ nsWeakFrame::Init(nsIFrame*) ] [@ nsWeakFrame::InitExternal(nsIFrame*) ] checking for extension updates

Categories

(Core :: Disability Access APIs, defect)

1.9.2 Branch
x86
Linux
defect
Not set

Tracking

()

VERIFIED FIXED
Tracking Status
status1.9.2 --- beta4-fixed
status1.9.1 --- .8-fixed

People

(Reporter: aja+bugzilla, Assigned: roc)

References

()

Details

(Keywords: fixed1.9.0.19, regression, verified1.9.2)

Attachments

(3 files)

1.9.2 nitely crashes during check for extension updates
Flags: blocking1.9.2?
Keywords: regression
Version: unspecified → 1.9.2 Branch
Blocks: 525579
Summary: [@nsWeakFrame::InitExternal(nsIFrame*) ] → crash [@nsWeakFrame::InitExternal(nsIFrame*) ] checking for extension updates
Marco, can you recreate?
Attached patch patch 1Splinter Review
David Baron, I was suspicious of UpdateFrame's use of mState.frame (which could be nulled in GetKids). Thoughts on this patch?
Assignee: nobody → bolterbugz
Status: NEW → ASSIGNED
Attachment #412976 - Flags: review?
Attachment #412976 - Flags: review?(surkov.alexander)
Attachment #412976 - Flags: review?(dbaron)
Attachment #412976 - Flags: review?
Comment on attachment 412976 [details] [diff] [review]
patch 1

The only reason you can check mState.frame as a boolean is that it has an operator nsIFrame*.  That operator does the exact same thing as GetFrame(), so you're just returning GetFrame() ? GetFrame() : nsnull, which is the same as GetFrame().
Attachment #412976 - Flags: review?(dbaron) → review-
Attachment #412976 - Flags: review?(surkov.alexander)
OK I assume mState.frame = nsnull matches [nsWeakFrame& operator=(nsIFrame* aFrame)] (and returns an nsWeakShell with a null frame)
Roc, any ideas/hunches?
http://crash-stats.mozilla.com/report/index/6457f9c6-e61a-40d6-bcbe-30fa52091117
http://crash-stats.mozilla.com/report/index/bp-97367631-cfe7-42cd-96fd-140552091117
(dbaron points out these might be the same crash given that linux central might use a different compiler)
I don't know. If it shows up on Windows, a minidump would probably be helpful.
Roc, on windows seems to appear as bug 529442.
Assignee: bolterbugz → nobody
Linux version of tryserver build mentioned here
https://bugzilla.mozilla.org/show_bug.cgi?id=529442#c30
still crashes.
the 

the nsWeakFrame::InitExternal has only the one crash report but nsWeakFrame::Init  is higher volume.  it was in single digits for aug, sept, oct, but began growing in nov. and spiked yesterday.

[single digits before 10/29 and release of 3.6b1]
11   total crashes for nsWeakFrame::Init on 20091029-crashdata.csv
15   total crashes for nsWeakFrame::Init on 20091101-crashdata.csv
13   total crashes for nsWeakFrame::Init on 20091102-crashdata.csv
16   total crashes for nsWeakFrame::Init on 20091103-crashdata.csv
13   total crashes for nsWeakFrame::Init on 20091104-crashdata.csv
14   total crashes for nsWeakFrame::Init on 20091105-crashdata.csv
23   total crashes for nsWeakFrame::Init on 20091106-crashdata.csv
15   total crashes for nsWeakFrame::Init on 20091107-crashdata.csv
19   total crashes for nsWeakFrame::Init on 20091108-crashdata.csv
18   total crashes for nsWeakFrame::Init on 20091109-crashdata.csv
27   total crashes for nsWeakFrame::Init on 20091110-crashdata.csv
26   total crashes for nsWeakFrame::Init on 20091111-crashdata.csv
12   total crashes for nsWeakFrame::Init on 20091112-crashdata.csv
15   total crashes for nsWeakFrame::Init on 20091113-crashdata.csv
13   total crashes for nsWeakFrame::Init on 20091114-crashdata.csv
17   total crashes for nsWeakFrame::Init on 20091115-crashdata.csv
15   total crashes for nsWeakFrame::Init on 20091116-crashdata.csv
48   total crashes for nsWeakFrame::Init on 20091117-crashdata.csv

distribution of all versions where the nsWeakFrame::Init crash was found on 20091117-crashdata.csv
  22 Firefox 3.7a1pre
  11 Firefox 3.6b4pre
  10 Firefox 3.5.5
   3 Firefox 3.6b2
   1 Firefox 3.1b3
   1 Firefox 3.0.15
Summary: crash [@nsWeakFrame::InitExternal(nsIFrame*) ] checking for extension updates → crash [@ nsWeakFrame::Init(nsIFrame*) ] [@ nsWeakFrame::InitExternal(nsIFrame*) ] checking for extension updates
That's expected. The fix for bug 525579 moved a crash into there. Hopefully the fix for bug 529442 will remove it altogether.
Should we worry preventing a nsWeakFrame(aPoisonedFrame) crash (in Init*)?
Is nsWeakFrame::GetFrame guaranteed to return nsnull for a dead frame?
I wonder how curFrame->GetNextSibling() can return bad frame?
I spun off security bug 529834 as a dependency.
(In reply to comment #17)
> I wonder how curFrame->GetNextSibling() can return bad frame?

Yes, I guess it is possible that a frame doesn't know if its sibling is dead.
Possibly we'll need sibling deallocation to be atomic?
(In reply to comment #14)
> Fresh Linux crash stacks from aja:
> http://crash-stats.mozilla.com/report/index/13e5ce2e-2108-47cb-b5ad-72ce22091119
> http://crash-stats.mozilla.com/report/index/2b361537-dc10-47ad-a316-c4e862091119
another....without any of the initial extension update (for new firebug release) dialog stuff the other two had:
http://crash-stats.mozilla.com/report/index/a96616e6-43ae-4c73-a997-51f1e2091119
Bolter: who's gonna own this?
Flags: blocking1.9.2? → blocking1.9.2+
(In reply to comment #22)
> Bolter: who's gonna own this?

I'll stick my neck out for now.
Assignee: nobody → bolterbugz
OK, it looks like this really is a bad frame tree. I see that the frame tree (for the Addons window, I think) contains an nsMenuPopupFrame whose mNextSibling is pointing to a poisoned frame.
OK, this is all about nsPopupSetFrame. It keeps its child nsMenuPopupFrames in a special data structure that's not a regular frame list. When frames are added to or removed from this data structure, their mNextSibling/mPrevSibling pointers aren't updated.
Great work, Robert! Can you fix it? :)
Attached patch fixSplinter Review
Arguably accessibility shouldn't be looking at the children of the nsPopupSetFrame. But it's definitely unwise for nsPopupSetFrame to let its frame children have bogus prev/next sibling pointers.

I don't know why nsPopupSetFrame doesn't just keep its popup children in a regular nsFrameList. Something to ask Neil when he gets back.
Assignee: bolterbugz → roc
Attachment #413578 - Flags: review?(bzbarsky)
Whiteboard: [needs review]
(In reply to comment #28)
> Arguably accessibility shouldn't be looking at the children of the
> nsPopupSetFrame.

If you have a better suggestion how we can get to the same info, we're all ears!
(In reply to comment #30)
> (In reply to comment #28)
> > Arguably accessibility shouldn't be looking at the children of the
> > nsPopupSetFrame.
> 
> If you have a better suggestion how we can get to the same info, we're all
> ears!

We just need to traverse the DOM in this case like I did in bug 527466.
Tryserver build referenced at https://bugzilla.mozilla.org/show_bug.cgi?id=527466#c11 seems to fix this on m-c trunk linux.
Thanks aja, I think we want to take that patch and certainly Roc's.
Comment on attachment 413578 [details] [diff] [review]
fix

r+.

Want to call this a fix for bug 504144?
Attachment #413578 - Flags: review?(bzbarsky) → review+
Whiteboard: [needs review] → [needs landing]
Whiteboard: [needs landing] → [needs 192 landing]
Landed on 192: http://hg.mozilla.org/releases/mozilla-1.9.2/rev/490f5eb3f100

Thanks Roc. Closing as fixed 192
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Whiteboard: [needs 192 landing]
Confirmed fixed on Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.3a1pre) Gecko/20091121 Minefield/3.7a1pre (.NET CLR 3.5.30729). Also the delay I was seeing in the try-server builds before this patch landed is gone again, the Add-Ons manager comes up instantly. Leaving on "RESOLVED" until Aja confirms the fix on Linux.
Verified fixed on Windows in Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2b4pre) Gecko/20091121 Namoroka/3.6b4pre (.NET CLR 3.5.30729). Also I don't see the delay any longer that I saw with try-server builds of Namoroka after the fix of bug 529442, but before this fix got in. All is well from my end now, but I'll leave the verified1.9.2 keyword out until Aja can confirm that the problem is also fixed for him on Linux.
verified1.9.2 with linux nightly build Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2b4pre) Gecko/20091122 Namoroka/3.6b4pre ID:20091122033700
Keywords: verified1.9.2
Blocks: 504144
Is this a good fix to take on older branches?
blocking1.9.1: --- → ?
status1.9.1: --- → ?
(In reply to comment #40)
> verified1.9.2 with linux nightly build Mozilla/5.0 (X11; U; Linux i686; en-US;
> rv:1.9.2b4pre) Gecko/20091122 Namoroka/3.6b4pre ID:20091122033700

FWIW, working okay on linux 3.6b4 build1, too.
Setting to verified based on recent comments. Thanks everyone!
Status: RESOLVED → VERIFIED
blocking1.9.1: ? → .7+
Flags: wanted1.9.0.x?
Flags: blocking1.9.0.17?
Flags: wanted1.9.0.x?
Flags: wanted1.9.0.x+
Flags: blocking1.9.0.17?
Flags: blocking1.9.0.17+
(In reply to comment #28)

> I don't know why nsPopupSetFrame doesn't just keep its popup children in a
> regular nsFrameList. Something to ask Neil when he gets back.

Probably to avoid having to override all the various layout methods, and for the same reasons that positioned elements use a separate list.
I mean, keep the frames out of the main child list (i.e. GetChildList(nsnull) returns an empty list), in nsGkAtoms::popupList like now ... but instead of storing the frames in a custom linked list structure, just use an nsFrameList.
No, I see no reason for the special framelist.
Don't know why I marked this one fixed on the 1.9.1 branch, perhaps confusion with the blocking flag. The patch in the bug definitely has not been checked into the 1.9.1 branch (not sure if it's correct for the branch as-is) and crash-stats still shows crashes as nsWeakFrame::Init(nsIFrame*) in 3.5.x
Roc: please get someone to work up a 1.9.1 version of this patch (does this one apply correctly? seems simple enough to), we'd like to take this.
blocking1.9.1: .8+ → needed
Flags: blocking1.9.0.18+ → blocking1.9.0.19+
Comment on attachment 423598 [details] [diff] [review]
1.9.1 patch

Approved for 1.9.1.8, a=dveditz for release-drivers
Attachment #423598 - Flags: approval1.9.1.9? → approval1.9.1.8+
blocking1.9.1: needed → ---
Roc: any chance of a 1.9.0.19 patch?
I'm pretty sure the 1.9.1 patch will apply to 1.9.0 as-is.
It does apply cleanly.
Comment on attachment 423598 [details] [diff] [review]
1.9.1 patch

a=beltzner for 1.9.0.19
Attachment #423598 - Flags: approval1.9.0.19? → approval1.9.0.19+
Whiteboard: [needs 190 landing]
This crash (despite said to be fixed for a very long time) is referenced in this crash report of July 16, 2014:

https://crash-stats.mozilla.com/report/index/9b59be92-91ee-49f3-9f30-d26a62140716
You need to log in before you can comment on or make changes to this bug.