Last Comment Bug 527567 - Crash @ [@nsCOMPtr_base::assign_from_qi(nsQueryInterface, nsID const&) ]
: Crash @ [@nsCOMPtr_base::assign_from_qi(nsQueryInterface, nsID const&) ]
Status: RESOLVED DUPLICATE of bug 528134
[crashkill][sg:dupe 528134]
: crash
Product: Core
Classification: Components
Component: General (show other bugs)
: 1.9.2 Branch
: x86 All
: -- critical (vote)
: ---
Assigned To: Nobody; OK to take it and work on it
:
:
Mentors:
Depends on: 528134 529452 530684
Blocks: PoisonFrameCrash
  Show dependency treegraph
 
Reported: 2009-11-09 14:51 PST by Carsten Book [:Tomcat]
Modified: 2011-06-13 10:01 PDT (History)
14 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments

Description Carsten Book [:Tomcat] 2009-11-09 14:51:42 PST
Currently #40 of the TopCrash Stats 

http://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A3.6b1&query_search=signature&query_type=exact&query=&date=&range_value=1&range_unit=weeks&do_query=1&signature=nsCOMPtr_base%3A%3Aassign_from_qi%28nsQueryInterface%2C%20nsID%20const%26%29

other then "Google PageSpeed plugin crashed when analysing a page" no other steps to reproduce mentioned - investigating
Comment 1 chris hofmann 2009-11-10 08:51:08 PST
somne of these like the report with the comment about google page speed are frame poisoning bugs.


http://crash-stats.mozilla.com/report/index/91c44378-9884-4ba5-8e87-95a6c2091109

Frame  	Module  	Signature [Expand]  	Source
0 	xul.dll 	nsCOMPtr_base::assign_from_qi 	obj-firefox/xpcom/build/nsCOMPtr.cpp:96
1 	xul.dll 	nsCOMPtr<nsICSSStyleRule>::operator= 	obj-firefox/dist/include/nsCOMPtr.h:658
2 	xul.dll 	inDOMUtils::GetCSSStyleRules 	layout/inspector/src/inDOMUtils.cpp:181
3 	xul.dll 	NS_InvokeByIndex_P 	xpcom/reflect/xptcall/src/md/win32/xptcinvoke.cpp:101
4 	xul.dll 	XPCWrappedNative::CallMethod 	js/src/xpconnect/src/xpcwrappednative.cpp:2710

sort base on crash address to see more  0xfffffffff0dea7ff
Comment 2 Zack Weinberg (:zwol) 2009-11-10 09:38:22 PST
These reports are badly aggregated -- nsCOMPtr_base::assign_from_qi is called from a handful of nsCOMPtr methods which in turn are called from, like, *everywhere*.  Can we have the crash analysis peel away all frames that are nsCOMPtr or nsCOMPtr_base methods, and aggregate based on their caller?  (inDOMUtils::GetCSSStyleRules in the above example)

I didn't look at all of the crashes at f0dea7ff, but a lot of them have inDOMUtils::GetCSSStyleRules as the third stack frame.   The crash is on line 181 of this loop:

hg@1   178  nsCOMPtr<nsICSSStyleRule> cssRule;
hg@1   179  nsCOMPtr<nsIDOMCSSRule> domRule;
dbaron@31313   180  for ( ; !ruleNode->IsRoot(); ruleNode = ruleNode->GetParent()) {
dbaron@31313   181    cssRule = do_QueryInterface(ruleNode->GetRule());
hg@1   182    if (cssRule) {
hg@1   183      cssRule->GetDOMRule(getter_AddRefs(domRule));
hg@1   184      if (domRule)
hg@1   185        rules->InsertElementAt(domRule, 0);
hg@1   186    }
hg@1   187  }

Rule nodes are frame arena objects; nsICSSStyleRule objects aren't. It looks to me like a rule node was destroyed while still reachable.
Comment 3 Daniel Veditz [:dveditz] 2009-11-10 10:49:30 PST
(In reply to comment #2)
> Can we have the crash analysis peel away all frames that are
> nsCOMPtr or nsCOMPtr_base methods, and aggregate based on their caller? 
> (inDOMUtils::GetCSSStyleRules in the above example)

Yes, file a bug against Webtools:Socorro like bug 504498. Which then apparently results in a second IT bug being filed but since that one's hidden I can't tell you what details you'd need to cut out the middleman.
Comment 4 Zack Weinberg (:zwol) 2009-11-10 11:06:38 PST
I reopened bug 504498 for the crash analysis adjustment, since Lars seemed to prefer that.
Comment 5 Boris Zbarsky [:bz] (still a bit busy) 2009-11-10 11:30:26 PST
> It looks to me like a rule node was destroyed while still reachable.

There's actually an obvious way this can happen, I think.  inDOMUtils::GetRuleNodeForContent is clearly buggy: it returns the rulenode, but not the style context.  If the element is in a display:none subtree and if the destructor for the sContext local triggers rulenode gc, this can easily hand back a dead object, no?  Let's get a bug filed on that, blocking this one?
Comment 6 Boris Zbarsky [:bz] (still a bit busy) 2009-11-11 21:02:10 PST
Filed bug 528134.
Comment 7 Daniel Veditz [:dveditz] 2009-11-15 10:28:15 PST
Is this turning into a meta bug with dependents? Not sure what sg severity to assign it.
Comment 8 Boris Zbarsky [:bz] (still a bit busy) 2009-11-15 12:44:40 PST
> Is this turning into a meta bug with dependents?

It sort of has to, due to comment 2, no?  Crash in this function could be all sorts of stuff depending on what else is on the stack.
Comment 9 Marcia Knous [:marcia - use ni] 2009-11-17 16:47:41 PST
I hit this crash when exiting Firefox on Windows 7 - http://crash-stats.mozilla.com/report/index/bp-b1a37de0-e9aa-4c96-9761-2c3352091117 is my report but I have not been able to reproduce.
Comment 10 Boris Zbarsky [:bz] (still a bit busy) 2009-11-17 19:04:17 PST
Filed bug 529452 on Marcia's crash.
Comment 11 Mike Beltzner [:beltzner, not reading bugmail] 2009-11-20 11:23:11 PST
Don't block on metas, sorry; please nominate the crashes as we discover them.
Comment 12 chris hofmann 2009-11-20 17:20:24 PST
once the skip list change in  Bug 504498 goes in  we should be able to analysis of nsCOMPtr_base::assign_from_qi that have addresses with 0xfffffffff0dea7ff, right?

I marked bug 504498 as blocking. 

when that is done, we should have a specific bug for nsCOMPtr_base::assign_from_qi that have addresses with 0xfffffffff0dea7ff 

or we could use this one, any preferences?
Comment 13 Zack Weinberg (:zwol) 2009-11-20 17:58:02 PST
(In reply to comment #12)
> once the skip list change in  Bug 504498 goes in  we should be able to analysis
> of nsCOMPtr_base::assign_from_qi that have addresses with 0xfffffffff0dea7ff,
> right?

I don't understand this statement.  The requested skip list change should remove *all* instances of nsCOMPtr[_base]::<anything> from the crash statistics - instead they'll show up under the callers.
Comment 14 chris hofmann 2009-11-20 18:47:09 PST
ok, I'll try and rephrase

  in the case of comment 2  we should see the signature change from
  nsCOMPtr_base::assign_from_qi(nsQueryInterface, nsID const&) with crash address
  w/  crash address 0xfffffffff0dea7ff 

change to a signature that might look like

  nsCOMPtr<nsICSSStyleRule>::operator= 

our options are to change the title of this bug and use it to track the bug comment 2, or to file a new one, correct?

we don't have a clean way of handling changes caused by updates to the skip list.  maybe it's just better to start with new clean bugs.
Comment 15 Boris Zbarsky [:bz] (still a bit busy) 2009-11-20 18:55:51 PST
We already have a separate bug tracking the bug of comment 2.  It's blocking this bug.  See comment 6.
Comment 16 Zack Weinberg (:zwol) 2009-11-20 19:25:54 PST
I think the thing to do is file new bugs for each new signature, once we have the new signatures.  They could block this bug, but I'd be happy with closing this one and tracking them as dependencies of bug 526587 instead.
Comment 17 David Baron :dbaron: ⌚️UTC-10 2009-11-20 20:49:52 PST
With the requested skip-list change, I think the signature for http://crash-stats.mozilla.com/report/index/91c44378-9884-4ba5-8e87-95a6c2091109 should show up as:

nsCOMPtr_base::assign_from_qi | nsCOMPtr<nsICSSStyleRule>::operator= | inDOMUtils::GetCSSStyleRules

That is, things in the skip list get concatenated together until something not in the skip list is found.  (But numeric addresses are just skipped completely.)
Comment 18 Damon Sicore (:damons) 2009-11-23 15:45:18 PST
I'm changing this back to a nom for 1.9.2 because we need to get a skiplist change in asap to see if this is a regression from 3.5.  Right now, we can't tell.  Jonas to file a new bug and add this as a dependency.
Comment 19 Samuel Sidler (old account; do not CC) 2009-11-23 19:28:03 PST
(In reply to comment #18)
> Jonas to file a new bug and add this as a dependency.

(bug 530684)
Comment 20 David Baron :dbaron: ⌚️UTC-10 2009-11-24 17:13:26 PST
Now that the skiplist changes have landed, we're getting the reports split up.

The biggest chunk of Firefox3.6b3 crash reports at this signature are the ones from doGetObjectPrincipal; I think that's the way bug 521844 shows up on Mac (it shows up as doGetObjectPrincipal itself on Windows).

The second biggest chunk (and the biggest chunk on Windows) are bug 528134; those are the ones with a frame poisoning signature.
Comment 21 Mike Beltzner [:beltzner, not reading bugmail] 2009-11-26 12:49:31 PST
So does that make this a regression or not? Still doesn't feel like this specific bug is something that we can block on. Bug 528134 was resolved as a duplicate of bug 519719 which is a blocker, and bug 521844 is fixed on trunk and mozilla1.9.2, so removing the nomination again.
Comment 22 Mike Beltzner [:beltzner, not reading bugmail] 2009-11-26 13:21:28 PST
(accursed drop down fields!)
Comment 23 David Baron :dbaron: ⌚️UTC-10 2009-11-26 13:24:23 PST
I think we should just file separate bugs for each remaining problem.

Given that most of the technical discussion in this bug was about one of the problems (things about google page speed, nsICSSStyleRule, etc.), though, I'll mark it as a duplicate of that one.

*** This bug has been marked as a duplicate of bug 528134 ***

Note You need to log in before you can comment on or make changes to this bug.