Closed
Bug 1150916
Opened 10 years ago
Closed 10 years ago
[Crash][@ nsQueryInterface::operator() | nsCOMPtr_base::assign_from_qi]
Categories
(Firefox OS Graveyard :: Stability, defect)
Tracking
(blocking-b2g:2.2+, firefox38 wontfix, firefox39 wontfix, firefox40 fixed, b2g-v2.2 fixed, b2g-master fixed)
People
(Reporter: ntroast, Assigned: mccr8)
References
Details
(Keywords: crash, Whiteboard: [b2g-crash][caf-crash 598][caf priority: p1][CR 817665])
Attachments
(11 files, 1 obsolete file)
141.91 KB,
text/plain
|
Details | |
225.54 KB,
text/plain
|
Details | |
141.82 KB,
text/plain
|
Details | |
225.54 KB,
text/plain
|
Details | |
141.84 KB,
text/plain
|
Details | |
225.54 KB,
text/plain
|
Details | |
141.86 KB,
text/plain
|
Details | |
225.54 KB,
text/plain
|
Details | |
141.86 KB,
text/plain
|
Details | |
225.54 KB,
text/plain
|
Details | |
3.86 KB,
patch
|
froydnj
:
review+
kkuo
:
approval-mozilla-b2g37+
|
Details | Diff | Splinter Review |
We observed the following crash signature during testing. [@ nsQueryInterface::operator() | nsCOMPtr_base::assign_from_qi | xpc_TryUnmarkWrappedGrayObject | nsObserverList::UnmarkGrayStrongObservers ] Cafbot will upload the decoded minidump and extra file.
Comment 1•10 years ago
|
||
Comment 2•10 years ago
|
||
Comment 3•10 years ago
|
||
NI Birunthan/Froyd, to start with as they seem to have touched this area of code or can help direct to right people. Please note, this is a critical FXOS reqression impacting our critical release milestone and any help is appreciated.
Flags: needinfo?(nfroyd)
Flags: needinfo?(birunthan)
Comment 4•10 years ago
|
||
Uh. Looks like we have a bogus nsIObserver pointer while we're notifying observers during a CC while we're minimizing memory? ni?'ing mccr8 to see if the CC stack looks like something recognizable.
Flags: needinfo?(nfroyd) → needinfo?(continuation)
Assignee | ||
Comment 5•10 years ago
|
||
I haven't seen anything like this in particular before. It looks like we're QIing something and it crashes in there? The crash address, 0x1000000, is odd. Could the nsIObserver pointer be null? I'd hope we'd be safe from totally bogus pointers, but maybe not. poiru just did style clean up for this file.
Flags: needinfo?(continuation)
Flags: needinfo?(birunthan)
Reporter | ||
Comment 6•10 years ago
|
||
Looks like all the ni? got cleared here. Bhavana, based on the information provided so far is there anyone else we can loop in here?
Flags: needinfo?(bbajaj)
Assignee | ||
Comment 7•10 years ago
|
||
Nathan and I are looking into it.
Assignee: nobody → continuation
Flags: needinfo?(bbajaj)
Assignee | ||
Comment 8•10 years ago
|
||
Is this a crash that happened a single time, or one that has come up more than once?
Reporter | ||
Comment 9•10 years ago
|
||
This crash came up 4 times during the test run.
Comment 10•10 years ago
|
||
Exact same crash report (registers, crash address, etc.) each time?
Reporter | ||
Comment 11•10 years ago
|
||
I can have cafbot upload all the decoded minidumps.
Comment 12•10 years ago
|
||
Comment 13•10 years ago
|
||
Comment 14•10 years ago
|
||
Comment 15•10 years ago
|
||
Comment 16•10 years ago
|
||
Comment 17•10 years ago
|
||
Assignee | ||
Comment 18•10 years ago
|
||
Thanks. Yes, all of the crashes are on the same address, 0x1000000.
Updated•10 years ago
|
Whiteboard: [CR 817665]
Updated•10 years ago
|
Whiteboard: [CR 817665] → [caf priority: p1][CR 817665]
Updated•10 years ago
|
Whiteboard: [caf priority: p1][CR 817665] → [b2g-crash][caf-crash 598][caf priority: p1][CR 817665]
Comment 19•10 years ago
|
||
Observed on: Device: msm8909 Gonk Version: AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.119 Moz BuildID: 20150330002503 Manifest: https://www.codeaurora.org/cgit/quic/lf/b2g/manifest/tree/caf_AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.119.xml?h=release Gecko Version: 37.0 Gaia: http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=473cd63f53c855299b719285d9b95e3f2910782f Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=e44e28d2a37258fe0c908c59f2e91e5287777b40 Patches: bug 1133398, bug 1143694, bug 1146987, bug 1145724, bug 1147646, bug 1133147, bug 1150271, bug 1142770
Comment 20•10 years ago
|
||
Comment 21•10 years ago
|
||
Assignee | ||
Comment 22•10 years ago
|
||
I've been reading through this code but I haven't come up with anything in particular. The best idea I've had is to write some code that looks for this value we keep crashing on, and print out the particular observer message thing we're looking for if it finds it, then crashes. If we managed to hit that code, it would give us a better idea where things are going wrong, and, if it is related to the actual observer, a smaller amount of code to look at.
Comment 23•10 years ago
|
||
Do you want to provide a logging patch to identify the issue?
Flags: needinfo?(continuation)
Assignee | ||
Comment 24•10 years ago
|
||
I'll work on a logging patch. Benjamin, do you have any ideas? As perhaps the person most familiar with nsObserverService.
Flags: needinfo?(benjamin)
Comment 25•10 years ago
|
||
Crash reason: SIGSEGV Crash address: 0x1000000 That's a totally suspicious crash address (it's the content of r0). Not knowing much about b2g release engineering, do we have the disassembly of nsQueryInterface::operator() so that we know what is supposed to be in r0 at the time of this crash? And do we have a core which will show us whether 0x1000000 is a reasonable or crazy address? I am automatically suspicious here also because this is inside of a hashtable enumeration: if anyone is modifying the observer hashtable from within this stack, then the enumeration is bad: nsObserverList::UnmarkGrayStrongObservers -> xpc_TryUnmarkWrappedGrayObject
Flags: needinfo?(benjamin)
Assignee | ||
Comment 26•10 years ago
|
||
(In reply to Benjamin Smedberg [:bsmedberg] from comment #25) > I am automatically suspicious here also because this is inside of a > hashtable enumeration: if anyone is modifying the observer hashtable from > within this stack, then the enumeration is bad: > > nsObserverList::UnmarkGrayStrongObservers > -> xpc_TryUnmarkWrappedGrayObject Yeah, I suppose I should do address that before I go on more of a fishing expedition. xpc_TryUnmarkWrappedGrayObject does a QI then an UnmarkGrayGCThing. The latter is a JS engine thing that can't GC according to the static analysis, so that should be okay, but a QI could do anything, including triggering a GC, so that would be bad.
Updated•10 years ago
|
blocking-b2g: 2.2? → 2.2+
Assignee | ||
Comment 27•10 years ago
|
||
I have only tested this a little bit locally, but maybe this will help, if you want to try it out. This copies out all strong observers into a single array, then does the unmarking iteration over that. I could certainly see a QI for a JS-implemented thing somehow triggering a GC and causing observer problems, though I wasn't able to trigger that locally. We'll see how the try run goes: https://treeherder.mozilla.org/#/jobs?repo=try&revision=85ff1816a341
Assignee | ||
Updated•10 years ago
|
Flags: needinfo?(continuation)
Assignee | ||
Comment 28•10 years ago
|
||
The try run looks fine. You could see if this patch fixes the crash.
Flags: needinfo?(ikumar)
Comment 29•10 years ago
|
||
Thanks. Nick/Greg - can you please land it in our build.
Flags: needinfo?(ntroast)
Flags: needinfo?(ikumar)
Flags: needinfo?(ggrisco)
Assignee | ||
Comment 30•10 years ago
|
||
I just renamed a function and removed a comment. This probably makes sense to land even if it doesn't fix this particular issue.
Attachment #8588813 -
Flags: review?(nfroyd)
Assignee | ||
Updated•10 years ago
|
Attachment #8588758 -
Attachment is obsolete: true
Comment 31•10 years ago
|
||
(In reply to Inder from comment #29) > Thanks. > Nick/Greg - can you please land it in our build. Ok, working on it.
Flags: needinfo?(ntroast)
Flags: needinfo?(ggrisco)
Comment 32•10 years ago
|
||
Comment on attachment 8588813 [details] [diff] [review] Be sure we aren't iterating over a hashtable while it is modified in nsObserverService::UnmarkGrayStrongObservers(). Review of attachment 8588813 [details] [diff] [review]: ----------------------------------------------------------------- I think making the commit comment a little more explicit, something like: "It may be the case that the QI in xpc_TryUnmarkWrappedGrayObject could cause a GC, which might cause an observer to be destroyed, which might removeObserver() itself, which might mutate the hashtable that we're attempting to enumerate." would be good. I didn't fully grok what was going on until mccr8 made it clear to me on IRC.
Attachment #8588813 -
Flags: review?(nfroyd) → review+
Assignee | ||
Comment 33•10 years ago
|
||
Updated the comment to be more explicit. Thanks for the review. https://hg.mozilla.org/integration/mozilla-inbound/rev/074e4fb089b1
Keywords: leave-open
Comment 34•10 years ago
|
||
Comment on attachment 8588813 [details] [diff] [review] Be sure we aren't iterating over a hashtable while it is modified in nsObserverService::UnmarkGrayStrongObservers(). [Triage Comment]
Attachment #8588813 -
Flags: approval-mozilla-b2g37+
Comment 37•10 years ago
|
||
Follow-up fix for non-unified bustage: https://hg.mozilla.org/integration/mozilla-inbound/rev/a4dd555969f3 https://hg.mozilla.org/releases/mozilla-b2g37_v2_2/rev/1299110a2546
Comment 39•10 years ago
|
||
Since this is landed on 2.2. Mark the status as FIXED. -- Keven
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Assignee | ||
Updated•10 years ago
|
Keywords: leave-open
Updated•10 years ago
|
status-b2g-v2.2:
--- → fixed
status-b2g-master:
--- → fixed
status-firefox38:
--- → wontfix
status-firefox39:
--- → wontfix
status-firefox40:
--- → fixed
Target Milestone: --- → 2.2 S10 (17apr)
You need to log in
before you can comment on or make changes to this bug.
Description
•