Closed
Bug 1089706
Opened 10 years ago
Closed 10 years ago
Crash in ObserverList<mozilla::dom::FMRadioEventType>::Broadcast while stability testing
Categories
(Core :: Hardware Abstraction Layer (HAL), defect)
Tracking
()
RESOLVED
WORKSFORME
blocking-b2g | 2.1+ |
People
(Reporter: ggrisco, Assigned: dhylands)
References
Details
(Keywords: crash, Whiteboard: [caf-crash 390][caf priority: p1][CR 746192][b2g-crash])
Crash Data
Attachments
(3 files)
319.21 KB,
text/plain
|
Details | |
614.23 KB,
text/plain
|
Details | |
14.22 KB,
patch
|
Details | Diff | Splinter Review |
[Blocking Requested - why for this release]:
Reporter | ||
Comment 1•10 years ago
|
||
This crash seen once so far on 8926 device after 5 or 6 hours of continous testing. STR not available.
[@ mozilla::ObserverListmozilla::dom::FMRadioEventType::Broadcast(mozilla::dom::FMRadioEventType const&) | mozilla::system::Volume::SetState(long) | mozilla::system::Volume::HandleVoldResponse(int, nsCWhitespaceTokenizerTemplate<&(NS_IsAsciiWhitespace(char16_t))>&) | mozilla::system::VolumeManager::HandleBroadcast(int, nsCString&) ]
cafbot will post the minidump & .extra file soon.
Comment 2•10 years ago
|
||
Comment 3•10 years ago
|
||
Component: DOM: Events → Hardware Abstraction Layer (HAL)
Comment 4•10 years ago
|
||
Blocking given this will block 2.1 CS for CAF.
Starting with :mwu given the FMRadio crash signature as he may have ideas here and see how that goes and see if we any more information from CAF here will help.
blocking-b2g: 2.1? → 2.1+
Updated•10 years ago
|
Flags: needinfo?(mwu)
Comment 5•10 years ago
|
||
Given the call stack, this may actually be a crash in the volume manager code instead. Let's see if dhylands has any ideas.
Flags: needinfo?(mwu) → needinfo?(dhylands)
Updated•10 years ago
|
Whiteboard: [b2g-crash] → [CR 746192][b2g-crash]
Updated•10 years ago
|
Whiteboard: [CR 746192][b2g-crash] → [caf priority: p1][CR 746192][b2g-crash]
Updated•10 years ago
|
Whiteboard: [caf priority: p1][CR 746192][b2g-crash] → [caf-crash 390][caf priority: p1][CR 746192][b2g-crash]
Comment 6•10 years ago
|
||
Observed on:
Device: msm8226
Gonk Version: AU_LINUX_GECKO_B2G_KK_2.0.01.04.00.114.115
Moz BuildID: 20141020001201
B2G Version: 2.1
Gecko Version: 34.0
Gaia: http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=2904ab80816896f569e2d73958427fb82aebaea5
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=cd9a5eb4d7cd43eb1b3c17e298e124823cdae2af
Patches: bug 1070431, bug 1083449
Assignee | ||
Comment 7•10 years ago
|
||
I think that the FMRadio thing is a red herring.
I also think that this is probably not a VolumeManager bug.
The stack indicates that the VolumeManager received an event from vold (Volume::HandleVoldResponse). The volume code then updates the state to match what it got from vold.
Part of SetState involves broadcasting that state change to any registered observers. The actual crash appears to have occurred here: http://dxr.mozilla.org/mozilla-central/source/xpcom/glue/Observer.h?from=Observer.h#72
which basically tells me that somebody probably trampled memory. vold status messages are very infrequent (only when a volume changes state). Did your tests involve USB Mass Storage?
I don't suppose you happen to have the logcat for this?
Flags: needinfo?(dhylands)
Reporter | ||
Comment 8•10 years ago
|
||
(In reply to Dave Hylands [:dhylands] from comment #7)
> I think that the FMRadio thing is a red herring.
>
> I also think that this is probably not a VolumeManager bug.
>
> The stack indicates that the VolumeManager received an event from vold
> (Volume::HandleVoldResponse). The volume code then updates the state to
> match what it got from vold.
>
> Part of SetState involves broadcasting that state change to any registered
> observers. The actual crash appears to have occurred here:
> http://dxr.mozilla.org/mozilla-central/source/xpcom/glue/Observer.
> h?from=Observer.h#72
>
> which basically tells me that somebody probably trampled memory. vold status
> messages are very infrequent (only when a volume changes state). Did your
> tests involve USB Mass Storage?
>
> I don't suppose you happen to have the logcat for this?
Hi Dave, the tests did indeed include USB Mass Storage (I think just enabling/disabling). Logcat around the time of crash can be found in the .extra file (already attached).
Flags: needinfo?(ggrisco)
Comment 9•10 years ago
|
||
Thanks Greg! Back to :dhylands for help (sry for the multiple poke's but this is impacting CAF stability and we need to have these resolved by 10/30)
Flags: needinfo?(dhylands)
Assignee | ||
Comment 10•10 years ago
|
||
The log appears to be incomplete (or doesn't match up with the dump).
The crash happens here:
http://dxr.mozilla.org/mozilla-central/source/dom/system/gonk/Volume.cpp?from=Volume.cpp#256
which means that one of these logs:
http://dxr.mozilla.org/mozilla-central/source/dom/system/gonk/Volume.cpp?from=Volume.cpp#197-208
should have been the last log recorded for the I/O Thread (thread 206/673 in the logcat)
There is one of those logs near the end, but there are a bunch of other logs from the same thread after this, so obviously this isn't the correct one.
So there is some log missing and we don't know how much.
Given the frequency of events, it looks like this is some type of instrumented sdcard insertion/removal test (For example - the MTP cycles through enable/disable 8 times in just over a second).
Flags: needinfo?(dhylands)
Assignee | ||
Updated•10 years ago
|
Flags: needinfo?(ggrisco)
Assignee | ||
Comment 11•10 years ago
|
||
ni? ggrisco to see if the missing portion of the log might be available someplace.
Reporter | ||
Comment 12•10 years ago
|
||
(In reply to Dave Hylands [:dhylands] from comment #11)
> ni? ggrisco to see if the missing portion of the log might be available
> someplace.
Hi Dave, it seems like what you're looking for is:
10-27 17:08:14.813 206 673 I VolumeManager: Volume sdcard (1): changing state from Idle to Pending (4 observers)
that comes near the end of the normal log. I forgot to mention that "--------- log /dev/log/radio" signals the start of the radio log which you can ignore in this case.
Flags: needinfo?(ggrisco)
Updated•10 years ago
|
Assignee: nobody → dhylands
Assignee | ||
Comment 13•10 years ago
|
||
Greg,
The log shows:
> 10-27 17:08:14.813 206 673 I VolumeManager: Volume sdcard (1): changing state from Idle to Pending (4 observers)
> 10-27 17:08:14.813 206 673 I AutoMounter: Stopping MtpServer
> 10-27 17:08:14.813 206 673 I MozMtp : ~MozMtpStorage: Storage destructed for Volume sdcard mStorageID 0x00010001
> 10-27 17:08:14.813 206 673 I AutoMounter: SetUsbFunction(mass_storage) sys.usb.config to 'diag,serial_smd,serial_tty,rmnet_bam,mass_storage,adb'
> 10-27 17:08:14.813 206 4203 I MozMtp : sessionEnded:
> 10-27 17:08:14.813 206 4203 I MozMtp : Run: MozMtpServer finished
> 10-27 17:08:14.813 206 673 I AutoMounter: AutoMounter state changed from MTP_STARTED to UMS_CONFIGURING
> 10-27 17:08:14.813 206 673 I AutoMounter: UpdateState: ums:A1C0E0 mtp:A1C1E0 mode:0 usb:1 tryToShare:0 state:UMS_CONFIGURING
> 10-27 17:08:14.813 206 673 I AutoMounter: UpdateState: Volume sdcard is Pending and missing
> 10-27 17:08:14.813 206 673 I AutoMounterSetting: Changing status from 'Enabled' to 'Disabled'
> 10-27 17:08:14.813 206 673 I AutoMounter: UpdateState: ums:A1C0E0 mtp:A1C0E0 mode:0 usb:1 tryToShare:0 state:UMS_CONFIGURING
> 10-27 17:08:14.813 206 673 I AutoMounter: UpdateState: Volume sdcard is Pending and missing
Since there are additional logs from the same thread (673) AFTER the log you pointed to, that can't be the one just before the crash.
Ergo - the log just before the crash is missing.
Assignee | ||
Updated•10 years ago
|
Flags: needinfo?(ggrisco)
Assignee | ||
Comment 14•10 years ago
|
||
Here's a logging patch that may give some more insight, if we can get a complete log.
I did discover a redundant call to Volume::RegisterObservers, while auditing the code, but I don't think that will cause the problem.
(In reply to bhavana bajaj [:bajaj] from comment #15)
> NI Inder, so CAF can try this out.
We asked our test team to reproduce this issue with patch from Comment 14.
(In reply to Tapas[:tkundu on #b2g/gaia/memshrink/gfx] (always NI me) from comment #16)
> (In reply to bhavana bajaj [:bajaj] from comment #15)
> > NI Inder, so CAF can try this out.
>
> We asked our test team to reproduce this issue with patch from Comment 14.
We are still waiting for feedback from our test team and we need more testing hours to reproduce this issue. I will update here asap. Sorry for delay :(
Flags: needinfo?(tkundu)
Updated•10 years ago
|
Flags: needinfo?(tkundu)
(In reply to Dave Hylands [:dhylands] from comment #14)
> Created attachment 8513929 [details] [diff] [review]
> A logging patch
>
> Here's a logging patch that may give some more insight, if we can get a
> complete log.
>
> I did discover a redundant call to Volume::RegisterObservers, while auditing
> the code, but I don't think that will cause the problem.
We are not seeing this issue with your logging patch. You can go ahead and close this bug. We will re-open it when we see it again
Flags: needinfo?(tkundu) → needinfo?(dhylands)
Reporter | ||
Comment 19•10 years ago
|
||
We haven't been able to reproduce this, and thus can't collect logs. I'm removing my ni? for now.
Flags: needinfo?(ggrisco)
Assignee | ||
Comment 20•10 years ago
|
||
There was one minor change made besides just the logs, but I can't fathom how that would contribute to a crash.
As an FYI: I made the non-logging change permanent over in bug 1091174 which has landed on b2g-inbound today.
Status: NEW → RESOLVED
Closed: 10 years ago
Flags: needinfo?(dhylands)
Resolution: --- → WORKSFORME
You need to log in
before you can comment on or make changes to this bug.
Description
•