Closed
Bug 1150924
Opened 10 years ago
Closed 10 years ago
[Crash] [@ jemalloc_crash | arena_dalloc | je_realloc | replace_realloc ]
Categories
(Core :: DOM: Core & HTML, defect)
Tracking
()
RESOLVED
WONTFIX
| blocking-b2g | - |
People
(Reporter: ntroast, Unassigned)
References
Details
(Keywords: crash, Whiteboard: [b2g-crash][caf-crash 575][caf priority: p1][CR 817669])
Crash Data
Attachments
(5 files)
We observed the following crash signature during testing.
[@ jemalloc_crash | arena_dalloc | je_realloc | replace_realloc ]
Cafbot will upload the decoded minidump and extra file.
Comment 1•10 years ago
Comment 2•10 years ago
Comment 3•10 years ago
Johnny & Andrew,
Can you have someone on the DOM team who is familiar with the media codebase look into this critical FxOS 2.2 crash ASAP? It is currently blocking CAF and directly contributes to their inability to reach our MTBF goal.
As much as possible we need these fixed and available for CAF testing by Sunday, April 5th.
Thanks,
Mike
Crash Signature: [@ mozilla::MediaSegmentBase<mozilla::AudioSegment, mozilla::AudioChunk>::AppendSlice ]
Component: Stability → DOM
Flags: needinfo?(overholt)
Flags: needinfo?(jst)
Product: Firefox OS → Core
| Reporter | ||
Comment 4•10 years ago
Hi Mike, I think you may be confusing this bug with another.
This is [@ jemalloc_crash | arena_dalloc | je_realloc | replace_realloc ]
Group: qualcomm-confidential
Crash Signature: [@ mozilla::MediaSegmentBase<mozilla::AudioSegment, mozilla::AudioChunk>::AppendSlice ] → [@ jemalloc_crash | arena_dalloc | je_realloc | replace_realloc ]
Flags: needinfo?(mlee)
| Reporter | ||
Updated•10 years ago
Group: qualcomm-confidential
Comment 5•10 years ago
Hi Nick,
I added that signature after looking into the decoded minidump. Although jemalloc_crash is the last signature, it looks to be code that could be entered from various paths, so I added the other signature [1], since it appears to be the latest least-generic signature.
Mike
[1] mozilla::MediaSegmentBase<mozilla::AudioSegment, mozilla::AudioChunk>::AppendSlice
Flags: needinfo?(mlee)
| Reporter | ||
Comment 6•10 years ago
Oh! Sorry about that then. I seem to be the confused one :)
Crash Signature: [@ jemalloc_crash | arena_dalloc | je_realloc | replace_realloc ] → [@ mozilla::MediaSegmentBase<mozilla::AudioSegment, mozilla::AudioChunk>::AppendSlice ]
Updated•10 years ago
Whiteboard: [CR 817669]
Updated•10 years ago
Whiteboard: [CR 817669] → [caf priority: p1][CR 817669]
Updated•10 years ago
Whiteboard: [caf priority: p1][CR 817669] → [b2g-crash][caf-crash 575][caf priority: p1][CR 817669]
Comment 7•10 years ago
Observed on:
Device: msm8909
Gonk Version: AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.119
Moz BuildID: 20150330002503
Manifest: https://www.codeaurora.org/cgit/quic/lf/b2g/manifest/tree/caf_AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.119.xml?h=release
Gecko Version: 37.0
Gaia: http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=473cd63f53c855299b719285d9b95e3f2910782f
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=e44e28d2a37258fe0c908c59f2e91e5287777b40
Patches: bug 1133398, bug 1143694, bug 1146987, bug 1145724, bug 1147646, bug 1133147, bug 1150271, bug 1142770
Comment 8•10 years ago
Comment 9•10 years ago
Comment 10•10 years ago
The crash is in MediaStream code, which I'm not familiar with - it's used mostly by webrtc and webaudio. I'm not the right person to fix this, but here's my cursory analysis as a general Gecko developer:
The particular crash is here:
nsTArray_base<nsTArrayInfallibleAllocator, nsTArray_CopyWithMemutils>::EnsureCapacity
Which means that this is an OOM. The OOM happens here:
mozilla::MediaSegmentBase<mozilla::AudioSegment, mozilla::AudioChunk>::AppendSlice
Which is basically appending a mozilla::AudioChunk to an nsTArray. AudioChunk is defined in AudioSegment.h, but it doesn't appear to be very large. It holds onto potentially-large data via pointers, so that shouldn't be an issue when appending one of these things to the nsTArray.
It's worth verifying that our memory is basically shot by the time we hit this. If that's the case, then something else is chewing up memory and this is just how we die. If not, then I don't understand how we'd be dying in infallible allocation here.
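The point about chunk size can be illustrated with a minimal sketch. The types below (`SampleBuffer`, `Chunk`, `AppendChunk`) are hypothetical stand-ins, not the real `mozilla::AudioChunk` or `nsTArray`; the only assumption carried over from the analysis above is that the element holds its large data through a pointer, so growing the array by one element needs only a small allocation.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-ins; not the real mozilla::AudioChunk or nsTArray.
struct SampleBuffer {
  std::vector<float> samples;  // the potentially large payload
};

struct Chunk {
  std::shared_ptr<SampleBuffer> buffer;  // payload is referenced, not embedded
  std::int64_t duration = 0;
  float volume = 1.0f;
};

// The chunk header stays small no matter how many samples it refers to.
static_assert(sizeof(Chunk) <= 64, "chunk header should be small");

// Appending copies only the header; the sample data is shared by pointer.
inline void AppendChunk(std::vector<Chunk>& segment, const Chunk& chunk) {
  assert(chunk.duration >= 0);
  segment.push_back(chunk);
}
```

In this model, appending a chunk that refers to a million samples still copies only a few dozen bytes, which is why a failure at this call site would suggest that memory was already exhausted (or otherwise damaged) before the append.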
Comment 11•10 years ago
Next steps would be to get people familiar with device debugging and OOM failures to have a look to see if we're OOMing and figure out if anything in particular is leaking or hogging memory.
Comment 12•10 years ago
From attachment 8588008 [details], the crash occurs because the arena about to be freed has a corrupted magic signature. I think STR are required to proceed.
Comment 13•10 years ago
Ting, this crash was produced during stability tests, which involve monkey testing for several hours, and there is no clear STR for it. If we are not able to identify the issue using the provided logs, please feel free to provide us with a debug patch with additional logging to identify the issue.
Flags: needinfo?(janus926)
Comment 14•10 years ago
(In reply to Ting-Yu Chou [:ting] from comment #12)
> From attachment 8588008 [details], the crash occurs because the arena about
> to be freed has a corrupted magic signature. I think STR are required to
> proceed.
You're right - I should have inspected more closely. This is corruption, not OOM:
https://hg.mozilla.org/mozilla-central/annotate/eeb9438975a5/memory/mozjemalloc/jemalloc.c#l4711
(In reply to Inder from comment #13)
> Ting, this crash was produced during stability tests, which involve monkey
> testing for several hours, and there is no clear STR for it. If we are not
> able to identify the issue using the provided logs, please feel free to
> provide us with a debug patch with additional logging to identify the issue.
Then this is probably going to be very hard to make progress on. Something somewhere in Gecko is corrupting memory, and that's not really a lot to go on.
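For readers unfamiliar with the check discussed in comments 12 and 14, here is a rough sketch of how a magic-signature check on free distinguishes corruption from OOM. Everything in it is an assumption for illustration: `Arena`, `kArenaMagic` (an arbitrary value), and `CheckedFree` are hypothetical names, and a real allocator would abort the process rather than return an error code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative constant; assumed for this sketch only.
constexpr std::uint32_t kArenaMagic = 0x947d3d24u;

// Hypothetical arena header: a magic word plus simple bookkeeping.
struct Arena {
  std::uint32_t magic;
  std::size_t allocated;
};

enum class FreeResult { Ok, CorruptMagic };

// Validate the arena header before touching its bookkeeping. A mismatch
// means some earlier write overwrote the header (memory corruption); it
// says nothing about how much memory is still available.
inline FreeResult CheckedFree(Arena& arena, std::size_t size) {
  if (arena.magic != kArenaMagic) {
    return FreeResult::CorruptMagic;  // a real allocator would crash here
  }
  assert(size <= arena.allocated);
  arena.allocated -= size;
  return FreeResult::Ok;
}
```

The check fires at the point of detection, not at the point of the stray write, which is why the crashing stack alone gives little indication of which code did the damage.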
Updated•10 years ago
Flags: needinfo?(overholt)
Flags: needinfo?(jst)
Flags: needinfo?
Comment 15•10 years ago
Inder, could you try disabling mozjemalloc as described in bug 1148324 comment 9 and using the allocator in bionic instead?
Flags: needinfo?(ikumar)
Comment 16•10 years ago
(In reply to Inder from comment #13)
> Ting, this crash was produced during stability tests, which involve monkey
> testing for several hours, and there is no clear STR for it. If we are not
> able to identify the issue using the provided logs, please feel free to
> provide us with a debug patch with additional logging to identify the issue.
It's really hard to debug memory corruption by adding logs. I'd like to start by reproducing this locally. Did it crash while running the monkey test or some other stability test?
BTW, attachment 8588007 [details] has the following line before the crash. Does anyone know how to trigger this behavior?
04-02 17:49:25.231 260 752 D audio_hw_primary: start_output_stream: enter: stream(0xb8a4b170)usecase(1: low-latency-playback) devices(0x2)
Flags: needinfo?
Updated•10 years ago
Flags: needinfo?(janus926)
Comment 17•10 years ago
A debug patch that prints the 8 bytes before and after the corrupted magic, to see whether it provides any clues.
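The attached patch itself is not reproduced in this thread. As a rough idea of what such logging might look like, the sketch below hex-dumps a field together with a few bytes of context on either side. `DumpAround` and its parameters are hypothetical names chosen for illustration, and a real patch would write to the device log rather than return a string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hex-dump the field at [offset, offset + fieldSize) inside buf, plus up to
// `context` bytes on each side, clamped to the buffer bounds. The field
// itself is wrapped in brackets so the surrounding bytes stand out.
inline std::string DumpAround(const std::uint8_t* buf, std::size_t len,
                              std::size_t offset, std::size_t fieldSize,
                              std::size_t context = 8) {
  assert(buf != nullptr);
  if (offset > len) offset = len;
  if (fieldSize > len - offset) fieldSize = len - offset;
  const std::size_t start = offset >= context ? offset - context : 0;
  const std::size_t fieldEnd = offset + fieldSize;
  const std::size_t end =
      (len - fieldEnd >= context) ? fieldEnd + context : len;

  std::string out;
  char hex[4];
  for (std::size_t i = start; i < end; ++i) {
    if (fieldSize > 0 && i == offset) out += '[';
    std::snprintf(hex, sizeof(hex), "%02x", static_cast<unsigned>(buf[i]));
    out += hex;
    if (fieldSize > 0 && i + 1 == fieldEnd) out += ']';
    if (i + 1 < end) out += ' ';
  }
  return out;
}
```

The neighbouring bytes can hint at the culprit: recognisable patterns such as ASCII text, audio samples, or a pointer-sized value often point toward the kind of data that overran into the header.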
Comment 18•10 years ago
Nick - please land the patch in our build.
Flags: needinfo?(ikumar) → needinfo?(ntroast)
Updated•10 years ago
blocking-b2g: 2.2? → 2.2+
| Reporter | ||
Comment 19•10 years ago
The patch was added, but it may or may not be built into the very next AU.
Flags: needinfo?(ntroast)
Comment 21•10 years ago
According to bug 1151787, this error should not appear again.
Status: NEW → RESOLVED
blocking-b2g: 2.2+ → -
Closed: 10 years ago
Resolution: --- → WONTFIX
| Assignee | ||
Updated•7 years ago
Component: DOM → DOM: Core & HTML