Closed Bug 1044254 Opened 10 years ago Closed 10 years ago

Crash in GC while stability testing

Categories

(Core :: JavaScript: GC, defect)

ARM
Gonk (Firefox OS)
defect
Not set
critical

Tracking

()

RESOLVED WORKSFORME
blocking-b2g 2.0+

People

(Reporter: ggrisco, Assigned: terrence)

References

(Blocks 1 open bug)

Details

(Keywords: crash, Whiteboard: [caf-crash 314][caf priority: p2][CR 699984][b2g-crash])

Crash Data

Attachments

(2 files)

[Blocking Requested - why for this release]:

Found while running stability tests overnight in areas of SMS, BT, Wifi, camera, and video.

[@ js::MarkAtoms(JSTracer*) | js::gc::GCRuntime::markRuntime(JSTracer*, bool) | js::gc::GCRuntime::beginMarkPhase() | js::gc::GCRuntime::incrementalCollectSlice(long long, JS::gcreason::Reason, js::JSGCInvocationKind) ]
Attached file decoded minidump
Whiteboard: [b2g-crash] → [CR 699984][b2g-crash]
Whiteboard: [CR 699984][b2g-crash] → [caf priority: p1][CR 699984][b2g-crash]
Whiteboard: [caf priority: p1][CR 699984][b2g-crash] → [caf-crash 314][caf priority: p1][CR 699984][b2g-crash]
Observed on: 

Device: msm8610
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.040
Moz BuildID: 20140716000201
B2G Version: 2.0
Gecko Version: 32.0a2
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=5f8b1b8a2da9e3b531eee817a669f57fa4d9b9c6
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=e00f7e464333689fcf54edb4945ece94f97f930b
blocking-b2g: 2.0? → 2.0+
NI on :naveed to help with an assignee to get started with investigation here.
Flags: needinfo?(nihsanullah)
Is the EXTRA file STR? If so, how do we make use of them?
Blocks: GC.stability
(In reply to Terrence Cole [:terrence] from comment #5)
> Is the EXTRA file STR? If so, how do we make use of them?

Those are just logs from the time of the crash. Unfortunately, since these crashes appear in a stability test environment, I don't think CAF can provide us with specific STR. We'll have to debug from the stack trace and minidumps, or add a debugging patch to get more information that may help here.
The crashing pointer is <1MiB (but not small). This indicates that the memory is neither jemalloc, nor GC. Moreover, since it is an atom, it cannot be an ExternalString pointing to a C literal. Rather, it has to be random heap corruption that the GC is tripping over.

Since the underlying cause of the crash is buggy code that trashed the heap long before the GC occurred, this is, unfortunately, completely unactionable. Our best option from here is to run the stability tests under Valgrind and ThreadSanitizer and hope that the corruption occurs in a way that one of those tools can detect. We're actively working on our own tools to help make these sorts of errors more actionable, but there is nothing we can do here in the 2.0 timeframe.
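
The address-range reasoning in the comment above can be sketched as a small triage helper. This is only an illustration, not SpiderMonkey code: the names `AddrClass` and `classifyFaultAddress` are hypothetical, the 4 KiB page size is an assumption, and the 1 MiB cutoff is taken from the "<1MiB" remark. Real heap layouts vary by platform and allocator, so treat the result as a hint rather than a verdict.

```cpp
#include <cstdint>

// Hypothetical categories for a faulting address (not SpiderMonkey types).
enum class AddrClass {
    NullDeref,    // tiny address: likely a null pointer plus a field offset
    LowNonHeap,   // below 1 MiB but not tiny: unlikely to be jemalloc or GC memory
    PossiblyHeap  // high enough that it could be a real heap allocation
};

// Assumed thresholds: one 4 KiB page for null-plus-offset accesses, and the
// 1 MiB cutoff mentioned in the comment above.
constexpr std::uintptr_t kPageSize = 4096;
constexpr std::uintptr_t kLowCutoff = 1024 * 1024;

// Classify a crash address by range only; this says nothing about validity.
inline AddrClass classifyFaultAddress(std::uintptr_t addr) {
    if (addr < kPageSize) {
        return AddrClass::NullDeref;
    }
    if (addr < kLowCutoff) {
        return AddrClass::LowNonHeap;
    }
    return AddrClass::PossiblyHeap;
}
```

An address in the `LowNonHeap` range matches the situation described here: too large for a null dereference, too low for an allocator-owned region, and so most plausibly a pointer overwritten by unrelated code.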
Flags: needinfo?(nihsanullah)
Passing this to Terrence for now, as he is helping investigate this. :terrence, as mentioned offline, we can have debug patches, and Greg from CAF can also help provide more information or additional logs if needed.
Assignee: nobody → terrence
Inder, can you confirm that the graphics issue on the CAF side, which we suspected of causing this, is definitely out of the picture? Also, Terrence mentioned offline that the next best step in the investigation is to rebuild with --enable-debug in the configuration options.
Flags: needinfo?(ikumar)
Whiteboard: [caf-crash 314][caf priority: p1][CR 699984][b2g-crash] → [caf-crash 314][caf priority: p2][CR 699984][b2g-crash]
We have not seen this issue in the past week of testing. We will reopen if it shows up again in the latest reports.
Status: NEW → RESOLVED
Closed: 10 years ago
Flags: needinfo?(ikumar)
Resolution: --- → WORKSFORME