Firefox Crash [@ JSC::ExecutableAllocator::sizeOfCode ]

RESOLVED FIXED in Firefox 14

Status

()

Core
JavaScript Engine
--
critical
RESOLVED FIXED
6 years ago
5 years ago

People

(Reporter: marcia, Assigned: njn)

Tracking

(4 keywords)

11 Branch
mozilla15
crash, regression, reproducible, topcrash
Points:
---

Firefox Tracking Flags

(firefox14 verified, firefox15 verified, blocking-fennec1.0 +)

Details

(Whiteboard: [native-crash], crash signature, URL)

Attachments

(1 attachment)

(Reporter)

Description

6 years ago
Seen while running the latest trunk build Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:11.0a1) Gecko/20111214 Firefox/11.0a1.

STR:

1. Load the site in the url field.
2. Wait a few seconds.
3. Crash.

https://crash-stats.mozilla.com/report/index/bp-7a764f39-8153-4ff8-af69-479702111214

Frame 	Module 	Signature [Expand] 	Source
0 	XUL 	JSC::ExecutableAllocator::sizeOfCode 	
1 	XUL 	JSThread::sizeOfIncludingThis 	js/src/jscntxt.cpp:161
2 	XUL 	mozilla::xpconnect::memory::CollectCompartmentStatsForRuntime 	js/xpconnect/src/XPCJSRuntime.cpp:1592
3 	XUL 	CollectRuntimeStatsRunnable::WorkerRun 	dom/workers/WorkerPrivate.cpp:1430
4 	XUL 	mozilla::dom::workers::WorkerRunnable::Run 	dom/workers/WorkerPrivate.cpp:1612
5 	XUL 	mozilla::dom::workers::WorkerPrivate::DoRunLoop 	dom/workers/WorkerPrivate.cpp:2442
6 	XUL 	WorkerThreadRunnable::Run 	dom/workers/RuntimeService.cpp:310
7 	XUL 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:625
8 	XUL 	NS_ProcessNextEvent_P 	obj-firefox/x86_64/xpcom/build/nsThreadUtils.cpp:245
9 	XUL 	nsThread::ThreadFunc 	xpcom/threads/nsThread.cpp:273
10 	libnspr4.dylib 	_pt_root 	nsprpub/pr/src/pthreads/ptthread.c:187
11 	libsystem_c.dylib 	libsystem_c.dylib@0x4e8be 	
12 	libsystem_c.dylib 	libsystem_c.dylib@0x51b74 	
13 	libnspr4.dylib 	PR_JoinThread 	nsprpub/pr/src/pthreads/ptthread.c:577
(Reporter)

Comment 1

6 years ago
https://crash-stats.mozilla.com/report/list?signature=JSC::ExecutableAllocator::sizeOfCode shows Mac and Linux crashes, but there appears to be a similar Windows signature as well: 
[@ JSC::ExecutableAllocator::sizeOfCode(unsigned int*, unsigned int*, unsigned int*) ] 

The first Mac crash shows up using 2011120800, while the first Windows crash shows up using 2011120700, so I guess that will help us narrow down the regression range.
Crash Signature: [@ JSC::ExecutableAllocator::sizeOfCode ] → [@ JSC::ExecutableAllocator::sizeOfCode ] [@ JSC::ExecutableAllocator::sizeOfCode(unsigned int*, unsigned int*, unsigned int*) ]
OS: Mac OS X → All
Hardware: x86 → All
(Reporter)

Comment 2

6 years ago
Possible regression range based on crash stats: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=fafaf614791f&tochange=658fad825c36

Comment 3

5 years ago
I can't reproduce with the STR in comment 0 in 12.0a1/20120121 and 11.0a2/20120121.

Anyway, there are still crashes from time to time.
It also happens in FennecAndroid/12.0a1.
Keywords: regression
Whiteboard: [native-crash]
Version: 9 Branch → 11 Branch

Comment 4

5 years ago
I remove the native-crash keyword as it seems related to crashes exclusively on mobile. See http://bit.ly/vO7SxB that excludes top crashers with this keyword.
Whiteboard: [native-crash]

Updated

5 years ago
Duplicate of this bug: 720193

Comment 6

5 years ago
Can someone clarify the native-crash keyword meaning when applied to a platform that includes ARM?
For top crashers, it means "only in native". See https://wiki.mozilla.org/Platform/2012-01-31#Desktop and https://wiki.mozilla.org/Platform/2012-01-31#Mobile_2
For non top crashers, it means "also in native".
I see this on Windows 7, with Aurora 12.0a2 from 2012-03-11, from the same URL.
(Reporter)

Comment 8

5 years ago
I was able to reproduce this using Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:13.0) Gecko/20120312 Firefox/13.0a1. I opened the site in the URL, interacted with it, and then when I closed the tab by clicking on the "x" I crashed: https://crash-stats.mozilla.com/report/index/bp-0e2762c4-1a80-4b27-811b-c773f2120312
(Assignee)

Comment 9

5 years ago
Created attachment 629019 [details] [diff] [review]
patch

I've worked this out.  Steps to reproduce reliably:

- Open arborjs.org in one tab.

- Open about:memory in another tab.

- Crash.

Sometimes I get the sizeOfCode() crash, sometimes I get a crash in pthread_cond_wait.  I'm pretty sure it's two different manifestations of the same defect.

It happens reliably in Aurora (FF14) and Beta (FF13).  It doesn't reproduce in FF15, but I think that's just luck, the code in question hasn't changed.

The problem is simple:  in ExecutableAllocator::sizeOfCode() we call m_pools.all() without first checking if m_pools.initialized().  The patch is trivial, and I added one JS_ASSERT(m_pools.initialized()) assertion elsewhere to protect against similar problems.  (In that place where I added the assertion, m_pools should always be initialized.)

With the patch applied I can't reproduce the crash on Aurora or Beta.

I've filed bug 760337 to add more assertions to HashTable.h so that problems of this nature will assert in debug builds in the future.

BTW, if you look at "hg blame" my fingerprints are all over sizeOfCode().  If you see a crash in code like that, please CC me on the bug!
Assignee: nobody → n.nethercote
Status: NEW → ASSIGNED
Attachment #629019 - Flags: review?(luke)

Updated

5 years ago
Attachment #629019 - Flags: review?(luke) → review+
(Assignee)

Comment 10

5 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/31bc7fbde2c0
Whiteboard: [leave open]
(Assignee)

Comment 11

5 years ago
Comment on attachment 629019 [details] [diff] [review]
patch


[Approval Request Comment]
This bug is a Fennec top crasher and a Fennec 1.0 soft blocker (see bug 739640).  The same code is used on desktop and it goes at least as far back as 10, so I think it's worth taking for Aurora, Beta and ESR.

User impact if declined: Null-deref crashes, quite frequent on some platforms (i.e. Fennec beta).

Fix Landed on Version: Only mozilla-inbound so far.

Risk to taking this patch (and alternatives if risky):  Negligible;  patch is trivial.

String or UUID changes made by this patch:  None.
Attachment #629019 - Flags: approval-mozilla-esr10?
Attachment #629019 - Flags: approval-mozilla-beta?
Attachment #629019 - Flags: approval-mozilla-aurora?
(Assignee)

Comment 12

5 years ago
> User impact if declined: Null-deref crashes, quite frequent on some
> platforms (i.e. Fennec beta).

Just to clarify:  if telemetry is enabled, these crashes can occur every time the telemetry ping occurs, which is once per minute.  If telemetry is disabled, these crashes can only occur when about:memory is viewed (or possibly if an extension that reads the memory reporters, such as about:nosy, is installed).

Updated

5 years ago
Duplicate of this bug: 739640

Comment 14

5 years ago
Nominating since a soft was duped to this.
Group: mozilla-confidential
blocking-fennec1.0: --- → ?

Comment 15

5 years ago
The topcrash keyword is only for FennecAndroid.
status-firefox13: --- → affected
status-firefox14: --- → affected
Component: DOM → JavaScript Engine
Keywords: topcrash
QA Contact: general → general
Whiteboard: [leave open] → [leave open][native-crash]

Updated

5 years ago
Group: mozilla-confidential
https://hg.mozilla.org/mozilla-central/rev/31bc7fbde2c0
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla15
Sorry, with asking about the group change, forgot the whiteboard annotation.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Status: REOPENED → ASSIGNED

Updated

5 years ago
status-firefox13: affected → ---
Comment on attachment 629019 [details] [diff] [review]
patch

[Triage Comment]
Not a top crasher on FF13 or ESR10. Leaving nom'd for Aurora, since it does affect Fennec Native.
Attachment #629019 - Flags: approval-mozilla-esr10?
Attachment #629019 - Flags: approval-mozilla-esr10-
Attachment #629019 - Flags: approval-mozilla-beta?
Attachment #629019 - Flags: approval-mozilla-beta-
blocking-fennec1.0: ? → soft
~200 crashes on beta 3 = release blocker.
blocking-fennec1.0: soft → +
Comment on attachment 629019 [details] [diff] [review]
patch

[Triage Comment]
Approved for Beta 14, however. We are now post merge.
Attachment #629019 - Flags: approval-mozilla-beta-
Attachment #629019 - Flags: approval-mozilla-beta+
Attachment #629019 - Flags: approval-mozilla-aurora?
(Assignee)

Comment 21

5 years ago
https://hg.mozilla.org/releases/mozilla-beta/rev/48d6aeca20a3

This landed on mozilla-central (which was FF15 at the time) a few days ago, and has now landed on mozilla-beta (which is now FF14).  I think that's all I need to do, please let me know if that's wrong.

(Hmm, should the target milestone be changed from mozilla15 to mozilla14?)
Whiteboard: [leave open][native-crash] → [native-crash]
I think the target milestone is fine (I've seen different people say different things about whether it should track initial landing versus followup), but you should set the "affected" flags as appropriate.
status-firefox14: affected → fixed
status-firefox15: --- → fixed
(Assignee)

Updated

5 years ago
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago5 years ago
Resolution: --- → FIXED
Mozilla/5.0 (Windows NT 6.1; rv:14.0) Gecko/20100101 Firefox/14.0

When trying to reproduce the original crash I got a different signature then the ones related to this bug:
https://crash-stats.mozilla.com/report/index/bp-e1dee497-88ad-4b8a-80d8-139822120628

I also found a crash in Socorro that happened after the patch for mozilla-beta landed:
https://crash-stats.mozilla.com/report/index/3b01e2d7-6bf7-49ca-913a-a055a2120618

But I could not reproduce the crash on the latest Nightly (Firefox 16.0a1) nor the latest beta (Firefox 14 beta 9).

Is this enough to set this as Verified on Firefox 14?
Keywords: verifyme
I could still not reproduce the crash on Firefox 15 beta 5 using the STR from comment 9.

But I found several crash reports on Socorro for Firefox 14.0.1 and Firefox 15 beta:

https://crash-stats.mozilla.com/report/index/b376e40f-c845-4336-a91e-fb0e82120816
https://crash-stats.mozilla.com/report/index/6480f290-bdcc-4f92-ab2b-522592120813
https://crash-stats.mozilla.com/report/index/30063a66-655b-40ba-a56e-e1cd92120801
https://crash-stats.mozilla.com/report/index/f9e6e70c-cd38-430a-adb2-b43652120814
https://crash-stats.mozilla.com/report/index/ad329ca1-c354-4103-a363-4eb392120723 

Please advise, are this crashes related with this bug?
(In reply to Simona B [QA] from comment #24)
> Please advise, are this crashes related with this bug?

They certainly seem to be. The most recent version in those reports is Firefox 15.0b4 which would certainly indicate this might not be fixed, or that there's a different code-path triggering this signature. Can someone from Engineering or Crashkill look into this?
(Assignee)

Comment 26

5 years ago
The original problem was an obvious defect, easily triggered, and had a simple fix.  Can we please open a new bug for the remaining crashes?
Filed Bug 785009. 

Based on the above comments, setting tracking flags for Firefox 14 and 15 to verified.

Thank you Nicholas for looking over these crashes.
status-firefox14: fixed → verified
status-firefox15: fixed → verified
Keywords: verifyme
QA Contact: simona.marcu
You need to log in before you can comment on or make changes to this bug.