Last Comment Bug 710922 - Firefox Crash [@ JSC::ExecutableAllocator::sizeOfCode ]
: Firefox Crash [@ JSC::ExecutableAllocator::sizeOfCode ]
Status: RESOLVED FIXED
[native-crash]
: crash, regression, reproducible, topcrash
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: 11 Branch
: All All
: -- critical (vote)
: mozilla15
Assigned To: Nicholas Nethercote [:njn] (on vacation until July 11)
: Simona B [:simonab]
Mentors:
http://arborjs.org/
: 720193 739640 (view as bug list)
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2011-12-14 16:16 PST by Marcia Knous [:marcia - use ni]
Modified: 2012-08-23 01:16 PDT (History)
12 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
verified
verified
+


Attachments
patch (1.95 KB, patch)
2012-05-31 18:19 PDT, Nicholas Nethercote [:njn] (on vacation until July 11)
luke: review+
akeybl: approval‑mozilla‑beta+
akeybl: approval‑mozilla‑esr10-
Details | Diff | Review

Description Marcia Knous [:marcia - use ni] 2011-12-14 16:16:35 PST
Seen while running the latest trunk build Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:11.0a1) Gecko/20111214 Firefox/11.0a1.

STR:

1. Load the site in the url field.
2. Wait a few seconds.
3. Crash.

https://crash-stats.mozilla.com/report/index/bp-7a764f39-8153-4ff8-af69-479702111214

Frame 	Module 	Signature [Expand] 	Source
0 	XUL 	JSC::ExecutableAllocator::sizeOfCode 	
1 	XUL 	JSThread::sizeOfIncludingThis 	js/src/jscntxt.cpp:161
2 	XUL 	mozilla::xpconnect::memory::CollectCompartmentStatsForRuntime 	js/xpconnect/src/XPCJSRuntime.cpp:1592
3 	XUL 	CollectRuntimeStatsRunnable::WorkerRun 	dom/workers/WorkerPrivate.cpp:1430
4 	XUL 	mozilla::dom::workers::WorkerRunnable::Run 	dom/workers/WorkerPrivate.cpp:1612
5 	XUL 	mozilla::dom::workers::WorkerPrivate::DoRunLoop 	dom/workers/WorkerPrivate.cpp:2442
6 	XUL 	WorkerThreadRunnable::Run 	dom/workers/RuntimeService.cpp:310
7 	XUL 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:625
8 	XUL 	NS_ProcessNextEvent_P 	obj-firefox/x86_64/xpcom/build/nsThreadUtils.cpp:245
9 	XUL 	nsThread::ThreadFunc 	xpcom/threads/nsThread.cpp:273
10 	libnspr4.dylib 	_pt_root 	nsprpub/pr/src/pthreads/ptthread.c:187
11 	libsystem_c.dylib 	libsystem_c.dylib@0x4e8be 	
12 	libsystem_c.dylib 	libsystem_c.dylib@0x51b74 	
13 	libnspr4.dylib 	PR_JoinThread 	nsprpub/pr/src/pthreads/ptthread.c:577
Comment 1 Marcia Knous [:marcia - use ni] 2011-12-14 16:23:09 PST
https://crash-stats.mozilla.com/report/list?signature=JSC::ExecutableAllocator::sizeOfCode shows Mac and Linux crashes, but there appears to be a similar Windows signature as well: 
[@ JSC::ExecutableAllocator::sizeOfCode(unsigned int*, unsigned int*, unsigned int*) ] 

The first Mac crash shows up using 2011120800, while the first Windows crash shows up using 2011120700, so I guess that will help us narrow down the regression range.
Comment 2 Marcia Knous [:marcia - use ni] 2011-12-14 16:57:01 PST
Possible regression range based on crash stats: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=fafaf614791f&tochange=658fad825c36
Comment 3 Scoobidiver (away) 2012-01-21 13:24:36 PST
I can't reproduce with the STR in comment 0 in 12.0a1/20120121 and 11.0a2/20120121.

Anyway, there are still crashes from time to time.
It also happens in FennecAndroid/12.0a1.
Comment 4 Scoobidiver (away) 2012-01-22 01:59:18 PST
I remove the native-crash keyword as it seems related to crashes exclusively on mobile. See http://bit.ly/vO7SxB that excludes top crashers with this keyword.
Comment 5 Scoobidiver (away) 2012-01-22 07:51:08 PST
*** Bug 720193 has been marked as a duplicate of this bug. ***
Comment 6 Scoobidiver (away) 2012-02-10 02:23:45 PST
Can someone clarify the native-crash keyword meaning when applied to a platform that includes ARM?
For top crashers, it means "only in native". See https://wiki.mozilla.org/Platform/2012-01-31#Desktop and https://wiki.mozilla.org/Platform/2012-01-31#Mobile_2
For non top crashers, it means "also in native".
Comment 7 Dirkjan Ochtman (:djc) 2012-03-12 02:25:48 PDT
I see this on Windows 7, with Aurora 12.0a2 from 2012-03-11, from the same URL.
Comment 8 Marcia Knous [:marcia - use ni] 2012-03-12 11:16:19 PDT
I was able to reproduce this using Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:13.0) Gecko/20120312 Firefox/13.0a1. I opened the site in the URL, interacted with it, and then when I closed the tab by clicking on the "x" I crashed: https://crash-stats.mozilla.com/report/index/bp-0e2762c4-1a80-4b27-811b-c773f2120312
Comment 9 Nicholas Nethercote [:njn] (on vacation until July 11) 2012-05-31 18:19:51 PDT
Created attachment 629019 [details] [diff] [review]
patch

I've worked this out.  Steps to reproduce reliably:

- Open arborjs.org in one tab.

- Open about:memory in another tab.

- Crash.

Sometimes I get the sizeOfCode() crash, sometimes I get a crash in pthread_cond_wait.  I'm pretty sure it's two different manifestations of the same defect.

It happens reliably in Aurora (FF14) and Beta (FF13).  It doesn't reproduce in FF15, but I think that's just luck, the code in question hasn't changed.

The problem is simple:  in ExecutableAllocator::sizeOfCode() we call m_pools.all() without first checking if m_pools.initialized().  The patch is trivial, and I added one JS_ASSERT(m_pools.initialized()) assertion elsewhere to protect against similar problems.  (In that place where I added the assertion, m_pools should always be initialized.)

With the patch applied I can't reproduce the crash on Aurora or Beta.

I've filed bug 760337 to add more assertions to HashTable.h so that problems of this nature will assert in debug builds in the future.

BTW, if you look at "hg blame" my fingerprints are all over sizeOfCode().  If you see a crash in code like that, please CC me on the bug!
Comment 10 Nicholas Nethercote [:njn] (on vacation until July 11) 2012-05-31 19:32:43 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/31bc7fbde2c0
Comment 11 Nicholas Nethercote [:njn] (on vacation until July 11) 2012-05-31 19:38:47 PDT
Comment on attachment 629019 [details] [diff] [review]
patch


[Approval Request Comment]
This bug is a Fennec top crasher and a Fennec 1.0 soft blocker (see bug 739640).  The same code is used on desktop and it goes at least as far back as 10, so I think it's worth taking for Aurora, Beta and ESR.

User impact if declined: Null-deref crashes, quite frequent on some platforms (i.e. Fennec beta).

Fix Landed on Version: Only mozilla-inbound so far.

Risk to taking this patch (and alternatives if risky):  Negligible;  patch is trivial.

String or UUID changes made by this patch:  None.
Comment 12 Nicholas Nethercote [:njn] (on vacation until July 11) 2012-05-31 19:42:35 PDT
> User impact if declined: Null-deref crashes, quite frequent on some
> platforms (i.e. Fennec beta).

Just to clarify:  if telemetry is enabled, these crashes can occur every time the telemetry ping occurs, which is once per minute.  If telemetry is disabled, these crashes can only occur when about:memory is viewed (or possibly if an extension that reads the memory reporters, such as about:nosy, is installed).
Comment 13 JP Rosevear [:jpr] 2012-05-31 20:31:54 PDT
*** Bug 739640 has been marked as a duplicate of this bug. ***
Comment 14 JP Rosevear [:jpr] 2012-05-31 20:37:09 PDT
Nominating since a soft was duped to this.
Comment 15 Scoobidiver (away) 2012-06-01 01:25:36 PDT
The topcrash keyword is only for FennecAndroid.
Comment 16 Ed Morley [:emorley] 2012-06-01 08:15:09 PDT
https://hg.mozilla.org/mozilla-central/rev/31bc7fbde2c0
Comment 17 Ed Morley [:emorley] 2012-06-01 08:15:37 PDT
Sorry, with asking about the group change, forgot the whiteboard annotation.
Comment 18 Alex Keybl [:akeybl] 2012-06-01 14:19:27 PDT
Comment on attachment 629019 [details] [diff] [review]
patch

[Triage Comment]
Not a top crasher on FF13 or ESR10. Leaving nom'd for Aurora, since it does affect Fennec Native.
Comment 19 Joe Drew (not getting mail) 2012-06-04 13:52:10 PDT
~200 crashes on beta 3 = release blocker.
Comment 20 Alex Keybl [:akeybl] 2012-06-04 13:52:22 PDT
Comment on attachment 629019 [details] [diff] [review]
patch

[Triage Comment]
Approved for Beta 14, however. We are now post merge.
Comment 21 Nicholas Nethercote [:njn] (on vacation until July 11) 2012-06-04 16:14:10 PDT
https://hg.mozilla.org/releases/mozilla-beta/rev/48d6aeca20a3

This landed on mozilla-central (which was FF15 at the time) a few days ago, and has now landed on mozilla-beta (which is now FF14).  I think that's all I need to do, please let me know if that's wrong.

(Hmm, should the target milestone be changed from mozilla15 to mozilla14?)
Comment 22 Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2012-06-04 16:29:13 PDT
I think the target milestone is fine (I've seen different people say different things about whether it should track initial landing versus followup), but you should set the "affected" flags as appropriate.
Comment 23 Simona B [:simonab] 2012-06-28 02:54:03 PDT
Mozilla/5.0 (Windows NT 6.1; rv:14.0) Gecko/20100101 Firefox/14.0

When trying to reproduce the original crash I got a different signature then the ones related to this bug:
https://crash-stats.mozilla.com/report/index/bp-e1dee497-88ad-4b8a-80d8-139822120628

I also found a crash in Socorro that happened after the patch for mozilla-beta landed:
https://crash-stats.mozilla.com/report/index/3b01e2d7-6bf7-49ca-913a-a055a2120618

But I could not reproduce the crash on the latest Nightly (Firefox 16.0a1) nor the latest beta (Firefox 14 beta 9).

Is this enough to set this as Verified on Firefox 14?
Comment 25 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2012-08-17 09:49:11 PDT
(In reply to Simona B [QA] from comment #24)
> Please advise, are this crashes related with this bug?

They certainly seem to be. The most recent version in those reports is Firefox 15.0b4 which would certainly indicate this might not be fixed, or that there's a different code-path triggering this signature. Can someone from Engineering or Crashkill look into this?
Comment 26 Nicholas Nethercote [:njn] (on vacation until July 11) 2012-08-19 17:53:49 PDT
The original problem was an obvious defect, easily triggered, and had a simple fix.  Can we please open a new bug for the remaining crashes?
Comment 27 Simona B [:simonab] 2012-08-23 01:16:40 PDT
Filed Bug 785009. 

Based on the above comments, setting tracking flags for Firefox 14 and 15 to verified.

Thank you Nicholas for looking over these crashes.

Note You need to log in before you can comment on or make changes to this bug.