Last Comment Bug 601102 - Crash in exn_trace [@ js::Shape::trace(JSTracer*) ][@ js::Shape::trace ] mainly with TestPilot 1.1 and below
: Crash in exn_trace [@ js::Shape::trace(JSTracer*) ][@ js::Shape::trace ] main...
Status: RESOLVED FIXED
[sg:critical?][fx4-rc-ridealong]
: crash, regression, topcrash
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: Trunk
: All All
: -- critical (vote)
: ---
Assigned To: Luke Wagner [:luke]
:
Mentors:
Depends on:
Blocks: 653081
  Show dependency treegraph
 
Reported: 2010-10-01 05:11 PDT by Scoobidiver (away)
Modified: 2011-06-13 10:01 PDT (History)
14 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
Macaw+
.1-fixed
unaffected
unaffected


Attachments
patch (1.61 KB, patch)
2011-03-25 14:38 PDT, Luke Wagner [:luke]
gal: review+
dveditz: approval2.0+
Details | Diff | Splinter Review

Description Scoobidiver (away) 2010-10-01 05:11:17 PDT
Build : Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b7pre) Gecko/20100930 Firefox/4.0b7pre

This is a new crash signature that was introduced by b7pre/20100929 build.
It happens only under Windows XP.
It is #20 top crasher for b7pre/20100930 build.

Signature	js::Shape::trace(JSTracer*)
UUID	61d6d1b5-c6bf-4cff-be12-4c65d2100930
Time 	2010-09-30 22:43:46.779302
Uptime	30
Last Crash	36 seconds before submission
Install Age	52464 seconds (14.6 hours) since version was first installed.
Product	Firefox
Version	4.0b7pre
Build ID	20100930041305
Branch	2.0
OS	Windows NT
OS Version	5.1.2600 Service Pack 2
CPU	x86
CPU Info	GenuineIntel family 15 model 6 stepping 4
Crash Reason	EXCEPTION_ACCESS_VIOLATION_WRITE
Crash Address	0x1e
App Notes 	AdapterVendorID: 8086, AdapterDeviceID: 29c2

Frame 	Module 	Signature [Expand] 	Source
0 	mozjs.dll 	js::Shape::trace 	js/src/jsscope.cpp:1358
1 	mozjs.dll 	fun_trace 	js/src/jsfun.cpp:2075
2 	mozjs.dll 	js_TraceObject 	js/src/jsobj.cpp:6168
3 	mozjs.dll 	js::gc::MarkChildren 	js/src/jsgcinlines.h:199
4 	mozjs.dll 	js::gc::MarkObject 	js/src/jsgcinlines.h:179
5 	mozjs.dll 	fun_trace 	js/src/jsfun.cpp:2060
6 	mozjs.dll 	js_TraceObject 	js/src/jsobj.cpp:6168

The regression range is :
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=c257bfb8cad0&tochange=a60414d076b5

More reports at :
http://crash-stats.mozilla.com/report/list?product=Firefox&query_search=signature&query_type=exact&query=&range_value=4&range_unit=weeks&hang_type=any&process_type=any&plugin_field=&plugin_query_type=&plugin_query=&do_query=1&admin=&signature=js%3A%3AShape%3A%3Atrace%28JSTracer*%29
Comment 1 Scoobidiver (away) 2010-12-12 23:52:47 PST
It is #23 top crasher on 4.0b8pre for the last 3 days.
Comment 2 Scoobidiver (away) 2011-01-08 01:02:52 PST
It is #12 top crasher in 4.0b8 for the last week.
Comment 3 Scoobidiver (away) 2011-02-21 02:31:23 PST
It is #21 top crasher in 4.0b11.
Comment 4 Scoobidiver (away) 2011-03-03 06:29:13 PST
It is #11 top crasher in 4.0b12.
Comment 5 Scoobidiver (away) 2011-03-05 05:11:48 PST
Here are some comments:
"playing universal war( 3 tabs required ) and blood wars"
"i don't know. i have closed the monitor and went to sleep. this morning found the crash report window."
"Escrevendo um e-mail no gmail.com"
Comment 6 Sheila Mooney 2011-03-22 09:00:32 PDT
Interesting that this signature seems to have climbed in early FF4 returns. It has been the #2 browser crash for the past couple of days. We will watch this one and see what happens as the # of users climb.
Comment 7 Robert Kaiser 2011-03-23 13:26:39 PDT
This spiked very significantly on FF4 release day (yesterday), together with bug 615098 and bug 643746 (all three are in JS and started spiking on 4.0* the day before, so at least the spike cause might be related), which together account for 2800 crashes in a million 4.0* ADU on that day or about 7.4% of all 4.0* crashes on that day, all three are in the top ten top crashers for 4.0* for this day.
Comment 8 Robert Kaiser 2011-03-23 14:20:27 PDT
93% of these crashes have testpilot installed, while the average is at 32%, not sure if this correlation is coincidence or not.
Comment 9 Michael Kraft [:morac] 2011-03-24 13:09:51 PDT
I just got hit by this today when clicking check boxes on a survey page.

Isn't Test Pilot installed in Firefox 4 by default so wouldn't most people have Test Pilot installed after upgrading to Firefox 4?
Comment 10 Robert Kaiser 2011-03-24 13:17:02 PDT
(In reply to comment #9)
> Isn't Test Pilot installed in Firefox 4 by default so wouldn't most people have
> Test Pilot installed after upgrading to Firefox 4?

It was part of the product in betas, but not in final. So everyone that had the beta will have it installed but everyone who only got final hasn't (unless they installed it manually).
Comment 11 Michael Kraft [:morac] 2011-03-24 14:00:42 PDT
(In reply to comment #10)
> It was part of the product in betas, but not in final. So everyone that had the
> beta will have it installed but everyone who only got final hasn't (unless they
> installed it manually).

Ah okay.  Strangely enough I got another of these crashes.  That's two in one day, when I've never seen it before today.  

bp-92beee74-1e16-4f34-8c47-04aca2110324
Comment 12 Robert Kaiser 2011-03-24 14:34:31 PDT
(In reply to comment #11)
> Ah okay.  Strangely enough I got another of these crashes.  That's two in one
> day, when I've never seen it before today.  
> 
> bp-92beee74-1e16-4f34-8c47-04aca2110324

We are currently unsure what causes this, but we need to get some tracking on it. You also appear to have testpilot activated, which right now is our only (weak) clue.
Comment 13 chris hofmann 2011-03-25 10:21:47 PDT
jono, any updates or thoughts from looking at test pilot side?

dmaindelin, any ideas on how to mitigate the crash on the JS side, or help in understanding what addons might be doing?
Comment 14 David Mandelin [:dmandelin] 2011-03-25 13:40:43 PDT
Bill, Luke, and I were talking about this morning. We think it may be due to a compartment mismatch. An exception object keeps a list of argument values from the JS stack frames live at the point of the exception. The argument values may belong to different compartments from the exception, but they are not wrapped, which seems like a bug.
Comment 16 Luke Wagner [:luke] 2011-03-25 14:28:37 PDT
Woohoo, the assert hits, only when I install test pilot, and then the offending frame's script is indeed resource://testpilot/modules/lib/securable-module.js -> resource://testpilot/modules/lib/memory.js.
Comment 17 Luke Wagner [:luke] 2011-03-25 14:38:44 PDT
Created attachment 521956 [details] [diff] [review]
patch

The fix is simple.
Comment 18 Jono Xia 2011-03-25 15:14:09 PDT
Wow, sounds like that was some impressive detective work figuring out the cause of the crash.

Judging by Luke's patch, it seems like this has to be fixed in the core, and so there isn't anything that needs to change in Test Pilot, is that correct?
Comment 19 Luke Wagner [:luke] 2011-03-25 15:17:35 PDT
Correct; Test Pilot just tickled the dragon.
Comment 20 chris hofmann 2011-03-25 17:05:38 PDT
topcrash + security risk?  means it really needs to get out in 4.0.1

marked this as ride-along to get on the radar since I'm not quite sure how to mark this kinds of thinks that should have blocked RC if we had known about them earlier.

what's the sg: rating?
Comment 21 Andreas Gal :gal 2011-03-25 17:32:19 PDT
Comment on attachment 521956 [details] [diff] [review]
patch

If the current compartment is privileged (chrome), it might be allow to see the content frame. Maybe we should do a follow-up and copy data into the destination compartment in that case.
Comment 22 Luke Wagner [:luke] 2011-03-25 18:27:28 PDT
http://hg.mozilla.org/mozilla-central/rev/9be05c62b6ce

If nightlies show a good reduction in this crash (and/or bug 615098, bug 615098), should this land on the 2.0?
Comment 23 chris hofmann 2011-03-25 19:21:39 PDT
looks like the volume on mozilla-central / 4.0 "pre" releases is extremely low (like 0-2 crashes per day).  so we might need a beta or final release to really know how much impact the fix might have.

for example, here is the volume of crashes per release version in crash data that we got yesterday.

20110324 9309 total crashes

date     total    breakdown by build
                  count build, count build, ...

                 	8487 4.02011031805, 
        		412 4.0b122011022221, 	331 4.02011030319, 
        		20 4.0b102011012116, 	14 4.0b112011020314, 
        		8 4.2a1pre2011032316, 	7 4.0b92011011019, 
        		7 4.0b13pre2011032203, 	6 4.0b82010121417, 
        		3 4.0b72010110414, 	3 4.0b13pre2011032103, 
        		3 4.02011031716, 	2 4.2a1pre2011032314, 
        		2 4.2a1pre2011032307, 	2 4.0b13pre2011032303, 
        		1 4.2a1pre2011032403, 	1 4.0b13pre2011032003,
Comment 24 Luke Wagner [:luke] 2011-03-25 20:37:58 PDT
Oops, comment 22 clearly did not see comment 20.
Comment 25 Daniel Veditz [:dveditz] 2011-03-30 11:05:29 PDT
Guessing a missing compartment check might be sg:high, but that's just a guess. Would it allow content/chrome mixing?
Comment 26 Luke Wagner [:luke] 2011-03-30 15:01:40 PDT
The pre-existing security check in stack walking should prevent cross-origin mixing from happening directly.  However, we are still accessing trash memory, so it seems to be in the theoretically-possible-to-exploit category.
Comment 27 Daniel Veditz [:dveditz] 2011-03-31 11:21:18 PDT
because it's completely unclear, "blocking2.0: Macaw" means we want this fix on the mozilla2.0 repository for the 4.0.1 release. Consider that repo equivalent to the end-game Fx4 release and request explicit approval2.0 on the appropriate patch.
Comment 28 Daniel Veditz [:dveditz] 2011-03-31 11:22:06 PDT
Comment on attachment 521956 [details] [diff] [review]
patch

Approved for landing on mozilla2.0, a=dveditz for release-drivers

forgive the previous boilerplate comment
Comment 30 christian 2011-04-13 16:59:14 PDT
We don't see this signature on branches, marking unaffected for now.
Comment 31 Scoobidiver (away) 2011-04-19 11:35:14 PDT
It is #9 top crasher in 4.0.1, so I reopen this bug.
Comment 32 Robert Kaiser 2011-04-19 13:04:50 PDT
I think it might be better to file another one for the remaining things, unless you know that the patch in here doesn't fix any of the crashes we've been seeing with this. Real work has been done here, and the cause supposed to be fixed by the patches in here was deemed a security risk. This bug should cover that class of crashes fixed by this patch. A new bug should be filed for any other class of crashes that might result in the same signature.
Comment 33 Scoobidiver (away) 2011-04-19 14:07:19 PDT
As it became from #2 to #9 top crasher, the patch has fixed some causes. They should be written in the summary, so that the bug isn't confused with the meta bug.
Comment 34 John J. Barton 2011-04-21 17:11:25 PDT
When is this fix going to be available? These crashes are tedious.
Comment 35 Bill McCloskey (:billm) 2011-04-21 17:24:38 PDT
(In reply to comment #34)
> When is this fix going to be available? These crashes are tedious.

It should be fixed in 4.0.1. The fix is only for crashes with exn_trace in the stack trace.
Comment 36 John J. Barton 2011-04-21 22:02:51 PDT
Fine. When is 4.0.1 going to be available. I crashed again.

Note You need to log in before you can comment on or make changes to this bug.