Bug 737780 - reproducible crash in js::GetNameFromBytecode with home-snippet-server.xpi
Status: RESOLVED FIXED
startupcrash [qa!]
: crash, regression, reproducible, topcrash
Product: Core
Classification: Components
Component: JavaScript Engine
: 13 Branch
: All Windows 7
: -- critical
: mozilla14
Assigned To: David Mandelin [:dmandelin]
:
: Jason Orendorff [:jorendorff]
Mentors:
: 742631
Depends on:
Blocks:
Reported: 2012-03-21 02:19 PDT by Scoobidiver (away)
Modified: 2012-04-26 14:57 PDT
15 users
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
+
verified


Attachments
Diagnostic patch (3.33 KB, patch)
2012-04-02 11:32 PDT, David Mandelin [:dmandelin]
luke: review+
Possible fix (1.15 KB, patch)
2012-04-02 19:07 PDT, David Mandelin [:dmandelin]
luke: review+
Possible fix v2 (1.19 KB, patch)
2012-04-02 19:09 PDT, David Mandelin [:dmandelin]
luke: review+
akeybl: approval‑mozilla‑aurora+

Description Scoobidiver (away) 2012-03-21 02:19:34 PDT
It's #9 top crasher in 13.0a2.
It first appeared in 13.0a1/20120225 but then it's discontinuous across builds.

Signature 	js::GetNameFromBytecode
UUID	849b933f-a259-4b2b-b71b-1ce582120321
Date Processed	2012-03-21 06:24:23
Uptime	17
Last Crash	5.1 hours before submission
Install Age	5.5 hours since version was first installed.
Install Time	2012-03-21 00:56:44
Product	Firefox
Version	13.0a2
Build ID	20120320042012
Release Channel	aurora
OS	Windows NT
OS Version	6.1.7601 Service Pack 1
Build Architecture	x86
Build Architecture Info	AuthenticAMD family 16 model 6 stepping 3
Crash Reason	EXCEPTION_ACCESS_VIOLATION_READ
Crash Address	0x3c
App Notes 	
AdapterVendorID: 0x10de, AdapterDeviceID: 0x03d0, AdapterSubsysID: 2a99103c, AdapterDriverVersion: 8.17.11.9739
EMCheckCompatibility	True
Total Virtual Memory	2147352576
Available Virtual Memory	1928634368
System Memory Use Percentage	89
Available Page File	1094524928
Available Physical Memory	100573184

Frame 	Module 	Signature 	Source
0 	mozjs.dll 	js::GetNameFromBytecode 	js/src/jsopcodeinlines.h:59
1 	mozjs.dll 	js::PropertyCache::fullTest 	js/src/jspropertycache.cpp:246
2 	mozjs.dll 	js::NameOperation 	js/src/jsinterpinlines.h:384
3 	mozjs.dll 	js::Interpret 	js/src/jsinterp.cpp:2819
4 	mozjs.dll 	js::RunScript 	js/src/jsinterp.cpp:461
5 	mozjs.dll 	js::ExecuteKernel 	js/src/jsinterp.cpp:668
6 	mozjs.dll 	js::Execute 	js/src/jsinterp.cpp:710
7 	mozjs.dll 	JS_ExecuteScript 	js/src/jsapi.cpp:5275
8 	xul.dll 	nsJSContext::ExecuteScript 	dom/base/nsJSEnvironment.cpp:1591
9 	xul.dll 	nsXULDocument::ExecuteScript 	content/xul/document/src/nsXULDocument.cpp:3613
10 	xul.dll 	nsXULDocument::ExecuteScript 	content/xul/document/src/nsXULDocument.cpp:3634
11 	xul.dll 	nsXULDocument::OnStreamComplete 	content/xul/document/src/nsXULDocument.cpp:3506
12 	xul.dll 	nsStreamLoader::OnStopRequest 	netwerk/base/src/nsStreamLoader.cpp:127
13 	xul.dll 	nsBaseChannel::OnStopRequest 	netwerk/base/src/nsBaseChannel.cpp:745
14 	xul.dll 	nsInputStreamPump::OnStateStop 	netwerk/base/src/nsInputStreamPump.cpp:583
15 	xul.dll 	nsInputStreamPump::OnInputStreamReady 	netwerk/base/src/nsInputStreamPump.cpp:405
16 	xul.dll 	nsInputStreamReadyEvent::Run 	xpcom/io/nsStreamUtils.cpp:114
17 	xul.dll 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:657
18 	xul.dll 	mozilla::ipc::MessagePump::Run 	ipc/glue/MessagePump.cpp:110
19 	xul.dll 	MessageLoop::RunHandler 	ipc/chromium/src/base/message_loop.cc:201
20 	xul.dll 	MessageLoop::Run 	ipc/chromium/src/base/message_loop.cc:175
21 	xul.dll 	nsBaseAppShell::Run 	widget/xpwidgets/nsBaseAppShell.cpp:189
22 	xul.dll 	nsAppShell::Run 	widget/windows/nsAppShell.cpp:252
23 	xul.dll 	nsAppStartup::Run 	toolkit/components/startup/nsAppStartup.cpp:295
24 	xul.dll 	XRE_main 	toolkit/xre/nsAppRunner.cpp:3703
25 	firefox.exe 	wmain 	toolkit/xre/nsWindowsWMain.cpp:107
...

More reports at:
https://crash-stats.mozilla.com/report/list?signature=js%3A%3AGetNameFromBytecode
Comment 1 Alex Keybl [:akeybl] 2012-03-21 14:59:32 PDT
I see ~50% of crashes are within a minute. Marking as a startup crasher. Can we run correlation reports or grab URLs to try to find a lead for the JS guys?
Comment 2 David Mandelin [:dmandelin] 2012-03-23 16:29:59 PDT
I looked at checkins before the spike and before 2/25, but nothing jumped out at me. I think it's crashing here:

#define GET_ATOM_FROM_BYTECODE(script, pc, pcoff, atom)                       \
    JS_BEGIN_MACRO                                                            \
        JS_ASSERT(js_CodeSpec[*(pc)].format & JOF_ATOM);                      \
        (atom) = (script)->getAtom(GET_UINT32_INDEX((pc) + (pcoff)));         \
    JS_END_MACRO

on (script)->getAtom, with an NPE (or near-null pointer) on script->atoms. This is all in the property cache, which is probably going out someday anyway. I'm gonna watch a few more days, and if it stays more common, we can try some diagnostics.
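
As an illustration of why a field read through a null |script| pointer would fault at a small address such as the 0x3c in the crash report, here is a minimal C++ sketch. `ToyScript`, its field names, and the 0x3c offset are assumptions chosen to mirror the report; the real JSScript layout is not shown in this bug.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical layout, NOT the real JSScript. The 0x3c bytes of padding are
// an assumption picked so the toy matches the crash address in the report.
struct ToyScript {
    std::uint8_t earlierFields[0x3c];  // stand-in for whatever precedes the field
    std::uint32_t atomsField;          // stand-in for the field the crashing read touches
};

static_assert(offsetof(ToyScript, atomsField) == 0x3c,
              "toy layout should place the field at offset 0x3c");

// Address that a read of atomsField would touch for a given script pointer value.
std::uintptr_t faultAddress(std::uintptr_t scriptPointer) {
    return scriptPointer + offsetof(ToyScript, atomsField);
}

// Heuristic: a fault inside the first page suggests "null base pointer plus field offset".
bool looksLikeNullFieldRead(std::uintptr_t crashAddress) {
    return crashAddress < 0x1000;
}

// Usage: with a null script pointer, the toy read lands on 0x3c.
void selfCheck() {
    assert(faultAddress(0) == 0x3c);
    assert(looksLikeNullFieldRead(faultAddress(0)));
}
```

The same arithmetic applies to any structure: the crash address minus the field offset gives the value of the base pointer, which is why a tiny crash address points at a null (or near-null) base.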
Comment 3 Marcia Knous [:marcia - use ni] 2012-03-23 17:09:37 PDT
Some URLs from March:

    30 
      6 \N
      3 about:home
      2 http://www.facebook.com/
      2 http://service.js.10086.cn/obsh.html#ZDCX
      1 http://zh-tw.justin.tv/ladelfin4/w/994004272/1
      1 http://www.youtube.com/
      1 http://www.veiligbankieren.nl/nl/test.html
      1 http://www.msn.com/?rd=1&ucc=JP&dcc=US&opt=0&st=2
      1 http://www.lemonde.fr/
      1 http://www.kongregate.com/games/BenSpyda/lethalrpgdestiny-2-conquest
      1 http://www.iraqna1.com/vb/t36521.html
      1 http://www.google.com/search?q=how+redisplay+desktop+icons&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:unofficial&client=firefox-aurora
      1 http://www.google.com/search?q=funny+desktop+wallpaper&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:unofficial&client=firefox-aurora
      1 http://www.google.com.gt/#hl=es-419&output=search&sclient=psy-ab&q=computrabajo&oq=comp&aq=0&aqi=g10&aql=&gs_sm=1&gs_upl=31557l33838l1l35572l4l4l0l0l0l0l282l907l2-4l4l0&gs_l=hp.1.0.0l10.31557l33838l1l35573l4l4l0l0l0l0l282l907l2-4l4l0&psj=1&bav=on.2,or.r_g
      1 http://www.google.com/cse?cx=002443141534113389537%3Aysdmevkkknw&cof=FORID%3A0&q=languages&x=0&y=0#gsc.tab=0&gsc.q=languages%20nightly
      1 https://www.facebook.com/login.php?login_attempt=1
      1 https://www.facebook.com/
      1 https://www4.zav-triglav.si/eprijava/prijava_sp/skode_sp_1.jsp
      1 https://mail.google.com/mail/?shva=1#inbox/136111a0b2882747
      1 https://mail.google.com/mail/?shva=1#inbox
      1 https://mail.google.com/mail/#inbox
      1 http://slunecnice.cz/
      1 https://affiliates.mozilla.org/en-US/
      1 https://addons.mozilla.org/en-US/firefox/extensions/download-management/?sort=users
      1 http://s2.sfgame.de/
      1 http://register.net.id/
      1 http://grooveshark.com/#!/search?q=lady+antebellum
      1 http://en26.grepolis.com/game/
      1 http://e.mail.ru/cgi-bin/login
      1 http://devel3.amos-cms.com/admin/backend#
      1 http://blog.com/
      1 about:sessionrestore
      1 about:blank
Comment 4 Scoobidiver (away) 2012-03-27 05:22:57 PDT
It's now #20 top browser crasher in 13.0a2.

Here are fresh correlations per extension:
* March 24:
  js::GetNameFromBytecode|EXCEPTION_ACCESS_VIOLATION_READ (25 crashes)
     68% (17/25) vs.   1% (20/2507) browseforchange@browseforchange.com
     48% (12/25) vs.   2% (46/2507) plugin@yontoo.com
     44% (11/25) vs.   0% (11/2507) {46d606b0-a645-11df-981c-0800200c9a66} (Shop to Win)
     48% (12/25) vs.   6% (142/2507) ffxtlbr@babylon.com
* March 26:
  js::GetNameFromBytecode|EXCEPTION_ACCESS_VIOLATION_READ (12 crashes)
     25% (3/12) vs.   0% (7/1994) browseforchange@browseforchange.com
     25% (3/12) vs.   1% (19/1994) plugin@yontoo.com
* March 27:
  js::GetNameFromBytecode|EXCEPTION_ACCESS_VIOLATION_READ (21 crashes)
     62% (13/21) vs.   1% (17/2726) browseforchange@browseforchange.com
     43% (9/21) vs.   2% (55/2726) plugin@yontoo.com
     33% (7/21) vs.   0% (12/2726) {5a95a9e0-59dd-4314-bd84-4d18ca83a0e2} (Wajam)
     29% (6/21) vs.   6% (169/2726) ffxtlbr@babylon.com

Based on correlations and the stack, it's the new form of bug 700176 that dates back to Fx 9.0.
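
For reading the correlation lines above, a small C++ sketch of the arithmetic may help. It assumes the first ratio is the share of crashes with this signature that have the extension installed and the second ratio is the share across all crash reports in the sample; the struct and function names are made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// One correlation line, e.g. "68% (17/25) vs. 1% (20/2507)".
struct ExtensionCorrelation {
    int signatureCrashesWithExtension;  // 17
    int signatureCrashes;               // 25
    int allCrashesWithExtension;        // 20
    int allCrashes;                     // 2507
};

// Exact percentage; returns 0 for an empty population.
double percent(int part, int whole) {
    assert(part >= 0 && whole >= 0);
    return whole > 0 ? 100.0 * part / whole : 0.0;
}

// Percentage rounded to the nearest integer, as the report prints it.
int roundedPercent(int part, int whole) {
    return static_cast<int>(std::lround(percent(part, whole)));
}

// How many times more common the extension is among crashes with this
// signature than among all crashes. Returns 0 when the baseline is empty.
double overRepresentation(const ExtensionCorrelation& c) {
    const double baseline = percent(c.allCrashesWithExtension, c.allCrashes);
    if (baseline <= 0.0)
        return 0.0;
    return percent(c.signatureCrashesWithExtension, c.signatureCrashes) / baseline;
}
```

For the March 24 browseforchange line, `overRepresentation({17, 25, 20, 2507})` comes out at roughly 85, which is the kind of gap that makes an extension worth investigating even when the absolute crash counts are small.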
Comment 5 Stephen Donner [:stephend] 2012-03-27 17:25:53 PDT
I just hit this with today's Win32 nightly on Windows XP, and I have a screencast showing what happened: 

http://screencast.com/t/eCFx54CIAm

STR:

1. Downloaded a Windows nightly for XP
2. Installed the home-snippet-server.xpi from http://cl.ly/153O231Y0J1C10182F1X
3. Switched the prod URL to staging in the snippet-switcher, then clicked the "Update and Restart" button, and crashed

Incident is: https://crash-stats.mozilla.com/report/index/bp-374619d6-edcd-4837-b579-be3592120328
Comment 6 Scoobidiver (away) 2012-03-27 23:08:14 PDT
I can't reproduce in Nightly on Windows 7 with the STR in comment 5.
I guess the STR are specific to Windows XP.
Comment 7 David Mandelin [:dmandelin] 2012-03-30 18:28:47 PDT
(In reply to Stephen Donner [:stephend] from comment #5)
> I just hit this with today's Win32 nightly on Windows XP, and I have a
> screencast showing what happened: 
> 
> http://screencast.com/t/eCFx54CIAm
> 
> STR:
> 
> 1. Downloaded a Windows nightly for XP
> 2. Installed the home-snippet-server.xpi from
> http://cl.ly/153O231Y0J1C10182F1X
> 3. Switched the prod URL to staging in the snippet-switcher, then clicked
> the "Update and Restart" button, and crashed
> 
> Incident is:
> https://crash-stats.mozilla.com/report/index/bp-374619d6-edcd-4837-b579-be3592120328

I don't have an "Update and Restart" button. Do you mean "Change Update URL"? I tried that and it didn't crash. I also have Windows 7, not XP.

release-drivers: Luke had an idea to make it stop crashing, which should totally work, but doesn't fix the underlying bug. We think it's probably a compartment mismatch, which could still cause crashes later on.

Otherwise, we can start landing diagnostic patches and/or have people try to figure out STR. That will take longer but I think we'll be able to figure it out eventually. 

I'm inclined to just work on the real fix. If you think we need a temporary fix faster, let me know. Keep in mind that we do need to crash on this condition in order to get data to fix the underlying bug.
Comment 8 Stephen Donner [:stephend] 2012-03-30 18:34:20 PDT
(In reply to David Mandelin from comment #7)

> I don't have an "Update and Restart" button. Do you mean "Change Update
> URL"? I tried that and it didn't crash. I also have Windows 7, not XP.

Yes, sorry, this.  This crash is really reproducible in my Windows XP VM, for whatever reason; can show it to you next week.
Comment 9 David Mandelin [:dmandelin] 2012-03-30 18:39:11 PDT
(In reply to Stephen Donner [:stephend] from comment #8)
> (In reply to David Mandelin from comment #7)
> 
> > I don't have an "Update and Restart" button. Do you mean "Change Update
> > URL"? I tried that and it didn't crash. I also have Windows 7, not XP.
> 
> Yes, sorry, this.  This crash is really reproducible in my Windows XP VM,
> for whatever reason; can show it to you next week.

That would be great. Luke thinks it will probably be pretty easy to see from a reproducible test case.
Comment 10 David Mandelin [:dmandelin] 2012-04-02 11:32:55 PDT
Created attachment 611534
Diagnostic patch

The reproducible test case would be great to see, but in the meantime, we can start by confirming that |script| is null and finding out which patch is making it null.
Comment 11 David Mandelin [:dmandelin] 2012-04-02 13:22:37 PDT
Diagnostic landed:
http://hg.mozilla.org/integration/mozilla-inbound/rev/d272cfd24b53
Comment 12 David Mandelin [:dmandelin] 2012-04-02 19:07:45 PDT
Created attachment 611688
Possible fix

This is a fix for the reproducible test case. That case is a compartment mismatch error, which we suspected originally. It happens because LoadFrameScriptInternal tries to use a cx and a global together that aren't necessarily in the same compartment.
Comment 13 Luke Wagner [:luke] 2012-04-02 19:09:00 PDT
Comment on attachment 611688
Possible fix

need to handle (!ac.enter(...))
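
The pattern under review (enter the target global's compartment before using cx with it, and bail out if entering fails) can be sketched with a self-contained toy model. Everything below is hypothetical: `Compartment`, `Context`, `Global`, `AutoEnterCompartment` and `loadScript` are stand-ins written for this example, not the SpiderMonkey API and not the contents of the attached patch.

```cpp
#include <cassert>
#include <string>

// Toy stand-ins, NOT the SpiderMonkey types.
struct Compartment { std::string name; };

struct Context {
    Compartment* current = nullptr;  // compartment the context is currently in
};

struct Global {
    Compartment* home = nullptr;     // compartment that owns this global
};

// RAII guard modelled loosely on the "ac.enter(...)" call in the review note.
class AutoEnterCompartment {
public:
    AutoEnterCompartment() = default;
    AutoEnterCompartment(const AutoEnterCompartment&) = delete;
    AutoEnterCompartment& operator=(const AutoEnterCompartment&) = delete;

    // Switches cx into the target's compartment. Returns false on failure,
    // so every caller has to check the result.
    bool enter(Context& cx, const Global& target) {
        assert(!cx_ && "enter() may be called at most once per guard");
        if (!target.home)
            return false;            // toy failure condition
        cx_ = &cx;
        previous_ = cx.current;
        cx.current = target.home;
        return true;
    }

    // Restores the previous compartment when the guard goes out of scope.
    ~AutoEnterCompartment() {
        if (cx_)
            cx_->current = previous_;
    }

private:
    Context* cx_ = nullptr;
    Compartment* previous_ = nullptr;
};

// Hypothetical caller: only touches the global once cx is in its compartment.
bool loadScript(Context& cx, const Global& global) {
    AutoEnterCompartment ac;
    if (!ac.enter(cx, global))
        return false;                  // handle the failure instead of running mismatched
    return cx.current == global.home;  // stand-in for compiling and running the script
}
```

The guard restores the caller's compartment on every exit path, including the early return, which is the main reason to prefer an RAII object over manual enter/leave calls.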
Comment 14 David Mandelin [:dmandelin] 2012-04-02 19:09:55 PDT
Created attachment 611689
Possible fix v2
Comment 15 Luke Wagner [:luke] 2012-04-02 19:12:12 PDT
Comment on attachment 611689
Possible fix v2

Hah.  That is a goofy function.
Comment 16 David Mandelin [:dmandelin] 2012-04-02 19:14:08 PDT
http://hg.mozilla.org/integration/mozilla-inbound/rev/f43822091ed1

Leaving the diagnostic in place for now in case this is different from the topcrash.

Strangely, Stephen's test case now sometimes just exits the browser. Not sure why, but my build has lots of JS assertions turned on and it doesn't seem to be tripping anything in there, so it could be something else.
Comment 17 Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2012-04-02 22:57:16 PDT
Seems odd that we end up with an mCx and an mGlobal that are in different compartments.  Is this expected, Olli?
Comment 18 Olli Pettay [:smaug] 2012-04-03 00:30:44 PDT
That is odd. We first create mCx and then using that as a parameter we call
nsIXPConnect::InitClassesWithNewWrappedGlobal which creates mGlobal. I don't know how that
could lead mCx and mGlobal's JSObject to live in different compartments.
Comment 19 Marco Bonardo [::mak] 2012-04-03 02:05:27 PDT
https://hg.mozilla.org/mozilla-central/rev/d272cfd24b53
Comment 20 Luke Wagner [:luke] 2012-04-03 09:31:12 PDT
(In reply to Olli Pettay [:smaug] from comment #18)
bholley may know
Comment 21 Matt Brubeck (:mbrubeck) 2012-04-03 10:55:50 PDT
https://hg.mozilla.org/mozilla-central/rev/f43822091ed1
Comment 22 Bobby Holley (:bholley) (busy with Stylo) 2012-04-03 10:57:03 PDT
(In reply to Olli Pettay [:smaug] from comment #18)
> I don't know how that
> could lead mCx and mGlobal's JSObject to live in different compartments.

It certainly can. In fact, it always happens that way. When entering the call to InitClassesWithNewWrappedGlobal, cx is in the compartment of the caller. Unless the new global is same-origin with the caller, the global will end up in a different compartment. After compartment-per-global, it _always_ will.

Relevant code:
http://mxr.mozilla.org/mozilla-central/source/js/xpconnect/src/nsXPConnect.cpp#1206
http://mxr.mozilla.org/mozilla-central/source/js/xpconnect/src/XPCWrappedNative.cpp#379

xpc_CreateGlobalObject
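
A toy C++ model of the behaviour described above, under the assumption that creating a global allocates a fresh compartment and leaves the calling context where it was. `ToyRuntime` and the other names are invented for illustration and do not correspond to XPConnect or SpiderMonkey classes.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Toy stand-ins, NOT the real engine types.
struct Compartment { int id; };
struct Context { Compartment* current; };  // compartment the context is in
struct Global { Compartment* home; };      // compartment that owns the global

class ToyRuntime {
public:
    // Allocates a compartment that lives as long as the runtime.
    Compartment* newCompartment() {
        compartments_.push_back(
            std::unique_ptr<Compartment>(new Compartment{nextId_++}));
        return compartments_.back().get();
    }

    // Models compartment-per-global: every new global gets its own
    // compartment. The context is deliberately not moved.
    Global createGlobal(Context& cx) {
        (void)cx;  // cx stays in the caller's compartment
        return Global{newCompartment()};
    }

private:
    std::vector<std::unique_ptr<Compartment>> compartments_;
    int nextId_ = 0;
};

// True only when the context is currently in the global's compartment.
bool sameCompartment(const Context& cx, const Global& global) {
    return cx.current == global.home;
}
```

In this model, `sameCompartment(cx, rt.createGlobal(cx))` is always false, so any code that stores the pair and later uses them together has to enter the global's compartment first.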
Comment 23 Stephen Donner [:stephend] 2012-04-04 23:16:23 PDT
I can still reproduce the "clean exit" David refers to in comment 16, using Mozilla/5.0 (Windows NT 5.1; rv:14.0) Gecko/20120404 Firefox/14.0a1.  I can attest that there's indeed not a crash -- do I need to file a new bug?  I can still reproduce the clean exit 100%, using the same steps as the crash.
Comment 24 David Mandelin [:dmandelin] 2012-04-05 11:05:49 PDT
(In reply to Stephen Donner [:stephend] from comment #23)
> I can still reproduce the "clean exit" David refers to in comment 16, using
> Mozilla/5.0 (Windows NT 5.1; rv:14.0) Gecko/20120404 Firefox/14.0a1.  I can
> attest that there's indeed not a crash -- do I need to file a new bug?  I
> can still reproduce the clean exit 100%, using the same steps as the crash.

I think it is a new bug. 

This one looks like it's stopped on trunk, but we're still getting them from Aurora builds.
Comment 25 Stephen Donner [:stephend] 2012-04-05 19:29:19 PDT
(In reply to David Mandelin from comment #24)

> I think it is a new bug. 
> 
> This one looks like it's stopped on trunk, but we're still getting them from
> Aurora builds.

Filed bug 743140.
Comment 26 Lukas Blakk [:lsblakk] use ?needinfo 2012-04-09 15:53:42 PDT
(In reply to David Mandelin from comment #24)

> This one looks like it's stopped on trunk, but we're still getting them from
> Aurora builds.

Please nominate for Aurora 13 uplift as soon as possible for this regression.
Comment 27 David Mandelin [:dmandelin] 2012-04-09 17:26:51 PDT
Comment on attachment 611689
Possible fix v2

[Approval Request Comment]
Regression caused by (bug #): unknown
User impact if declined: crashes as observed in Socorro
Testing completed (on m-c, etc.): running on m-c for a week
Risk to taking this patch (and alternatives if risky): none that I know of
String changes made by this patch:
Comment 28 Alex Keybl [:akeybl] 2012-04-10 12:16:15 PDT
Comment on attachment 611689
Possible fix v2

[Triage Comment]
Low risk fix for a top crasher - approved for Aurora 13.
Comment 29 David Mandelin [:dmandelin] 2012-04-10 16:57:53 PDT
http://hg.mozilla.org/releases/mozilla-aurora/rev/d77279ec6acc
Comment 30 David Mandelin [:dmandelin] 2012-04-12 11:29:00 PDT
*** Bug 742631 has been marked as a duplicate of this bug. ***
Comment 31 Marco Zehe (:MarcoZ) 2012-04-13 08:32:12 PDT
Did this land correctly on Aurora? Because a community member reported to me a crash in the April 12 build:
https://crash-stats.mozilla.com/report/index/bp-ec7a0a20-960c-4cc1-b847-499852120413
Comment 32 David Mandelin [:dmandelin] 2012-04-16 18:49:46 PDT
(In reply to Marco Zehe (:MarcoZ) from comment #31)
> Did this land correctly on Aurora? Because a community member reported to me
> a crash in the April 12 build:
> https://crash-stats.mozilla.com/report/index/bp-ec7a0a20-960c-4cc1-b847-499852120413

That is actually a different crash stack from the bug I fixed (although it probably is somewhat related, so you're right to post it here). Also, I see that some crashes of this type are happening on nightly, but with a diagnostic signature. I'll file a new bug.
Comment 33 David Mandelin [:dmandelin] 2012-04-16 18:56:21 PDT
Filed new bug 746036.
Comment 34 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2012-04-26 13:20:51 PDT
Can someone please point me to the reproducible case for this crash?
Comment 35 Stephen Donner [:stephend] 2012-04-26 13:22:05 PDT
(In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #34)
> Can someone please point me to the reproducible case for this crash?

It's https://bugzilla.mozilla.org/show_bug.cgi?id=737780#c5
Comment 36 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2012-04-26 13:29:24 PDT
(In reply to Stephen Donner [:stephend] from comment #35)
> (In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #34)
> > Can someone please point me to the reproducible case for this crash?
> 
> It's https://bugzilla.mozilla.org/show_bug.cgi?id=737780#c5

Comment 6 seems to refute that -- can you confirm? Better yet, can you verify this is fixed in Firefox 13.0b1?
Comment 37 David Mandelin [:dmandelin] 2012-04-26 13:50:49 PDT
(In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #36)
> (In reply to Stephen Donner [:stephend] from comment #35)
> > (In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #34)
> > > Can someone please point me to the reproducible case for this crash?
> > 
> > It's https://bugzilla.mozilla.org/show_bug.cgi?id=737780#c5
> 
> Comment 6 seems to refute that -- can you confirm? Better yet, can you
> verify this is fixed in Firefox 13.0b1?

Comment 5 is the correct test case. I fixed this bug specifically by using it. It doesn't repro 100% of the time.
Comment 38 Stephen Donner [:stephend] 2012-04-26 14:53:55 PDT
I can no longer reproduce this using the testcase I had from comment 5, which led to the fix for this signature: http://screencast.com/t/bza8wzVMkK
     
I _can_ however, reproduce the same bug 743140, which means the fix from nightlies made it safely to 13.0b1.
     
Adding [qa!], status-firefox13:verified to the whiteboard.
     
Build ID is: Mozilla/5.0 (Windows NT 5.1; rv:13.0) Gecko/20100101 Firefox/13.0, which I got from ftp://ftp.mozilla.org/pub/firefox/candidates/13.0b1-candidates/build1/win32/en-US/Firefox%20Setup%2013.0b1.exe
Comment 39 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2012-04-26 14:57:38 PDT
(In reply to Stephen Donner [:stephend] from comment #38)
> Adding [qa!], status-firefox13:verified to the whiteboard.

Thanks Stephen. Sorry I wasn't clear on IRC. status-firefox13:verified is a tracking flag (on the right side). I've fixed it.
