Closed Bug 737780 Opened 12 years ago Closed 12 years ago

reproducible crash in js::GetNameFromBytecode with home-snippet-server.xpi

Categories: Core :: JavaScript Engine, defect

Version: 13 Branch
Platform: All, Windows 7
Type: defect
Priority: Not set
Severity: critical

Status: RESOLVED FIXED
Target Milestone: mozilla14
Tracking Status: firefox13 + verified

People: Reporter: scoobidiver, Assigned: dmandelin

Details: 4 keywords, Whiteboard: startupcrash [qa!]

Attachments: 2 files, 1 obsolete file

It's #9 top crasher in 13.0a2.
It first appeared in 13.0a1/20120225 but then it's discontinuous across builds.

Signature 	js::GetNameFromBytecode
UUID	849b933f-a259-4b2b-b71b-1ce582120321
Date Processed	2012-03-21 06:24:23
Uptime	17
Last Crash	5.1 hours before submission
Install Age	5.5 hours since version was first installed.
Install Time	2012-03-21 00:56:44
Product	Firefox
Version	13.0a2
Build ID	20120320042012
Release Channel	aurora
OS	Windows NT
OS Version	6.1.7601 Service Pack 1
Build Architecture	x86
Build Architecture Info	AuthenticAMD family 16 model 6 stepping 3
Crash Reason	EXCEPTION_ACCESS_VIOLATION_READ
Crash Address	0x3c
App Notes 	
AdapterVendorID: 0x10de, AdapterDeviceID: 0x03d0, AdapterSubsysID: 2a99103c, AdapterDriverVersion: 8.17.11.9739
EMCheckCompatibility	True
Total Virtual Memory	2147352576
Available Virtual Memory	1928634368
System Memory Use Percentage	89
Available Page File	1094524928
Available Physical Memory	100573184

Frame 	Module 	Signature 	Source
0 	mozjs.dll 	js::GetNameFromBytecode 	js/src/jsopcodeinlines.h:59
1 	mozjs.dll 	js::PropertyCache::fullTest 	js/src/jspropertycache.cpp:246
2 	mozjs.dll 	js::NameOperation 	js/src/jsinterpinlines.h:384
3 	mozjs.dll 	js::Interpret 	js/src/jsinterp.cpp:2819
4 	mozjs.dll 	js::RunScript 	js/src/jsinterp.cpp:461
5 	mozjs.dll 	js::ExecuteKernel 	js/src/jsinterp.cpp:668
6 	mozjs.dll 	js::Execute 	js/src/jsinterp.cpp:710
7 	mozjs.dll 	JS_ExecuteScript 	js/src/jsapi.cpp:5275
8 	xul.dll 	nsJSContext::ExecuteScript 	dom/base/nsJSEnvironment.cpp:1591
9 	xul.dll 	nsXULDocument::ExecuteScript 	content/xul/document/src/nsXULDocument.cpp:3613
10 	xul.dll 	nsXULDocument::ExecuteScript 	content/xul/document/src/nsXULDocument.cpp:3634
11 	xul.dll 	nsXULDocument::OnStreamComplete 	content/xul/document/src/nsXULDocument.cpp:3506
12 	xul.dll 	nsStreamLoader::OnStopRequest 	netwerk/base/src/nsStreamLoader.cpp:127
13 	xul.dll 	nsBaseChannel::OnStopRequest 	netwerk/base/src/nsBaseChannel.cpp:745
14 	xul.dll 	nsInputStreamPump::OnStateStop 	netwerk/base/src/nsInputStreamPump.cpp:583
15 	xul.dll 	nsInputStreamPump::OnInputStreamReady 	netwerk/base/src/nsInputStreamPump.cpp:405
16 	xul.dll 	nsInputStreamReadyEvent::Run 	xpcom/io/nsStreamUtils.cpp:114
17 	xul.dll 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:657
18 	xul.dll 	mozilla::ipc::MessagePump::Run 	ipc/glue/MessagePump.cpp:110
19 	xul.dll 	MessageLoop::RunHandler 	ipc/chromium/src/base/message_loop.cc:201
20 	xul.dll 	MessageLoop::Run 	ipc/chromium/src/base/message_loop.cc:175
21 	xul.dll 	nsBaseAppShell::Run 	widget/xpwidgets/nsBaseAppShell.cpp:189
22 	xul.dll 	nsAppShell::Run 	widget/windows/nsAppShell.cpp:252
23 	xul.dll 	nsAppStartup::Run 	toolkit/components/startup/nsAppStartup.cpp:295
24 	xul.dll 	XRE_main 	toolkit/xre/nsAppRunner.cpp:3703
25 	firefox.exe 	wmain 	toolkit/xre/nsWindowsWMain.cpp:107
...

More reports at:
https://crash-stats.mozilla.com/report/list?signature=js%3A%3AGetNameFromBytecode
I see ~50% of crashes are within a minute. Marking as a startup crasher. Can we run correlation reports or grab URLs to try to find a lead for the JS guys?
Whiteboard: startupcrash
I looked at checkins before the spike and before 2/25, but nothing jumped out at me. I think it's crashing here:

#define GET_ATOM_FROM_BYTECODE(script, pc, pcoff, atom)                       \
    JS_BEGIN_MACRO                                                            \
        JS_ASSERT(js_CodeSpec[*(pc)].format & JOF_ATOM);                      \
        (atom) = (script)->getAtom(GET_UINT32_INDEX((pc) + (pcoff)));         \
    JS_END_MACRO

on (script)->getAtom, with an NPE (or near-null pointer) on script->atoms. This is all in the property cache, which is probably going out someday anyway. I'm gonna watch a few more days, and if it stays more common, we can try some diagnostics.
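
To illustrate why a null |script| would fault at a small non-zero address such as the reported crash address 0x3c, here is a minimal sketch. The struct `FakeScript` and its layout are hypothetical, chosen only so that the `atoms` field sits at offset 0x3c; the real JSScript layout is not shown in this bug, so treat the offset as an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a script object. The padding is an assumption
// made so that `atoms` lands at offset 0x3c; it is NOT the real JSScript layout.
struct FakeScript {
    unsigned char otherFields[0x3c];
    std::uint32_t atoms;  // stands in for the atom table reference
};

// Address that a read of `script->atoms` would touch for a given base address.
// Computed arithmetically, so no invalid pointer is ever dereferenced.
std::uintptr_t faultAddress(std::uintptr_t scriptBase) {
    return scriptBase + offsetof(FakeScript, atoms);
}
```

With a base address of 0 (a null script), `faultAddress(0)` equals the field offset, which is how a null object pointer can produce an access violation at an address that is near null rather than exactly null.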
Keywords: needURLs
Some URLs from March:

    30 
      6 \N
      3 about:home
      2 http://www.facebook.com/
      2 http://service.js.10086.cn/obsh.html#ZDCX
      1 http://zh-tw.justin.tv/ladelfin4/w/994004272/1
      1 http://www.youtube.com/
      1 http://www.veiligbankieren.nl/nl/test.html
      1 http://www.msn.com/?rd=1&ucc=JP&dcc=US&opt=0&st=2
      1 http://www.lemonde.fr/
      1 http://www.kongregate.com/games/BenSpyda/lethalrpgdestiny-2-conquest
      1 http://www.iraqna1.com/vb/t36521.html
      1 http://www.google.com/search?q=how+redisplay+desktop+icons&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:unofficial&client=firefox-aurora
      1 http://www.google.com/search?q=funny+desktop+wallpaper&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:unofficial&client=firefox-aurora
      1 http://www.google.com.gt/#hl=es-419&output=search&sclient=psy-ab&q=computrabajo&oq=comp&aq=0&aqi=g10&aql=&gs_sm=1&gs_upl=31557l33838l1l35572l4l4l0l0l0l0l282l907l2-4l4l0&gs_l=hp.1.0.0l10.31557l33838l1l35573l4l4l0l0l0l0l282l907l2-4l4l0&psj=1&bav=on.2,or.r_g
      1 http://www.google.com/cse?cx=002443141534113389537%3Aysdmevkkknw&cof=FORID%3A0&q=languages&x=0&y=0#gsc.tab=0&gsc.q=languages%20nightly
      1 https://www.facebook.com/login.php?login_attempt=1
      1 https://www.facebook.com/
      1 https://www4.zav-triglav.si/eprijava/prijava_sp/skode_sp_1.jsp
      1 https://mail.google.com/mail/?shva=1#inbox/136111a0b2882747
      1 https://mail.google.com/mail/?shva=1#inbox
      1 https://mail.google.com/mail/#inbox
      1 http://slunecnice.cz/
      1 https://affiliates.mozilla.org/en-US/
      1 https://addons.mozilla.org/en-US/firefox/extensions/download-management/?sort=users
      1 http://s2.sfgame.de/
      1 http://register.net.id/
      1 http://grooveshark.com/#!/search?q=lady+antebellum
      1 http://en26.grepolis.com/game/
      1 http://e.mail.ru/cgi-bin/login
      1 http://devel3.amos-cms.com/admin/backend#
      1 http://blog.com/
      1 about:sessionrestore
      1 about:blank
Keywords: needURLs
It's now #20 top browser crasher in 13.0a2.

Here are fresh correlations per extension:
* March 24:
  js::GetNameFromBytecode|EXCEPTION_ACCESS_VIOLATION_READ (25 crashes)
     68% (17/25) vs.   1% (20/2507) browseforchange@browseforchange.com
     48% (12/25) vs.   2% (46/2507) plugin@yontoo.com
     44% (11/25) vs.   0% (11/2507) {46d606b0-a645-11df-981c-0800200c9a66} (Shop to Win)
     48% (12/25) vs.   6% (142/2507) ffxtlbr@babylon.com
* March 26:
  js::GetNameFromBytecode|EXCEPTION_ACCESS_VIOLATION_READ (12 crashes)
     25% (3/12) vs.   0% (7/1994) browseforchange@browseforchange.com
     25% (3/12) vs.   1% (19/1994) plugin@yontoo.com
* March 27:
  js::GetNameFromBytecode|EXCEPTION_ACCESS_VIOLATION_READ (21 crashes)
     62% (13/21) vs.   1% (17/2726) browseforchange@browseforchange.com
     43% (9/21) vs.   2% (55/2726) plugin@yontoo.com
     33% (7/21) vs.   0% (12/2726) {5a95a9e0-59dd-4314-bd84-4d18ca83a0e2} (Wajam)
     29% (6/21) vs.   6% (169/2726) ffxtlbr@babylon.com

Based on correlations and the stack, it's the new form of bug 700176 that dates back to Fx 9.0.
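
As a rough aid for reading the correlation rows above, the sketch below computes the two shares and their ratio for one row. It assumes the first pair counts reports with this signature and the second pair counts all crash reports in the same sample; the struct and function names are hypothetical and are not part of any crash-stats tooling.

```cpp
#include <cassert>

// One row of a correlation report, e.g. "68% (17/25) vs. 1% (20/2507)".
struct Correlation {
    int sigHits;   // reports with this signature that had the extension
    int sigTotal;  // all reports with this signature
    int allHits;   // reports in the whole sample that had the extension
    int allTotal;  // all reports in the whole sample
};

// Share of this signature's reports that include the extension.
double shareInSignature(const Correlation& c) {
    return static_cast<double>(c.sigHits) / c.sigTotal;
}

// Share of all reports in the sample that include the extension.
double shareOverall(const Correlation& c) {
    return static_cast<double>(c.allHits) / c.allTotal;
}

// How over-represented the extension is among this signature's reports.
double lift(const Correlation& c) {
    return shareInSignature(c) / shareOverall(c);
}
```

For the March 24 browseforchange row, `Correlation{17, 25, 20, 2507}` gives a share of 0.68 within the signature against roughly 0.008 overall, which is why that extension stands out. Small denominators such as 12 crashes on March 26 make the ratio noisy, so it is a lead rather than proof.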
I just hit this with today's Win32 nightly on Windows XP, and I have a screencast showing what happened: 

http://screencast.com/t/eCFx54CIAm

STR:

1. Downloaded a Windows nightly for XP
2. Installed the home-snippet-server.xpi from http://cl.ly/153O231Y0J1C10182F1X
3. Switched the prod URL to staging in the snippet-switcher, then clicked the "Update and Restart" button, and crashed

Incident is: https://crash-stats.mozilla.com/report/index/bp-374619d6-edcd-4837-b579-be3592120328
I can't reproduce in Nightly on Windows 7 with the STR in comment 5.
I guess the STR are specific to Windows XP.
Keywords: reproducible
(In reply to Stephen Donner [:stephend] from comment #5)
> I just hit this with today's Win32 nightly on Windows XP, and I have a
> screencast showing what happened: 
> 
> http://screencast.com/t/eCFx54CIAm
> 
> STR:
> 
> 1. Downloaded a Windows nightly for XP
> 2. Installed the home-snippet-server.xpi from
> http://cl.ly/153O231Y0J1C10182F1X
> 3. Switched the prod URL to staging in the snippet-switcher, then clicked
> the "Update and Restart" button, and crashed
> 
> Incident is:
> https://crash-stats.mozilla.com/report/index/bp-374619d6-edcd-4837-b579-be3592120328

I don't have an "Update and Restart" button. Do you mean "Change Update URL"? I tried that and it didn't crash. I also have Windows 7, not XP.

release-drivers: Luke had an idea to make it stop crashing, which should totally work, but doesn't fix the underlying bug. We think it's probably a compartment mismatch, which could still cause crashes later on.

Otherwise, we can start landing diagnostic patches and/or have people try to figure out STR. That will take longer but I think we'll be able to figure it out eventually. 

I'm inclined to just work on the real fix. If you think we need a temporary fix faster, let me know. Keep in mind that we do need to crash on this condition in order to get data to fix the underlying bug.
Assignee: general → dmandelin
(In reply to David Mandelin from comment #7)

> I don't have an "Update and Restart" button. Do you mean "Change Update
> URL"? I tried that and it didn't crash. I also have Windows 7, not XP.

Yes, sorry, this.  This crash is really reproducible in my Windows XP VM, for whatever reason; can show it to you next week.
(In reply to Stephen Donner [:stephend] from comment #8)
> (In reply to David Mandelin from comment #7)
> 
> > I don't have an "Update and Restart" button. Do you mean "Change Update
> > URL"? I tried that and it didn't crash. I also have Windows 7, not XP.
> 
> Yes, sorry, this.  This crash is really reproducible in my Windows XP VM,
> for whatever reason; can show it to you next week.

That would be great. Luke thinks it will probably be pretty easy to see from a reproducible test case.
Attached patch Diagnostic patch
The reproducible test case would be great to see, but in the meantime, we can start by confirming that |script| is null and finding out which patch is making it null.
Attachment #611534 - Flags: review?(luke)
Attachment #611534 - Flags: review?(luke) → review+
Attached patch Possible fix (obsolete)
This is a fix for the reproducible test case. That case is a compartment mismatch error, which we suspected originally. It happens because LoadFrameScriptInternal tries to use a cx and a global together that aren't necessarily in the same compartment.
Attachment #611688 - Flags: review?(luke)
Comment on attachment 611688 [details] [diff] [review]
Possible fix

need to handle (!ac.enter(...))
Attachment #611688 - Flags: review?(luke) → review+
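
The pattern discussed in the review, entering the global's compartment before using cx with it and bailing out if `ac.enter(...)` fails, can be sketched with a self-contained mock. Every type and function below (`Compartment`, `Object`, `Context`, `AutoEnterCompartment`, `runInGlobal`) is a hypothetical, simplified stand-in written for illustration; none of it is the SpiderMonkey API or the code from the attached patch.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-ins for engine types.
struct Compartment { int id; };
struct Object { Compartment* compartment; };
struct Context { Compartment* compartment; };

// Mock of a scoped "enter compartment" guard: enter() switches the context
// into the target object's compartment, and the destructor restores the
// compartment the context was in before.
class AutoEnterCompartment {
    Context* cx_ = nullptr;
    Compartment* saved_ = nullptr;

public:
    bool enter(Context* cx, Object* target) {
        if (!cx || !target || !target->compartment)
            return false;  // the caller must handle this failure
        cx_ = cx;
        saved_ = cx->compartment;
        cx->compartment = target->compartment;
        return true;
    }

    ~AutoEnterCompartment() {
        if (cx_)
            cx_->compartment = saved_;
    }
};

// Hypothetical caller: only touches `global` once cx has entered its
// compartment, and reports failure instead of continuing mismatched.
bool runInGlobal(Context* cx, Object* global) {
    AutoEnterCompartment ac;
    if (!ac.enter(cx, global))
        return false;
    // From here until `ac` goes out of scope, cx and global agree.
    return cx->compartment == global->compartment;
}
```

The scoped guard is used here so that the context is restored on every exit path, including the early return on failure.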
Attached patch Possible fix v2
Attachment #611688 - Attachment is obsolete: true
Attachment #611689 - Flags: review?(luke)
Comment on attachment 611689 [details] [diff] [review]
Possible fix v2

Hah.  That is a goofy function.
Attachment #611689 - Flags: review?(luke) → review+
http://hg.mozilla.org/integration/mozilla-inbound/rev/f43822091ed1

Leaving the diagnostic in place for now in case this is different from the topcrash.

Strangely, Stephen's test case now sometimes just exits the browser. Not sure why, but my build has lots of JS assertions turned on and it doesn't seem to be tripping anything in there, so it could be something else.
Seems odd that we end up with an mCx and an mGlobal that are in different compartments.  Is this expected, Olli?
That is odd. We first create mCx and then using that as a parameter we call
nsIXPConnect::InitClassesWithNewWrappedGlobal which creates mGlobal. I don't know how that
could lead mCx and mGlobal's JSObject to live in different compartments.
(In reply to Olli Pettay [:smaug] from comment #18)
bholley may know
https://hg.mozilla.org/mozilla-central/rev/f43822091ed1
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla14
(In reply to Olli Pettay [:smaug] from comment #18)
> I don't know how that
> could lead mCx and mGlobal's JSObject to live in different compartments.

It certainly can. In fact, it always happens that way. When entering the call to InitClassesWithNewWrappedGlobal, cx is in the compartment of the caller. Unless the new global is same-origin with the caller, the global will end up in a different compartment. After compartment-per-global, it _always_ will.

Relevant code:
http://mxr.mozilla.org/mozilla-central/source/js/xpconnect/src/nsXPConnect.cpp#1206
http://mxr.mozilla.org/mozilla-central/source/js/xpconnect/src/XPCWrappedNative.cpp#379

xpc_CreateGlobalObject
I can still reproduce the "clean exit" David refers to in comment 16, using Mozilla/5.0 (Windows NT 5.1; rv:14.0) Gecko/20120404 Firefox/14.0a1.  I can attest that there's indeed not a crash -- do I need to file a new bug?  I can still reproduce the clean exit 100%, using the same steps as the crash.
(In reply to Stephen Donner [:stephend] from comment #23)
> I can still reproduce the "clean exit" David refers to in comment 16, using
> Mozilla/5.0 (Windows NT 5.1; rv:14.0) Gecko/20120404 Firefox/14.0a1.  I can
> attest that there's indeed not a crash -- do I need to file a new bug?  I
> can still reproduce the clean exit 100%, using the same steps as the crash.

I think it is a new bug. 

This one looks like it's stopped on trunk, but we're still getting them from Aurora builds.
(In reply to David Mandelin from comment #24)

> I think it is a new bug. 
> 
> This one looks like it's stopped on trunk, but we're still getting them from
> Aurora builds.

Filed bug 743140.
(In reply to David Mandelin from comment #24)

> This one looks like it's stopped on trunk, but we're still getting them from
> Aurora builds.

Please nominate for Aurora 13 uplift as soon as possible for this regression.
Comment on attachment 611689 [details] [diff] [review]
Possible fix v2

[Approval Request Comment]
Regression caused by (bug #): unknown
User impact if declined: crashes as observed in Socorro
Testing completed (on m-c, etc.): running on m-c for a week
Risk to taking this patch (and alternatives if risky): none that I know of
String changes made by this patch:
Attachment #611689 - Flags: approval-mozilla-aurora?
Comment on attachment 611689 [details] [diff] [review]
Possible fix v2

[Triage Comment]
Low risk fix for a top crasher - approved for Aurora 13.
Attachment #611689 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Did this land correctly on Aurora? Because a community member reported to me a crash in the April 12 build:
https://crash-stats.mozilla.com/report/index/bp-ec7a0a20-960c-4cc1-b847-499852120413
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(In reply to Marco Zehe (:MarcoZ) from comment #31)
> Did this land correctly on Aurora? Because a community member reported to me
> a crash in the April 12 build:
> https://crash-stats.mozilla.com/report/index/bp-ec7a0a20-960c-4cc1-b847-499852120413

That is actually a different crash stack from the bug I fixed (although it probably is somewhat related, so you're right to post it here). Also, I see that some crashes of this type are happening on nightly, but with a diagnostic signature. I'll file a new bug.
Status: REOPENED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → FIXED
Summary: crash in js::GetNameFromBytecode → reproducible crash in js::GetNameFromBytecode with home-snippet-server.xpi
Filed new bug 746036.
Can someone please point me to the reproducible case for this crash?
Whiteboard: startupcrash → startupcrash [qa?]
(In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #34)
> Can someone please point me to the reproducible case for this crash?

It's https://bugzilla.mozilla.org/show_bug.cgi?id=737780#c5
(In reply to Stephen Donner [:stephend] from comment #35)
> (In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #34)
> > Can someone please point me to the reproducible case for this crash?
> 
> It's https://bugzilla.mozilla.org/show_bug.cgi?id=737780#c5

Comment 6 seems to refute that -- can you confirm? Better yet, can you verify this is fixed in Firefox 13.0b1?
Whiteboard: startupcrash [qa?] → startupcrash [qa+]
(In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #36)
> (In reply to Stephen Donner [:stephend] from comment #35)
> > (In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #34)
> > > Can someone please point me to the reproducible case for this crash?
> > 
> > It's https://bugzilla.mozilla.org/show_bug.cgi?id=737780#c5
> 
> Comment 6 seems to refute that -- can you confirm? Better yet, can you
> verify this is fixed in Firefox 13.0b1?

Comment 5 is the correct test case. I fixed this bug specifically by using it. It doesn't repro 100% of the time.
I can no longer reproduce this using the testcase I had from comment 5, which led to the fix for this signature: http://screencast.com/t/bza8wzVMkK
     
I _can_, however, still reproduce bug 743140, which means the fix from nightlies made it safely to 13.0b1.
     
Adding [qa!], status-firefox13:verified to the whiteboard.
     
Build ID is: Mozilla/5.0 (Windows NT 5.1; rv:13.0) Gecko/20100101 Firefox/13.0, which I got from ftp://ftp.mozilla.org/pub/firefox/candidates/13.0b1-candidates/build1/win32/en-US/Firefox%20Setup%2013.0b1.exe
Whiteboard: startupcrash [qa+] → startupcrash [qa!] status-firefox13:verified
(In reply to Stephen Donner [:stephend] from comment #38)
> Adding [qa!], status-firefox13:verified to the whiteboard.

Thanks Stephen. Sorry I wasn't clear on IRC. status-firefox13:verified is a tracking flag (on the right side). I've fixed it.
Whiteboard: startupcrash [qa!] status-firefox13:verified → startupcrash [qa!]