Closed Bug 774052 Opened 12 years ago Closed 12 years ago

crash in js::types::TypeObject::sweep

Categories

(Core :: XPConnect, defect)

15 Branch
All
Windows 7
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla17
Tracking Status
firefox15 + fixed
firefox16 + verified
firefox17 + verified

People

(Reporter: scoobidiver, Assigned: bholley)

Details

(4 keywords, Whiteboard: [js:inv:p3])

Crash Data

It's a low volume crash in Beta and Aurora but it began to spike in 16.0a1/20120714 and is now #2 top crasher in today's build. The regression range for the spike is:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=6489be1890c0&tochange=0602e44ac248

Signature 	js::types::TypeObject::sweep(js::FreeOp*)
UUID	6ceb09c9-99ca-4a0f-82e0-847222120715
Date Processed	2012-07-15 02:57:55
Uptime	7710
Last Crash	1.6 weeks before submission
Install Age	2.1 hours since version was first installed.
Install Time	2012-07-15 00:49:15
Product	Firefox
Version	16.0a1
Build ID	20120714030554
Release Channel	nightly
OS	Windows NT
OS Version	6.1.7601 Service Pack 1
Build Architecture	x86
Build Architecture Info	GenuineIntel family 6 model 23 stepping 10
Crash Reason	EXCEPTION_ACCESS_VIOLATION_READ
Crash Address	0x26264708
App Notes 	
AdapterVendorID: 0x10de, AdapterDeviceID: 0x08a0, AdapterSubsysID: 00c2106b, AdapterDriverVersion: 8.17.11.9682
D3D10 Layers? D3D10 Layers- D3D9 Layers? D3D9 Layers- WebGL? WebGL- 
Processor Notes 	WARNING: JSON file missing Add-ons
EMCheckCompatibility	False
Adapter Vendor ID	0x10de
Adapter Device ID	0x08a0
Total Virtual Memory	4294836224
Available Virtual Memory	3491704832
System Memory Use Percentage	70
Available Page File	4278034432
Available Physical Memory	1186983936

Frame 	Module 	Signature 	Source
0 	mozjs.dll 	js::types::TypeObject::sweep 	js/src/jsinfer.cpp:5482
1 	mozjs.dll 	js::types::TypeCompartment::sweep 	js/src/jsinfer.cpp:5553
2 	mozjs.dll 	JSCompartment::sweep 	js/src/jscompartment.cpp:545
3 	mozjs.dll 	SweepPhase 	js/src/jsgc.cpp:3403
4 	mozjs.dll 	GCCycle 	js/src/jsgc.cpp:3854
5 	mozjs.dll 	Collect 	js/src/jsgc.cpp:3958
6 	mozjs.dll 	js::GCSlice 	js/src/jsgc.cpp:3991
7 	mozjs.dll 	js::NotifyDidPaint 	js/src/jsfriendapi.cpp:779
8 	xul.dll 	nsXPConnect::NotifyDidPaint 	js/xpconnect/src/nsXPConnect.cpp:2790
9 	xul.dll 	PresShell::DidPaint 	layout/base/nsPresShell.cpp:7047
10 	xul.dll 	nsViewManager::CallDidPaintOnObserver 	view/src/nsViewManager.cpp:1348
11 	xul.dll 	nsViewManager::DispatchEvent 	view/src/nsViewManager.cpp:775
12 	xul.dll 	AttachedHandleEvent 	view/src/nsView.cpp:159
13 	xul.dll 	nsWindow::DispatchEvent 	widget/windows/nsWindow.cpp:3509
14 	xul.dll 	nsWindow::DispatchWindowEvent 	widget/windows/nsWindow.cpp:3535
15 	xul.dll 	nsWindow::OnPaint 	widget/windows/nsWindowGfx.cpp:607
16 	xul.dll 	nsWindow::ProcessMessage 	widget/windows/nsWindow.cpp:4743
17 	xul.dll 	nsWindow::WindowProcInternal 	widget/windows/nsWindow.cpp:4330
18 	xul.dll 	CallWindowProcCrashProtected 	xpcom/base/nsCrashOnException.cpp:32
19 	xul.dll 	nsWindow::WindowProc 	widget/windows/nsWindow.cpp:4272
20 	user32.dll 	InternalCallWinProc 
...

More reports at:
https://crash-stats.mozilla.com/report/list?signature=js%3A%3Atypes%3A%3ATypeObject%3A%3Asweep%28js%3A%3AFreeOp*%29
Looks like it calmed down again.
Whiteboard: [js:inv:p3]
(In reply to David Mandelin [:dmandelin] from comment #1)
> Looks like it calmed down again.
I disagree. It's #7 top browser crasher in 17.0a1 and still occurs in the latest Nightly.

It also spiked in 15.0a2/20120716. The Aurora regression range is:
http://hg.mozilla.org/releases/mozilla-aurora/pushloghtml?fromchange=50963e16d1dc&tochange=c511865c8ab1
There are three bugs that belong to the two regression ranges: bug 771202, bug 773250, bug 772684. The two last ones are about Firefox for Android, so it's caused by bug 771202, an XPConnect bug.
Blocks: 771202
Version: 16 Branch → 15 Branch
If this is indeed a regression from bug 771202, it's probably some sort of compartment/GC issue. Billm, is that what this looks like to you?

STR would be very helpful here.
Keywords: qawanted
Depends on: 775435
(In reply to Bobby Holley (:bholley) from comment #3)
> If this is indeed a regression from bug 771202, it's probably some sort of
> compartment/GC issue. 

I found a GC hazard in those patches by inspection which I filed as bug 775435. Let's try to get that landed and see if it fixes the issue here.
Keywords: qawanted
The patch over at bug 775435 just landed on inbound. Let's see if it reduces crash volume.

http://hg.mozilla.org/integration/mozilla-inbound/rev/6bff6810a82e
Do we know yet whether this reduced crash volume?
For one thing, I miss the note that it landed on m-c.

For the other, https://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A17.0a1&version=Firefox%3A16.0a1&query_search=signature&query_type=contains&query=js%3A%3Atypes%3A%3ATypeObject%3A%3Asweep&reason_type=contains&date=07%2F23%2F2012 14%3A05%3A59&range_value=1&range_unit=weeks&hang_type=any&process_type=any&do_query=1&signature=js%3A%3Atypes%3A%3ATypeObject%3A%3Asweep(js%3A%3AFreeOp*) does not seem to show the signature gone, yesterday's build still has 16 crashes.
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #7)
> does not seem to show the signature gone, yesterday's build still has 16
> crashes.

Can we tell if that's a reduction? If I read this bug right, there were other crashes with this signature before the spike associated with bug 771202.
Bobby, it doesn't look like a reduction in yesterday's build. When did this land on m-c?
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #9)
> Bobby, it doesn't look like a reduction in yesterday's build. When did this
> land on m-c?

from bug 775435 comment 5 @ 2012-07-20 06:53:41 PDT:
https://hg.mozilla.org/mozilla-central/rev/6bff6810a82e
Hrm, that would mean that the -07-21 build would be the first that would have the patch. I guess we need a few more days until we see really meaningful stats, but from the numbers it looks like it hasn't helped much, at least.
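The build-date reasoning above can be sketched as follows. This is only a rough sketch, assuming nightlies are cut once a day at a fixed time in the same timezone as the landing timestamp. The 03:05 default is an assumption inferred from Build ID 20120714030554 in the crash report, and `first_nightly_with_patch` is a hypothetical helper name, not an existing tool.

```python
from datetime import datetime, time, timedelta

def first_nightly_with_patch(landed, build_time=time(3, 5)):
    """Return the date of the first daily build cut at or after `landed`.

    `build_time` is an assumed daily build time, not a documented schedule.
    """
    same_day_build = datetime.combine(landed.date(), build_time)
    if landed <= same_day_build:
        # Landed before that day's build was cut, so that build has it.
        return same_day_build.date()
    # Otherwise the patch first ships in the next day's build.
    return (same_day_build + timedelta(days=1)).date()

# Landing time on m-c quoted from bug 775435 comment 5.
print(first_nightly_with_patch(datetime(2012, 7, 20, 6, 53, 41)))
```

Under these assumptions the 2012-07-20 landing first appears in the 2012-07-21 build, which matches the estimate in the comment.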
This is holding at #8 topcrasher on 17, and it's been 4 days since the patch landed on m-c. I see that bug 775435 just landed on aurora/beta, but it looks unlikely that we'll see a drop there, and this crash signature just moved up 5 spots in the 15 top crashers to be #8 there as well. Slacker radio is mentioned several times in the comments across versions, so I'm adding qawanted to see if we can get some STR here by playing radio on Slacker.com.
johns is investigating some random oranges that might be related to this. John, did you manage to reduce a reliable testcase?
I have a very finicky testcase that, for some reason, only seems to work with my patches in 745030 applied. I'm looking into this more now.
jst suggested turning on GCZeal and running the mochitests in dom/plugins (and also browsing the crashing sites with gczeal enabled).
I can reproduce a crash on one of the sites that comes up as a top URL in these crashes using the latest beta on Mac 10.6.8, with a slightly different stack.

STR:
1. Load http://www.goalsarena.org/
2. Wait for the banner ad at the bottom to come up.
3. Click the back button.

https://crash-stats.mozilla.com/report/index/bp-c8101270-f82b-4500-b37a-809232120731

Stack that comes up for me is the same as in Bug 728687.

http://www.spanishdict.com/translation is the other site besides goalsarena.org that has the most crashes.
Keywords: reproducible
I find Socorro hard to use, especially with rapid release, so can someone help me out and see if I'm doing this right? I expanded the query in comment 7 to 4 weeks and reset the date to 'now', then separated out a few relevant versions: 16a1 (trunk 16), 17a1 (trunk 17), and 16a2 (aurora 16). (Btw, I did this by munging the URL; is there an easier way?)

https://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A17.0a1&query_search=signature&query_type=contains&query=js%3A%3Atypes%3A%3ATypeObject%3A%3Asweep&reason_type=contains&date=now&range_value=4&range_unit=weeks&hang_type=any&process_type=any&do_query=1&signature=js%3A%3Atypes%3A%3ATypeObject%3A%3Asweep%28js%3A%3AFreeOp*%29

https://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A16.0a1&query_search=signature&query_type=contains&query=js%3A%3Atypes%3A%3ATypeObject%3A%3Asweep&reason_type=contains&date=now&range_value=4&range_unit=weeks&hang_type=any&process_type=any&do_query=1&signature=js%3A%3Atypes%3A%3ATypeObject%3A%3Asweep%28js%3A%3AFreeOp*%29

https://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A16.0a2&query_search=signature&query_type=contains&query=js%3A%3Atypes%3A%3ATypeObject%3A%3Asweep&reason_type=contains&date=now&range_value=4&range_unit=weeks&hang_type=any&process_type=any&do_query=1&signature=js%3A%3Atypes%3A%3ATypeObject%3A%3Asweep%28js%3A%3AFreeOp*%29

What I see is that this crash first occurred with a high rate in the 7/14 trunk 16 build. That continued over to trunk 17, up to and including today. It also continued to Aurora 16, but skipping 7/17-7/19. I assume that's from a delay between switching trunk and users running Aurora, either on our end, or users taking time to get it.

Based on comment 2, I also looked at 15a2 (aurora 15):

https://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A15.0a2&query_search=signature&query_type=contains&query=js%3A%3Atypes%3A%3ATypeObject%3A%3Asweep&reason_type=contains&date=now&range_value=4&range_unit=weeks&hang_type=any&process_type=any&do_query=1&signature=js%3A%3Atypes%3A%3ATypeObject%3A%3Asweep%28js%3A%3AFreeOp*%29

I see a high frequency on 7/16, but only on that day.

Putting it all together, I'm seeing:

  regressed on   16 trunk  on 7/14
  regressed on   15 aurora on 7/16
  unregressed on 15 aurora on 7/17

Did I get it right?

I'll do some analysis assuming I did, but in a following comment to stay away from the URL clutter in this one.
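The URL munging described in this comment could be scripted rather than done by hand. The following is a minimal sketch that rebuilds the per-version queries from the parameters visible in the URLs above. `crash_query_url` is a hypothetical helper, and the parameter set is copied from those URLs rather than from any documented Socorro API.

```python
from urllib.parse import urlencode

BASE = "https://crash-stats.mozilla.com/report/list"
SIGNATURE = "js::types::TypeObject::sweep(js::FreeOp*)"

def crash_query_url(version, weeks=4):
    """Build a report/list URL for one Firefox version (hypothetical helper)."""
    params = [
        ("product", "Firefox"),
        ("version", f"Firefox:{version}"),
        ("query_search", "signature"),
        ("query_type", "contains"),
        ("query", "js::types::TypeObject::sweep"),
        ("reason_type", "contains"),
        ("date", "now"),
        ("range_value", str(weeks)),
        ("range_unit", "weeks"),
        ("hang_type", "any"),
        ("process_type", "any"),
        ("do_query", "1"),
        ("signature", SIGNATURE),
    ]
    # urlencode percent-encodes ':' and the parentheses in the signature.
    return BASE + "?" + urlencode(params)

for version in ("17.0a1", "16.0a1", "16.0a2", "15.0a2"):
    print(crash_query_url(version))
```

Note that `urlencode` also percent-encodes the `*` in the signature, which the hand-edited URLs leave literal, so the generated strings are not byte-identical to the ones above even though they carry the same parameters.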
Now to analyze the raw data: 

As Scoobi says in comment 2, bug 771202 is the only relevant thing that landed in the 15 Aurora regression range (just before 7/16). The crash unregressed the next day. The next day there was a huge merge from L10N, so it's plausible that something that fixed or masked the bug came in that day. Notably, bug 775435 was not in that merge, which is evidence that it's not the fix for this problem.

The crash regressed on trunk on 7/14. Bug 771202 landed in that regression range as well (at about 8pm 7/13).

I also took Marcia's test in comment 16 and tried it on the 7/13 and 7/14 nightly builds. It does in fact crash in 7/14 but not 7/13, so it probably is related to this bug. (Nice find! I would have missed that.)

Anyway, it seems pretty likely to be bug 771202, and I think bisecting builds down to the changeset would be a relatively easy way to check that. Or just debugging using the STR in comment 16.

(By the way, is there an easy way to get the regression ranges? I just went by date, e.g., for a regression on 7/14 I ran http://hg.mozilla.org/mozilla-central/pushloghtml?startdate=2012-07-13&enddate=2012-07-15. But it seems like there should be a better way.)
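The date-based lookup described here can be sketched in a few lines. This assumes only the `pushloghtml` URL form quoted in the comment, with `startdate` and `enddate` one day either side of the regression date. `pushlog_by_date` is a hypothetical helper name.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

def pushlog_by_date(regressed_on, repo="mozilla-central"):
    """Build a pushlog URL spanning the day before to the day after a regression."""
    start = regressed_on - timedelta(days=1)
    end = regressed_on + timedelta(days=1)
    query = urlencode({"startdate": start.isoformat(), "enddate": end.isoformat()})
    return f"http://hg.mozilla.org/{repo}/pushloghtml?{query}"

# Regression first seen in the 7/14 nightly.
print(pushlog_by_date(date(2012, 7, 14)))
```

For the 7/14 regression this reproduces the URL given in the comment. It is still a date window rather than an exact changeset range, so it can include pushes that were not in either build.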
Assignee: general → nobody
Component: JavaScript Engine → XPConnect
dmandelin, AFAIK there's no easier way to find the regression range via Socorro, and we usually map it to hg that way as well. One could go to FTP to find the changesets used for the builds and make the range somewhat more precise, but that's also more complicated.
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #20)
> dmandelin, AFAIK there's no easier way to find the regression range via
> Socorro, and we usually map it to hg that way as well. One could go to FTP
> to find the changesets used for the builds and make the range somewhat more
> precise, but that's also more complicated.

I suspected as much. Well, good on you guys for getting those regression ranges all the time without much toolage. It seems like we should look for some kind of upgrade, though. I've thought of the FTP thing, and I suppose it would work, but I also wonder if we could set up a data warehouse of information related to releases and builds.
Regression window with str of comment #16

Regression window(m-c)
Good:
http://hg.mozilla.org/mozilla-central/rev/08c5d1085a44
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/16.0 Firefox/16.0 ID:20120713174821
Crashes:
http://hg.mozilla.org/mozilla-central/rev/2a1283c673d5
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/16.0 Firefox/16.0 ID:20120713201421
Pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=08c5d1085a44&tochange=2a1283c673d5


Regression window(m-i)
Good:
http://hg.mozilla.org/integration/mozilla-inbound/rev/bdc169cbbbac
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/16.0 Firefox/16.0 ID:20120713000301
Crashes:
http://hg.mozilla.org/integration/mozilla-inbound/rev/e9203900ce6c
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/16.0 Firefox/16.0 ID:20120713015705
Pushlog:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=bdc169cbbbac&tochange=e9203900ce6c
Regression window(aurora)
Good:
http://hg.mozilla.org/releases/mozilla-aurora/rev/434a32b42d93
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/20120715 Firefox/15.0a2 ID:20120715153020
Crash:
http://hg.mozilla.org/releases/mozilla-aurora/rev/308cccec8e82
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/20120715 Firefox/15.0a2 ID:20120715153619
Pushlog:
http://hg.mozilla.org/releases/mozilla-aurora/pushloghtml?fromchange=434a32b42d93&tochange=308cccec8e82
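Each pushlog link in the regression windows above follows the same pattern: the last good changeset goes in `fromchange` and the first crashing one in `tochange`. A minimal sketch of that construction follows. `pushlog_by_changeset` and the short repository keys are hypothetical names, while the repository paths and changeset IDs are taken from the links in this comment.

```python
# Short labels used in this comment, mapped to their hg.mozilla.org paths.
REPO_PATHS = {
    "m-c": "mozilla-central",
    "m-i": "integration/mozilla-inbound",
    "aurora": "releases/mozilla-aurora",
}

def pushlog_by_changeset(repo, good, bad):
    """Build the pushlog URL between a good and a crashing changeset."""
    return (
        f"http://hg.mozilla.org/{REPO_PATHS[repo]}/pushloghtml"
        f"?fromchange={good}&tochange={bad}"
    )

print(pushlog_by_changeset("m-c", "08c5d1085a44", "2a1283c673d5"))
print(pushlog_by_changeset("m-i", "bdc169cbbbac", "e9203900ce6c"))
print(pushlog_by_changeset("aurora", "434a32b42d93", "308cccec8e82"))
```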
Sounds like me!
Assignee: nobody → bobbyholley+bmo
Looks like we have a definitive regression range on this bug. Please re-add qawanted if something more is needed.
Keywords: qawanted
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
(In reply to Daniel Veditz [:dveditz] from comment #26)
> this ought to be fixed based on bug 775435

Sorry, I need to reopen. That one landed on Aurora on 7/24, but this signature is still high on 16.0a2 even in later builds, including e.g. the 8/1 build.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
billm has a patch over in bug 779849 that I've verified to fix this crash using the STR in comment 16. Let's get that patch landed ASAP.
Depends on: 779849
This has landed over in bug 779849 and will go out in beta 4; the signature seems to be going away in nightly/aurora too. Updating status flags.
(In reply to Lukas Blakk [:lsblakk] from comment #29)
> This has landed over in bug 779849 and will go out in beta 4; the signature
> seems to be going away in nightly/aurora too. Updating status flags.

\o/

Thanks for following this up over here too!
I guess it's a definitive fix and not a backout in branches.

There are no crashes after 17.0a1/20120803.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla17
Status: RESOLVED → VERIFIED
There are no crashes after 16.0a2/20120809.
"There are no crashes after 16.0a2" <--- ****.