Status

defect
--
critical
8 years ago
2 months ago

People

(Reporter: marcia, Assigned: sfink)

Tracking

(4 keywords)

Trunk
x86
All
Points:
---

Firefox Tracking Flags

(firefox17 affected, firefox18+ wontfix, firefox19+ wontfix, firefox20- wontfix, firefox26 wontfix, firefox27 affected, firefox28 affected, firefox29 affected, firefox46 affected, firefox47 affected, firefox48 affected, firefox49 affected, firefox-esr45 affected, thunderbird_esr52 affected, firefox50 affected, firefox51 affected, firefox52 wontfix, firefox53 affected, firefox54 affected)

Details

(Whiteboard: [leave open][tbird topcrash], crash signature)

Attachments

(1 attachment, 1 obsolete attachment)

Small volume crash seen on the trunk. https://crash-stats.mozilla.com/report/list?signature=js::gc::ScanRope

https://crash-stats.mozilla.com/report/index/e0fff0ea-95d4-44f0-bce8-26da12110629

Frame 	Module 	Signature 	Source
0 	xul.dll 	js::gc::ScanRope 	js/src/jsgcmark.cpp:544
1 	nspr4.dll 	PR_ExitMonitor 	nsprpub/pr/src/threads/prmon.c:132
2 	xul.dll 	js::GCMarker::drainMarkStack 	js/src/jsgcmark.cpp:761
3 	xul.dll 	js::MarkWeakReferences 	js/src/jsgc.cpp:1831
4 	xul.dll 	CalcMaxProgressCallback 	uriloader/base/nsDocLoader.cpp:1584
5 	xul.dll 	XPCJSRuntime::TraceJS 	js/src/xpconnect/src/xpcjsruntime.cpp:377
6 	xul.dll 	js::MarkRuntime 	js/src/jsgc.cpp:1875
7 	xul.dll 	MarkAndSweep 	js/src/jsgc.cpp:2324
8 	nspr4.dll 	PR_Unlock 	nsprpub/pr/src/threads/combined/prulock.c:347
9 	xul.dll 	GCCycle 	js/src/jsgc.cpp:2678
10 	xul.dll 	js_GC 	js/src/jsgc.cpp:2743
11 	xul.dll 	XPCCallContext::XPCCallContext 	js/src/xpconnect/src/xpccallcontext.cpp:63
12 	xul.dll 	nsStandardURL::BuildNormalizedSpec 	netwerk/base/src/nsStandardURL.cpp:637
13 	xul.dll 	JS_CompartmentGC 	js/src/jsapi.cpp:2616
14 	xul.dll 	nsXPConnect::Collect 	js/src/xpconnect/src/nsXPConnect.cpp:413
15 	xul.dll 	xul.dll@0x29900f 	
16 	xul.dll 	mozilla::imagelib::RasterImage::DoComposite 	modules/libpr0n/src/RasterImage.cpp:1788
17 	xul.dll 	nsThread::HasPendingEvents 	xpcom/threads/nsThread.cpp:502
18 	xul.dll 	nsBaseAppShell::OnProcessNextEvent 	widget/src/xpwidgets/nsBaseAppShell.cpp:333
19 	xul.dll 	mozilla::imagelib::RasterImage::Notify 	modules/libpr0n/src/RasterImage.cpp:1544
20 	nspr4.dll 	PR_Unlock 	nsprpub/pr/src/threads/combined/prulock.c:347
21 	xul.dll 	xul.dll@0x29900f 	
22 	xul.dll 	nsXPConnect::GarbageCollect 	js/src/xpconnect/src/nsXPConnect.cpp:421
23 	xul.dll 	TimerThread::AddTimerInternal 	xpcom/threads/TimerThread.cpp:451
24 	xul.dll 	nsJSContext::GarbageCollectNow 	dom/base/nsJSEnvironment.cpp:3252
25 	xul.dll 	xul.dll@0x29900f 	
26 	xul.dll 	nsTimerImpl::Fire 	xpcom/threads/nsTimerImpl.cpp:424
27 	xul.dll 	nsTimerEvent::Run 	xpcom/threads/nsTimerImpl.cpp:520
28 	xul.dll 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:617
29 	nspr4.dll 	MD_CURRENT_THREAD 	nsprpub/pr/src/md/windows/w95thred.c:308
30 	xul.dll 	MessageLoop::DoWork 	ipc/chromium/src/base/message_loop.cc:436
31 	xul.dll 	xul.dll@0x1a44df 	
32 	xul.dll 	mozilla::ipc::MessagePump::Run 	ipc/glue/MessagePump.cpp:134
33 	xul.dll 	xul.dll@0x39b64f 	
34 	mozcrt19.dll 	realloc 	obj-firefox/memory/jemalloc/crtsrc/jemalloc.c:5990
35 	xul.dll 	MessageLoop::RunHandler 	ipc/chromium/src/base/message_loop.cc:202
36 	nspr4.dll 	PR_GetThreadPrivate 	nsprpub/pr/src/threads/prtpd.c:232
37 	xul.dll 	MessageLoop::Run 	ipc/chromium/src/base/message_loop.cc:176
38 	xul.dll 	nsObserverList::AddObserver 	xpcom/ds/nsObserverList.cpp:50
39 	xul.dll 	MessageLoop::current 	ipc/chromium/src/base/message_loop.cc:87
40 	xul.dll 	nsBaseAppShell::Run 	widget/src/xpwidgets/nsBaseAppShell.cpp:189
41 	xul.dll 	nsAppStartup::Run 	toolkit/components/startup/nsAppStartup.cpp:222
42 	xul.dll 	XRE_main 	toolkit/xre/nsAppRunner.cpp:3573
43 	kernel32.dll 	CloseHandleImplementation 	
44 	mozcrt19.dll 	arena_dalloc_small 	obj-firefox/memory/jemalloc/crtsrc/jemalloc.c:4045
45 	mozcrt19.dll 	arena_dalloc_small 	obj-firefox/memory/jemalloc/crtsrc/jemalloc.c:4045
46 	kernel32.dll 	RtlFillMemoryStub 	
47 	mozcrt19.dll 	arena_malloc_small 	obj-firefox/memory/jemalloc/crtsrc/jemalloc.c:3675
48 	xul.dll 	nsAString_internal::Assign 	xpcom/string/src/nsTSubstring.cpp:336
49 	xul.dll 	nsAString_internal::Assign 	xpcom/string/src/nsTSubstring.cpp:396
50 	xul.dll 	nsLocalFile::InitWithPath 	xpcom/io/nsLocalFileWin.cpp:887
51 	xul.dll 	nsLocalFile::SetLeafName 	xpcom/io/nsLocalFileWin.cpp:1269
52 	xul.dll 	nsRefPtr<nsPresContext>::~nsRefPtr<nsPresContext> 	obj-firefox/dist/include/nsAutoPtr.h:969
53 	xul.dll 	XRE_CreateAppData 	toolkit/xre/nsAppData.cpp:140
54 	firefox.exe 	wmain 	toolkit/xre/nsWindowsWMain.cpp:107
55 	mozcrt19.dll 	mozcrt19.dll@0x235f 	
56 	ntdll.dll 	LdrpProcessStaticImports 	
57 	ntdll.dll 	LdrpProcessStaticImports 	
58 	ntdll.dll 	LdrLoadDll 	
59 	ntdll.dll 	RtlAllocateMemoryBlockLookaside 	
60 	ntdll.dll 	RtlAllocateMemoryBlockLookaside 	
61 	ntdll.dll 	RtlAllocateMemoryBlockLookaside 	
62 	ntdll.dll 	CsrpConnectToServer
I don't know if it is related, but the change that introduced MarkWeakReferences (seen here in the stack) has been backed out.  Bug 653248.  It should mostly be a refactoring from the perspective of ropes, but who knows.

Comment 3

8 years ago
This seems to happen more often lately, esp. on 7: https://crash-stats.mozilla.com/report/list?signature=js%3A%3Agc%3A%3AScanRope
It's #40 top crasher in 7.0.1 and correlated with Better Facebook!:
75% (151/201) vs.   1% (436/66766) betterfacebook@mattkruse.com
   0% (1/201) vs.   0% (8/66766) 5.901
   1% (3/201) vs.   0% (7/66766) 5.911
   0% (1/201) vs.   0% (2/66766) 5.921
 46% (92/201) vs.   0% (260/66766) 5.931
 27% (54/201) vs.   0% (159/66766) 5.941

It is probably related to bug 686441.
Summary: Firefox 7.0a1 Crash @ js::gc::ScanRope → Firefox 7.0a1 Crash @ js::gc::ScanRope mainly with Better Facebook

Updated

7 years ago
Duplicate of this bug: 752156
It's #20 top browser crasher in 17.0, #11 in 18.0b1 and #38 in 19.0a2.

70% of crashes happen on Windows XP.
Keywords: topcrash
OS: Windows 7 → Windows XP
Summary: Firefox 7.0a1 Crash @ js::gc::ScanRope mainly with Better Facebook → crash @ js::gc::ScanRope
Juan - can you try reproducing?
QA Contact: jbecerra
I've done some dogfooding around this crash, but I couldn't reproduce the issue, using Firefox 18 beta 2, on Windows XP 32-bit.

User Agent: Mozilla/5.0 (Windows NT 5.1; rv:18.0) Gecko/20100101 Firefox/18.0
Build ID: 20121128060531

I've tried the suggestions from the above comments (loading Facebook) and the suggestions from comments in the crash reports: opening 3 Firefox windows, loading Yahoo mail, loading tabs with Flash content.

The latest crash reports can be found here:

https://crash-stats.mozilla.com/report/list?product=Firefox&query_search=signature&query_type=contains&query=js%3A%3Agc%3A%3AScanRope&reason_type=contains&date=11%2F29%2F2012%2015%3A22%3A10&range_value=4&range_unit=weeks&hang_type=any&process_type=any&do_query=1&signature=js%3A%3Agc%3A%3AScanRope&page=1
(In reply to Manuela Muntean from comment #8)
> I've done some dogfooding around this crash, but I couldn't reproduce the
> issue, using Firefox 18 beta 2, on Windows XP 32-bit.
> 
> User Agent: Mozilla/5.0 (Windows NT 5.1; rv:18.0) Gecko/20100101 Firefox/18.0
> Build ID: 20121128060531
> 
> I've tried the suggestions from the above comments (loading Facebook) and
> the suggestions from comments in the crash reports: opening 3 Firefox
> windows, loading Yahoo mail, loading tabs with Flash content.
> 
> The latest crash reports can be found here:
> 
> https://crash-stats.mozilla.com/report/
> list?product=Firefox&query_search=signature&query_type=contains&query=js%3A%3
> Agc%3A%3AScanRope&reason_type=contains&date=11%2F29%2F2012%2015%3A22%3A10&ran
> ge_value=4&range_unit=weeks&hang_type=any&process_type=any&do_query=1&signatu
> re=js%3A%3Agc%3A%3AScanRope&page=1

To confirm, this testing was done with the BetterFacebook add-on, correct?

Adding the add-on author, Matt Kruse. Matt - any ideas on what may have changed between versions 5.921 and 5.931 that may be causing this instability in Firefox?
(actually, this doesn't appear to be correlated to BetterFacebook or any other add-on at this point, apologies Matt)

Naveed - what next steps can we take on the engineering side without explicit STR? Can you investigate any recent changes in FF17 in this area of code?
Assignee: general → nihsanullah

Comment 11

7 years ago
I don't see that correlation with Better Facebook here any more, that was more than a year ago on Firefox 7. The URLs for the signature are pointing mostly to Facebook, though.

Comment 12

7 years ago
This is topcrash #11 on 18.0b2.

(In reply to Alex Keybl [:akeybl] from comment #7)
> Juan - can you try reproducing?

Juan, ping?
Flags: needinfo?(jbecerra)
Without STR this will be hard to work on. The 18.0 callstacks appear to indicate that the pointer rope or rope->rightChild either is bad or eventually goes bad in the loop. We are going to try to instrument ScanRope to get more details from crash-stats.
Assignee: nihsanullah → sphink
After talking it over with billm there isn't anything we can do without the STR in this case. If someone is able to make it happen on a debug build so we can see the ASSERT failure we may have fresh ideas.
I have been trying http://socialfixer.com/ (formerly known as Better Facebook), but no luck so far.
Flags: needinfo?(jbecerra) → needinfo?
(In reply to Naveed Ihsanullah from comment #14)
> After talking it over with billm there isn't anything we can do without the
> STR in this case. If someone is able to make it happen on a debug build so
> we can see the ASSERT failure we may have fresh ideas.

Thanks for discussing this further. So there aren't any changes for b4 that would make the root cause more obvious? Any other suggestions you can make to QA to try to reproduce? Our efforts thus far haven't been successful.
Flags: needinfo?
If Naveed can't think of anything we can do in the Beta 19 timeframe, we'll untrack for all upcoming releases.
Take QA ownership as lead for Fx19. Naveed, please advise any QA leads.
QA Contact: jbecerra → anthony.s.hughes

Comment 19

7 years ago
Naveed, is there any speculative or instrumentation thing we can do in the 19 beta timeframe?
Flags: needinfo?(nihsanullah)
Assignee

Comment 20

7 years ago
I hope to get something in. I'm skeptical that dumping out the specific rope fields will tell us much (heck, they're probably the crash address anyway). But it still seems like there ought to be some form of instrumentation useful for this...
(In reply to Steve Fink [:sfink] from comment #20)
> I hope to get something in. I'm skeptical that dumping out the specific rope
> fields will tell us much (heck, they're probably the crash address anyway).
> But it still seems like there ought to be some form of instrumentation
> useful for this...

Do you have an ETA for your work?

Updated

7 years ago
Crash Signature: [@ js::gc::ScanRope ] → [@ js::gc::ScanRope ] [@ ScanRope]
This work hasn't landed yet for FF21, so there's no chance of getting this into FF19b4 (or FF19 for that matter given risk mitigation).
Assignee

Comment 24

7 years ago
(In reply to Alex Keybl [:akeybl] from comment #21)
> (In reply to Steve Fink [:sfink] from comment #20)
> > I hope to get something in. I'm skeptical that dumping out the specific rope
> > fields will tell us much (heck, they're probably the crash address anyway).
> > But it still seems like there ought to be some form of instrumentation
> > useful for this...
> 
> Do you have an ETA for your work?

Unfortunately, no. I have the beginnings of some extensive breakpad instrumentation, but (1) it's a major change, would require extensive review, and I would not be surprised if it wasn't accepted at all; and (2) there's no guarantee it would actually turn up anything useful.

Backing off from that, there just isn't enough information in this bug to do much. I can turn on some lightweight assertions in this code to make it crash slightly sooner, but the problem here is that heap memory is getting corrupted, and when GC scans it, it either is or is interpreted as a rope, and it's an invalid rope. We have no way to see what pointed at this rope (well, it wasn't any of the direct recursive calls, since we know the rope was pulled off of the mark stack.)

Somewhat unrelated: I looked at one of the crash dumps in windbg, and noticed why so many of the crash addresses have the mysterious fc0c4 address in them:

63b729a4 8d8c81c4c00f00  lea     ecx,[ecx+eax*4+0FC0C4h]

It's part of markIfUnmarked(). So fc0c4 is just a fancy form of zero, and doesn't really tell us much. The crash I looked at (on fc0c4) is at the instruction that reads the mark bit, so what's happening is that a random (non GCthing) pointer is having its "mark bit" checked, and the computation to find that mark bit gives the invalid address. This will be caught earlier by the patch I will provide, though it would crash either way.
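To illustrate why fc0c4 is "a fancy form of zero", here is a minimal sketch of the mark-bit address computation that the lea above appears to perform. Only the 0FC0C4h displacement comes from the disassembly; the chunk size, cell size, and the MarkWordAddress and CheckNullLikePointers helpers are assumptions made for illustration, not the actual SpiderMonkey definitions.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout constants, for illustration only. Just kBitmapOffset is
// taken from the disassembly (the 0FC0C4h displacement in the lea).
constexpr uint32_t kChunkSize = 1u << 20;       // assumption: 1 MiB GC chunks
constexpr uint32_t kChunkMask = kChunkSize - 1;
constexpr uint32_t kCellSize = 8;               // assumption: one mark bit per 8 bytes
constexpr uint32_t kBitsPerWord = 32;           // 32-bit x86 bitmap words
constexpr uint32_t kBitmapOffset = 0xFC0C4;     // displacement seen in the lea

// Hypothetical helper: address of the bitmap word that would hold the mark
// bit for an arbitrary 32-bit pointer value.
inline uint32_t MarkWordAddress(uint32_t thing) {
    uint32_t chunkBase = thing & ~kChunkMask;                                // "ecx"
    uint32_t wordIndex = ((thing & kChunkMask) / kCellSize) / kBitsPerWord;  // "eax"
    return chunkBase + wordIndex * 4 + kBitmapOffset;
}

// Null-like pointers (null plus a small field offset) all map to the bare
// displacement, which is why that value keeps appearing as a crash address.
inline void CheckNullLikePointers() {
    assert(MarkWordAddress(0x0) == kBitmapOffset);
    assert(MarkWordAddress(0x10) == kBitmapOffset);
}
```

Under these assumptions, any pointer value that is not a real GC thing still produces some bitmap address, and reading the mark bit at that address is where the crash happens.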
Flags: needinfo?(nihsanullah)
Assignee

Comment 25

7 years ago
Attachment #708375 - Flags: review?(wmccloskey)
Comment on attachment 708375 [details] [diff] [review]
Detect rope corruption earlier in opt builds

Review of attachment 708375 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/Makefile.in
@@ +244,5 @@
>  endif
>  
> +ifeq (,$(filter release esr,$(MOZ_UPDATE_CHANNEL)))
> +DEFINES += -DBUG_668583_INSTRUMENTATION
> +endif

We have JS_CRASH_DIAGNOSTICS for this. It's enabled on nightly and Aurora but nothing else.

::: js/src/gc/Marking.cpp
@@ +901,5 @@
>  ScanRope(GCMarker *gcmarker, JSRope *rope)
>  {
>      ptrdiff_t savedPos = gcmarker->stack.position();
> +#ifdef BUG_668583_INSTRUMENTATION
> +    MOZ_ALWAYS_TRUE(GetGCThingTraceKind(rope) == JSTRACE_STRING);

I don't think this does what you want. In opt builds it just runs the expression and ignores the result.

We used to have JS_OPT_ASSERT, but someone took it out. I think it would be reasonable to add it back. Basically it would be like JS_ASSERT, but it would be enabled for all JS_CRASH_DIAGNOSTICS builds.
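A rough sketch of the difference between the two kinds of macro, as I understand it. All names here (SKETCH_ALWAYS_TRUE, SKETCH_OPT_ASSERT, SKETCH_DEBUG, SKETCH_CRASH_DIAGNOSTICS, SketchReportFailure, SketchIsString) are hypothetical stand-ins and not the real definitions. The failure handler only counts failures so that the behavior can be observed; a real diagnostic assertion would crash on the spot.

```cpp
#include <cassert>

// Number of failed checks seen. A real build would crash immediately instead
// of counting; the counter exists only so the sketch can be exercised.
static int gSketchFailures = 0;
inline void SketchReportFailure(const char* /*expr*/) { ++gSketchFailures; }

// Hypothetical switches: model an optimized build (SKETCH_DEBUG undefined)
// that has crash diagnostics turned on.
#define SKETCH_CRASH_DIAGNOSTICS 1

// "Always true" style: the expression is always evaluated, but the result is
// only checked in debug builds. In opt builds it is silently discarded.
#ifdef SKETCH_DEBUG
#define SKETCH_ALWAYS_TRUE(expr) do { if (!(expr)) SketchReportFailure(#expr); } while (0)
#else
#define SKETCH_ALWAYS_TRUE(expr) do { (void)(expr); } while (0)
#endif

// "Opt assert" style: checked in debug builds and in any build that has
// crash diagnostics enabled, compiled out everywhere else.
#if defined(SKETCH_DEBUG) || defined(SKETCH_CRASH_DIAGNOSTICS)
#define SKETCH_OPT_ASSERT(expr) do { if (!(expr)) SketchReportFailure(#expr); } while (0)
#else
#define SKETCH_OPT_ASSERT(expr) do { } while (0)
#endif

// Stand-in for a trace-kind check; 1 plays the role of "string" here.
inline bool SketchIsString(int traceKind) { return traceKind == 1; }
```

With these definitions, SKETCH_ALWAYS_TRUE(SketchIsString(2)) evaluates the call and reports nothing, while SKETCH_OPT_ASSERT(SketchIsString(2)) reports a failure.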
Attachment #708375 - Flags: review?(wmccloskey)
Dropping QAWANTED as QA has run out of leads. Please re-add if there are new leads we can follow.
Keywords: qawanted
Assignee

Comment 28

6 years ago
Uh, yeah, that was foolish of me.
Attachment #712601 - Flags: review?(wmccloskey)
Assignee

Updated

6 years ago
Attachment #708375 - Attachment is obsolete: true
Attachment #712601 - Flags: review?(wmccloskey) → review+
Assignee

Updated

6 years ago
Attachment #712601 - Flags: checkin+
Assignee

Updated

6 years ago
Whiteboard: [leave open]
Steve: what's the status here? Does your patch for corruption detection need to uplift to Beta to get a wider set of results?  Has the landing on nightly helped the investigation here?
Flags: needinfo?(sphink)
Assignee

Comment 32

6 years ago
Comment on attachment 712601 [details] [diff] [review]
Detect rope corruption earlier in opt builds

(In reply to lsblakk):

Either I'm not searching properly, or I haven't gotten any crashes in a version with the diagnostics, even with it in Aurora.

[Approval Request Comment]
Bug caused by (feature/regressing bug #): unknown
User impact if declined: none
Testing completed (on m-c, etc.): on aurora
Risk to taking this patch (and alternatives if risky): hardly any
String or UUID changes made by this patch: none
Attachment #712601 - Flags: approval-mozilla-beta?
Flags: needinfo?(sphink)
(In reply to Steve Fink [:sfink] from comment #32)
> Either I'm not searching properly, or I haven't gotten any crashes in a
> version with the diagnostics, even with it in Aurora.
The signature has morphed since Firefox 20. See https://crash-stats.mozilla.com/report/list?signature=ScanRope
Comment on attachment 712601 [details] [diff] [review]
Detect rope corruption earlier in opt builds

Approving for uplift in support of the investigation, please land asap and take a look into the morphed crash signature to see if your diagnostics are showing up in the new signature's info.
Attachment #712601 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Assignee

Comment 35

6 years ago
Comment on attachment 712601 [details] [diff] [review]
Detect rope corruption earlier in opt builds

Ah! Yes, it is showing up with the new signature, and it is not what I expected. Which is good.

Cancelling the request to uplift to beta, because (1) I already have the data I wanted, and (2) I think the patch as it stands would automatically disable itself on beta.

I need to think about the next step. It may be to get additional information into the minidumps, as it now appears that it may have more to do with ropes than I thought. I was guessing that the pointer it was working on wasn't a rope at all, just some mangled value being misinterpreted as one, but that appears to be wrong. Sadly, we have no way to dump rope (or string) contents into the minidump; it's too conditional. I started working on a way to do that (for this bug), but realized it's a pretty substantial project and not useful in the near term. Still, seeing the pointers within the rope could be enlightening.
Attachment #712601 - Flags: approval-mozilla-beta+
Given the number of cycles this has been left unfixed, we'll accept a low risk uplift once found but will no longer track for a specific release.
Just had bp-14e51d61-a985-43d7-8e7e-793c22130426 (don't recall ever having this crash sig before)

bp-12fe6325-843f-437d-92a3-64efa2130426 shortly before that - Bug 864033 - crash in js::ArgumentsObject::trace @ MarkInternal

Comment 38

6 years ago
(In reply to Steve Fink [:sfink] from comment #35)
> I need to think about the next step.

sfink, ping?
Flags: needinfo?(sphink)

Comment 39

6 years ago
Crash encountered by me: bp-4959dfdc-4c23-4991-aaed-fc1c52131016
Related to bug 719114?
Assignee

Comment 40

6 years ago
Ok, I'm really not going to be able to do anything with this in the foreseeable future. Naveed, is there anyone else who might be able to take a look? There's not a whole lot that's actionable here, but maybe somebody can come up with some new ideas. I just won't be able to spend any time on this anytime soon, and I don't want to be holding it up.
Flags: needinfo?(sphink) → needinfo?(nihsanullah)
Keywords: topcrash → topcrash-win

Updated

6 years ago
Wontfix for Firefox 26 as it's EOL with tomorrow's Firefox 27 release.

Comment 44

5 years ago
I have a WinDBG core dump of this crash on Firefox 29.0.1.  If someone at Mozilla wants to take a look at it email me off list.  Ask for file fxdump_519_615.dmp.
Updated signature with a recent marking change. Levels appear unchanged.
Crash Signature: [@ js::gc::ScanRope ] [@ ScanRope] → [@ js::GCMarker::eagerlyMarkChildren(JSRope*)]

Updated

4 years ago
Crash Signature: [@ js::GCMarker::eagerlyMarkChildren(JSRope*)] → [@ js::GCMarker::eagerlyMarkChildren(JSRope*)] [@ js::GCMarker::eagerlyMarkChildren]
It's rank 13 on top-crashers for beta.
Keywords: topcrash
OS: Windows XP → All
and rank 20 for release.
Crash volume for signature 'js::GCMarker::eagerlyMarkChildren':
 - nightly (version 50): 112 crashes from 2016-06-06.
 - esr     (version 45): 1449 crashes from 2016-04-07.

Crash volume on the last 7 days:
             2016-07-21   2016-07-20   2016-07-19   2016-07-18   2016-07-17   2016-07-16   2016-07-15
 - nightly            1            0            4            4            8            4            3
 - esr               33           33           39           40           10            6           35

Affected platforms: Windows, Mac OS X, Linux

Comment 49

2 years ago
Firefox 51.0b9 Crash Report [@ js::GCMarker::eagerlyMarkChildren ]

https://crash-stats.mozilla.com/report/index/eb79e64c-c050-40b0-b728-0c0272170130

https://crash-stats.mozilla.com/report/index/67a12f1a-d578-4e7e-bf19-281752170131

Just got two crashes, one right after the other, on this bug. Still crashing in FF 51.x.
Crash volume for signature 'js::GCMarker::eagerlyMarkChildren':
 - nightly (version 54): 35 crashes from 2017-01-23.
 - aurora  (version 53): 15 crashes from 2017-01-23.
 - beta    (version 52): 934 crashes from 2017-01-23.
 - release (version 51): 2808 crashes from 2017-01-16.
 - esr     (version 45): 7396 crashes from 2016-08-03.

Crash volume on the last weeks (Week N is from 01-30 to 02-05):
            W. N-1  W. N-2  W. N-3  W. N-4  W. N-5  W. N-6  W. N-7
 - nightly      29
 - aurora       11
 - beta        537
 - release    1342       1
 - esr         422     433     383     337     256     308     389

Affected platforms: Windows, Mac OS X, Linux

Crash rank on the last 7 days:
           Browser   Content   Plugin
 - nightly #45       #23
 - aurora  #54       #22
 - beta    #11       #9
 - release #8        #5
 - esr     #45       #4
Too late for firefox 52, mass-wontfix.

Comment 52

2 years ago
Still happening rather often, Firefox 54.0.1 (x64)

bp-b74ccc00-ffea-4fa9-bf6d-8a16f1170725
#45 crash for both Thunderbird 52.2.1 and 55.0b2
Whiteboard: [leave open] → [leave open][tbird topcrash]
Flags: needinfo?(nihsanullah)

Comment 54

2 years ago
Hi folks, I know you all say that versions going way back have been affected by this and similar bugs (Mark Children, etc.), but I regressed all the way back to 52.02 and those crashes ALMOST NEVER OCCUR. This seems to be the last version that is not crashy as all get out on my box anyway. Once I start moving up even a bit past this version, FF gets crashy as Hell and in particular I start seeing a lot of these Rope/Mark Children, etc. (there are a number of different crashes that all seem to be part of the same mess) crashes. Way back here at 52.02, the only thing I get is crashes in a tab, which is annoying, but I never get full crashes of the entire application.

AFAICT, something changed after 52.02, and whatever changed is responsible for these Rope/Mark Children, etc. crashes, which I am guessing all might be working back around to some basic common destination, my WAG of which is some sort of memory corruption. But I am not a programmer, so that's just a WAG from an amateur.
Assignee

Comment 55

2 years ago
(In reply to Robert Lindsay from comment #54)
> Hi folks, I know you all say that versions going way back have been affected
> by this and similar bugs (Mark Children, etc.), but I regressed all the way
> back to 52.02 and those crashes ALMOST NEVER OCCUR. This seems to be the
> last version that is not crashy as all get out on my box anyway. Once I
> start moving up even a bit past this version, FF gets crashy as Hell and in
> particular I start seeing a lot of these Rope/Mark Children, etc. (there are
> a number of different crashes that all seem to be part of the same mess)
> crashes. Way back here at 52.02, the only thing I get is crashes in a tab,
> which is annoying, but I never get full crashes of the entire application.
> 
> AFAICT, something changed after 52.02, and whatever changed is responsible
> for these Rope/Mark Children, etc. crashes, which I am guessing all might be
> working back around to some basic common destination, my WAG of which is
> some sort of memory corruption. But I am not a programmer, so that's just a
> WAG from an amateur.

It certainly is memory corruption. The hard question is *why* the memory is getting corrupted. It is intriguing that you seem to be able to get this fairly often, enough to notice a difference in crash rates across versions. (We don't have a handle on any way to reproduce this at all reliably.)

One reason for memory corruption, one that I don't like to use, is bad RAM. But it is more common than most people realize and is certainly responsible for some percentage of the crashes here. What OS are you running? (eg if Windows, what version?) I'd like to know what the results of a RAM checker are on your system, to eliminate that from consideration if nothing else.

My guess is that there is a bug somewhere that eventually shows up as a crash here. If you can figure out any sort of pattern that causes this bug to be more likely to happen, it could provide a helpful clue. Already, the observation that 53 seems to be more crashy for you than 52 is very interesting, though it doesn't seem to match the aggregate statistics as best I can tell.

The tricky thing with this bug is that anything that mangles a JS string will probably end up crashing here, since this is where we scan through all strings in existence. If something were to use a mangled string before the next GC, we would see a crash there. But if, for example, this is a stray pointer write, then odds are that the corruption will not be noticed until we attempt a GC.
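A toy model of that last point, in case it helps. This is only a sketch and not the real marking code: the Str, Heap, and SketchScanRope names are hypothetical, and the "heap" is a single vector of cells. A stray write can replace a rope's child pointer at any time, and nothing notices until the scan pops that pointer off the mark stack. The only difference from the real crash is that the sketch validates the pointer and returns false instead of dereferencing it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class Kind : uint32_t { Linear = 1, Rope = 2 };

// Hypothetical string cell: a rope has two children, a linear string has none.
struct Str {
    Kind kind = Kind::Linear;
    Str* left = nullptr;
    Str* right = nullptr;
    bool marked = false;
};

// Toy heap: every valid cell lives inside this one vector.
struct Heap {
    std::vector<Str> cells;
    explicit Heap(size_t n) : cells(n) {}

    // True only if p points exactly at one of the cells.
    bool contains(const Str* p) const {
        auto addr = reinterpret_cast<uintptr_t>(p);
        auto base = reinterpret_cast<uintptr_t>(cells.data());
        auto end = base + cells.size() * sizeof(Str);
        return addr >= base && addr < end && (addr - base) % sizeof(Str) == 0;
    }
};

// Iterative scan with an explicit mark stack. Returns false as soon as it
// pops something that is not a plausible cell, which is the point where the
// real code would crash.
inline bool SketchScanRope(Heap& heap, Str* rope) {
    std::vector<Str*> markStack;
    markStack.push_back(rope);
    while (!markStack.empty()) {
        Str* s = markStack.back();
        markStack.pop_back();
        if (!heap.contains(s)) return false;  // corrupted pointer detected
        if (s->kind != Kind::Linear && s->kind != Kind::Rope) return false;
        if (s->marked) continue;
        s->marked = true;
        if (s->kind == Kind::Rope) {
            markStack.push_back(s->right);
            markStack.push_back(s->left);
        }
    }
    return true;
}
```

Note that the scan has no way of telling who overwrote the child pointer or when, which is the same problem we have with the crash reports.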

Comment 56

2 years ago
Hi, it is Windows 8. Can you link me to a memory checker, please? Also, are the various Mark Children type bugs tied in with this Scan Rope stuff? Because all these bugs are crashing in js::gc::, correct? Not that I know what that means.

For some reason, I keep thinking that the Mark Children stuff and this Scan Rope stuff are connected. I reach that conclusion on an intuitive basis. Is there a logical reason that could explain how the various bugs could be connected?
Assignee

Comment 57

2 years ago
It looks like this ought to do it: http://support.rm.com/TechnicalArticle.asp?cref=TEC3222505

If this is generalized memory corruption (either bad RAM or some random bug that writes out of bounds), then it would sometimes show up as this bug (eagerlyMarkChildren on a JSRope*) and sometimes as some other mark children calls (and sometimes as other things, though most of them will still be *something* connected to the GC.) So yes, they would be related.

If there is a bug specific to JSRope handling, then it would be unlikely to show up as other signatures.

One clarification: this bug originally showed up as js::gc::ScanRope, then some change made it show up as plain ScanRope, and then further changes moved it to js::GCMarker::eagerlyMarkChildren(JSRope*). But it's really the same thing; the code just changed names a bit.

Amazingly, still a topcrash for Thunderbird

copied from bug 1441002
"Two weeks ago, after March 26, the Firefox crash rate went up ~30%. Currently ~20% higher compared to pre-March 26"
