Closed
Bug 1405980
Opened 7 years ago
Closed 7 years ago
Intermittent GECKO(4316) | Assertion failure: zone->isGCScheduled(), at z:/build/build/src/js/src/jsgc.cpp:7046
Categories
(Core :: JavaScript: GC, defect, P5)
Core
JavaScript: GC
Tracking
()
RESOLVED
FIXED
mozilla58
Tracking | Status | |
---|---|---|
firefox58 | --- | fixed |
People
(Reporter: intermittent-bug-filer, Assigned: jonco)
References
Details
(Keywords: intermittent-failure)
Attachments
(1 file, 1 obsolete file)
2.56 KB,
patch
|
sfink
:
review+
|
Details | Diff | Splinter Review |
Filed by: archaeopteryx [at] coole-files.de
https://treeherder.mozilla.org/logviewer.html#?job_id=135091654&repo=autoland
https://queue.taskcluster.net/v1/task/Y2PMoYKrQPWI2l34pLdg1w/runs/0/artifacts/public/logs/live_backing.log
06:13:01 INFO - GECKO(4316) | Assertion failure: zone->isGCScheduled(), at z:/build/build/src/js/src/jsgc.cpp:7046
06:13:01 INFO - GECKO(4316) | [GPU 4128, Main Thread] WARNING: Shutting down GPU process early due to a crash!: file z:/build/build/src/gfx/ipc/GPUParent.cpp, line 407
06:13:01 INFO - GECKO(4316) | Hit MOZ_CRASH(Aborting on channel error.) at z:/build/build/src/ipc/glue/MessageChannel.cpp:2543
06:13:01 INFO - GECKO(4316) | #01: mozilla::ipc::MessageChannel::OnChannelErrorFromLink() [ipc/glue/MessageChannel.cpp:2543]
06:13:01 INFO - GECKO(4316) | #02: base::MessagePumpForIO::WaitForIOCompletion(unsigned long,base::MessagePumpForIO::IOHandler *) [ipc/chromium/src/base/message_pump_win.cc:495]
06:13:01 INFO - GECKO(4316) | #03: base::MessagePumpForIO::WaitForWork() [ipc/chromium/src/base/message_pump_win.cc:472]
06:13:01 INFO - GECKO(4316) | #04: base::MessagePumpForIO::DoRunLoop() [ipc/chromium/src/base/message_pump_win.cc:434]
06:13:01 INFO - GECKO(4316) | #05: base::MessagePumpWin::Run(base::MessagePump::Delegate *) [ipc/chromium/src/base/message_pump_win.h:80]
06:13:01 INFO - GECKO(4316) | #06: MessageLoop::RunInternal() [ipc/chromium/src/base/message_loop.cc:326]
06:13:01 INFO - GECKO(4316) | #07: MessageLoop::RunHandler() [ipc/chromium/src/base/message_loop.cc:320]
06:13:01 INFO - GECKO(4316) | #08: MessageLoop::Run() [ipc/chromium/src/base/message_loop.cc:300]
06:13:01 INFO - GECKO(4316) | #09: base::Thread::ThreadMain() [ipc/chromium/src/base/thread.cc:184]
06:13:01 INFO - GECKO(4316) | #10: `anonymous namespace'::ThreadFunc [ipc/chromium/src/base/platform_thread_win.cc:29]
06:13:01 INFO - GECKO(4316) | #11: kernel32.dll + 0x53c45
06:13:01 INFO - GECKO(4316) | #12: patched_BaseThreadInitThunk [mozglue/build/WindowsDllBlocklist.cpp:824]
06:13:01 INFO - GECKO(4316) | #13: ntdll.dll + 0x637f5
06:13:01 INFO - GECKO(4316) | #14: ntdll.dll + 0x637c8
06:13:01 INFO - TEST-INFO | Main app process: exit 1
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 3•7 years ago
|
||
Looks like this was triggered by the changes in bug 1405274. The assertion that's failing is the following in GCRuntime::budgetIncrementalGC:
if (zone->isTooMuchMalloc())
MOZ_ASSERT(zone->isGCScheduled());
My current thinking is that this is happening because that bug stopped clearing the malloc bytes count for all zones on every GC, regardless of which zones were collected (although I just backed that patch out for telemetry regressions).
I think it's still possible for this assertion to fail though, since there are various situations were we don't collect the atoms zone (e.g. use of AutoKeepAtoms) even though isTooMuchMalloc() may be true.
Blocks: 1405274
Assignee | ||
Comment 4•7 years ago
|
||
Patch to relax scheduling assertions for the atoms zone.
Assignee: nobody → jcoppeard
Attachment #8917031 -
Flags: review?(sphink)
Comment 5•7 years ago
|
||
Comment on attachment 8917031 [details] [diff] [review]
bug1405980-malloc-zone-assertion
Review of attachment 8917031 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/jsgc.cpp
@@ +7023,5 @@
> if (!zone->canCollect())
> continue;
>
> if (zone->usage.gcBytes() >= zone->threshold.gcTriggerBytes()) {
> + MOZ_ASSERT_IF(!zone->isAtomsZone(), zone->isGCScheduled());
I think this needs a comment. If I am understanding correctly, the only reason why the atoms zone could be different here is if hasHelperThreadZones() was true and is now false. Is that correct? Because otherwise, zone->canCollect() should have already skipped the zone.
More generally, this assertion seems to depend on canCollect() staying "no more true" than it was when the GC was triggered. (false -> true is the problematic transition.) There isn't any other way that could happen, is there? eg a zone getting transitioned between zone groups, or createdForHelperThread() becoming false.
Attachment #8917031 -
Flags: review?(sphink) → review+
Assignee | ||
Comment 6•7 years ago
|
||
(In reply to Steve Fink [:sfink] [:s:] from comment #5)
> If I am understanding correctly, the only
> reason why the atoms zone could be different here is if
> hasHelperThreadZones() was true and is now false. Is that correct? Because
> otherwise, zone->canCollect() should have already skipped the zone.
That is correct... and looking at the backtrace for the assertion failure shows that this isn't what is happening here, as it's failing in the first slice of a GC. So my patch is not the answer.
Assignee | ||
Updated•7 years ago
|
Attachment #8917031 -
Attachment is obsolete: true
Assignee | ||
Comment 7•7 years ago
|
||
I don't know what's going on here, so this patch is to print some debugging information on failure. Hopefully this will shed some light.
Attachment #8917418 -
Flags: review?(sphink)
Updated•7 years ago
|
Attachment #8917418 -
Flags: review?(sphink) → review+
Pushed by jcoppeard@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/8cdff44958c8
Dump debugging information if zones are not scheduled as expected r=sfink
![]() |
||
Comment 9•7 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 7 years ago
status-firefox58:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 12•7 years ago
|
||
(In reply to Marco Castelluccio [:marco] from comment #11)
There were more changes to the malloc triggers done in bug 1408375. I expect that fixed it.
Flags: needinfo?(jcoppeard)
You need to log in
before you can comment on or make changes to this bug.
Description
•