Intermittent GECKO(1136) | Assertion failure: js::CheckGrayMarkingState(mJSRuntime), at z:/build/build/src/xpcom/base/CycleCollectedJSRuntime.cpp:1207

NEW
Assigned to

Status

()

defect
P5
normal
2 years ago
2 days ago

People

(Reporter: intermittent-bug-filer, Assigned: jonco)

Tracking

({intermittent-failure, leave-open})

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [stockwell unknown])

Attachments

(5 attachments)

Filed by: archaeopteryx [at] coole-files.de

https://treeherder.mozilla.org/logviewer.html#?job_id=131009814&repo=autoland

https://queue.taskcluster.net/v1/task/Ql1gEelxQD2EP70evR_W-A/runs/0/artifacts/public/logs/live_backing.log

...
11:46:15     INFO -  GECKO(1136) |   from Object 000001DD62354880 fun_environment edge
11:46:15     INFO -  GECKO(1136) |   from ObjectGroup 000001DD63E19DF0 group_proto edge
11:46:15     INFO -  GECKO(1136) |   from Object 000001DD63483640 group edge
11:46:15     INFO -  GECKO(1136) |   from Object 000001DD685F2520 object slot edge
11:46:15     INFO -  GECKO(1136) |   from Object 000001DD6888D8C0 object slot edge
11:46:15     INFO -  GECKO(1136) |   from ObjectGroup 000001DD5E351070 group_proto edge
11:46:15     INFO -  GECKO(1136) |   from Object 000001DD69394CA0 group edge
11:46:15     INFO -  GECKO(1136) |   from root Preserved wrapper
11:46:15     INFO -  GECKO(1136) | Assertion failure: js::CheckGrayMarkingState(mJSRuntime), at z:/build/build/src/xpcom/base/CycleCollectedJSRuntime.cpp:1207
11:46:29     INFO -  GECKO(1136) | #01: nsCycleCollector::BeginCollection(ccType,nsICycleCollectorListener *) [xpcom/base/nsCycleCollector.cpp:3909]
11:46:29     INFO -  GECKO(1136) | #02: nsCycleCollector::Collect(ccType,js::SliceBudget &,nsICycleCollectorListener *,bool) [xpcom/base/nsCycleCollector.cpp:3761]
11:46:29     INFO -  GECKO(1136) | #03: nsCycleCollector_collectSlice(js::SliceBudget &,bool) [xpcom/base/nsCycleCollector.cpp:4315]
11:46:29     INFO -  GECKO(1136) | #04: nsJSContext::RunCycleCollectorSlice(mozilla::TimeStamp) [dom/base/nsJSEnvironment.cpp:1542]
11:46:29     INFO -  GECKO(1136) | #05: CCRunnerFired [dom/base/nsJSEnvironment.cpp:1936]
11:46:29     INFO -  GECKO(1136) | #06: std::_Func_impl<bool (*)(mozilla::TimeStamp),std::allocator<int>,bool,mozilla::TimeStamp>::_Do_call(mozilla::TimeStamp &&) [vs2015u3/VC/include/functional:214]
11:46:29     INFO -  GECKO(1136) | #07: mozilla::IdleTaskRunner::Run() [xpcom/threads/IdleTaskRunner.cpp:62]
11:46:29     INFO -  GECKO(1136) | #08: mozilla::TimedOut [xpcom/threads/IdleTaskRunner.cpp:85]
11:46:29     INFO -  GECKO(1136) | #09: nsTimerImpl::Fire(int) [xpcom/threads/nsTimerImpl.cpp:515]
11:46:29     INFO -  GECKO(1136) | #10: nsTimerEvent::Run() [xpcom/threads/TimerThread.cpp:289]
11:46:29     INFO -  GECKO(1136) | #11: nsThread::ProcessNextEvent(bool,bool *) [xpcom/threads/nsThread.cpp:1040]
11:46:29     INFO -  GECKO(1136) | #12: NS_ProcessNextEvent(nsIThread *,bool) [xpcom/threads/nsThreadUtils.cpp:521]
11:46:29     INFO -  GECKO(1136) | #13: mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate *) [ipc/glue/MessagePump.cpp:97]
11:46:29     INFO -  GECKO(1136) | #14: MessageLoop::RunHandler() [ipc/chromium/src/base/message_loop.cc:320]
11:46:29     INFO -  GECKO(1136) | #15: MessageLoop::Run() [ipc/chromium/src/base/message_loop.cc:300]
11:46:29     INFO -  GECKO(1136) | #16: nsBaseAppShell::Run() [widget/nsBaseAppShell.cpp:160]
11:46:29     INFO -  GECKO(1136) | #17: nsAppShell::Run() [widget/windows/nsAppShell.cpp:232]
11:46:29     INFO -  GECKO(1136) | #18: nsAppStartup::Run() [toolkit/components/startup/nsAppStartup.cpp:288]
11:46:29     INFO -  GECKO(1136) | #19: XREMain::XRE_mainRun() [toolkit/xre/nsAppRunner.cpp:4697]
11:46:29     INFO -  GECKO(1136) | #20: XREMain::XRE_main(int,char * * const,mozilla::BootstrapConfig const &) [toolkit/xre/nsAppRunner.cpp:4861]
11:46:29     INFO -  GECKO(1136) | #21: XRE_main(int,char * * const,mozilla::BootstrapConfig const &) [toolkit/xre/nsAppRunner.cpp:4956]
11:46:29     INFO -  GECKO(1136) | #22: do_main [browser/app/nsBrowserApp.cpp:237]
11:46:29     INFO -  GECKO(1136) | #23: NS_internal_main(int,char * *,char * *) [browser/app/nsBrowserApp.cpp:311]
11:46:29     INFO -  GECKO(1136) | #24: wmain [toolkit/xre/nsWindowsWMain.cpp:118]
11:46:29     INFO -  GECKO(1136) | #25: __scrt_common_main_seh [f:/dd/vctools/crt/vcstartup/src/startup/exe_common.inl:253]
11:46:29     INFO -  GECKO(1136) | #26: KERNEL32.DLL + 0x12774
11:46:29     INFO -  GECKO(1136) | #27: ntdll.dll + 0x70d61
11:46:29     INFO -  GECKO(1136) | Hit MOZ_CRASH(Aborting on channel error.) at z:/build/build/src/ipc/glue/MessageChannel.cpp:2544
11:46:29     INFO -  GECKO(1136) | ### ERROR: CreateThread: Access is denied.
This is a gray marking issue and the logs indicate that it's for a ProxyObject's private slot.  Here's a patch to add assertions when setting this slot.  Not a fix.
Assignee: nobody → jcoppeard
Attachment #8909886 - Flags: review?(sphink)
Here's a patch to output object class names when a gray marking problem is detected, which should make this easier to track down.  Not a fix.
Attachment #8909889 - Flags: review?(sphink)
Keywords: leave-open
Attachment #8909886 - Flags: review?(sphink) → review+
Attachment #8909889 - Flags: review?(sphink) → review+
Pushed by jcoppeard@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/fbdfaa4bae2d
Add gray marking assertions when setting proxy target r=sfink
https://hg.mozilla.org/integration/mozilla-inbound/rev/a079a0a2971d
Dump more info on gray marking failure r=sfink
Recent crashes are like:

[task 2017-09-25T18:26:11.827Z] 18:26:11     INFO - GECKO(1096) | Found black to gray edge to Object XULElement 0xe53dffa0
[task 2017-09-25T18:26:11.831Z] 18:26:11     INFO - GECKO(1096) |   from Object Proxy 0xd1869c70 proxy target edge
[task 2017-09-25T18:26:11.835Z] 18:26:11     INFO - GECKO(1096) |   from Object Object 0xe3133a60 object slot edge
[task 2017-09-25T18:26:11.843Z] 18:26:11     INFO - GECKO(1096) |   from Object Proxy 0xe53ded30 proxy target edge
[task 2017-09-25T18:26:11.845Z] 18:26:11     INFO - GECKO(1096) |   from Object XULElement 0xe53dffa0 object slot edge
[task 2017-09-25T18:26:11.847Z] 18:26:11     INFO - GECKO(1096) |   from root Preserved wrapper
I still can't figure out what's going on here.

Here's a patch to dump even more information when we detect things have gone wrong in debug builds.
Attachment #8924506 - Flags: review?(pbone)
Comment on attachment 8924506 [details] [diff] [review]
bug1399866-more-gray-marking-info

Review of attachment 8924506 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good to me. r+
Attachment #8924506 - Flags: review?(pbone) → review+
Pushed by jcoppeard@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/203ea4c2406f
Dump even more info on gray marking failure r=pbone
Duplicate of this bug: 1359049
Duplicate of this bug: 1406105
Duplicate of this bug: 1388185
Duplicate of this bug: 1421982
Duplicate of this bug: 1425223
We're seeing a black proxy object with a gray target.  

Add an assert that we never create a proxy with a gray target.  Currently we only assert we never create a black to gray edge, but a newly created proxy will not be black outside of incremental GC.
Attachment #8937160 - Flags: review?(sphink)
Attachment #8937160 - Flags: review?(sphink) → review+
Pushed by jcoppeard@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/dd94a26765f3
Assert proxy private is not gray on creation r=sfink
Duplicate of this bug: 1425902
Duplicate of this bug: 1427578
Duplicate of this bug: 1425558
Duplicate of this bug: 1423726
Duplicate of this bug: 1431880
Duplicate of this bug: 1431890
There have been a total of 32 failures in the last week, according to Orange Factor.
All the failures occur on debug.

As per platform, below you can see the no. of occurrences for each platform:
-linux64-stylo-disabled: 10
-windows10-64-ccov: 9
-Linux x64: 8
-Windows 7: 2
-linux32-stylo-disabled: 1
-OS X 10.10: 1
-windows10-64: 1

Here is a recent log file and a snippet with the failure:
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-inbound&job_id=159260208&lineNumber=4081

[task 2018-01-30T05:10:11.082Z] 05:10:11     INFO - GECKO(2040) |     "engines": null (shape 7f25e57cb318 enumerate slot 34)
[task 2018-01-30T05:10:11.196Z] 05:10:11     INFO - GECKO(2040) | MEMORY STAT | vsize 1669MB | residentFast 189MB | heapAllocated 34MB
[task 2018-01-30T05:10:11.563Z] 05:10:11     INFO - GECKO(2040) | Assertion failure: js::CheckGrayMarkingState(mJSRuntime), at /builds/worker/workspace/build/src/xpcom/base/CycleCollectedJSRuntime.cpp:1231

It seems that all dependencies have been resolved.
:jonco , would you please take a look at this?
Flags: needinfo?(jcoppeard)
Whiteboard: [stockwell needswork]
(In reply to Tiberius Oros[:tiberius_oros] from comment #42)
I'm currently working on resolving a talos regression (bug 1433128).  I'll take a look at this next.
I found somewhere we don't have gray marking asserts: the public/Proxy.h header proves some functions that are used by gecko to write proxy slots.  This patch adds assertions to ensure we don't create black to gray edges.
Flags: needinfo?(jcoppeard)
Attachment #8951343 - Flags: review?(sphink)
Attachment #8951343 - Flags: review?(sphink) → review+
Pushed by jcoppeard@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/42eb9bb1033e
Add gray marking asserts to proxy write functions r=sfink
I filed bug 1440739 to improve our gray marking asserts further.
Duplicate of this bug: 1448207
This is a good candidate for rr.
Flags: needinfo?(overholt)
Flags: needinfo?(overholt)

Recent failure log:
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=225652005&repo=autoland&lineNumber=20772

06:53:59 INFO - GECKO(1076) | Assertion failure: js::CheckGrayMarkingState(mJSRuntime), at /builds/worker/workspace/build/src/xpcom/base/CycleCollectedJSRuntime.cpp:1119
21680 06:55:49 INFO - PROCESS-CRASH | Main app process exited normally | application crashed [@ mozilla::CycleCollectedJSRuntime::CheckGrayBits() const]
24917 06:55:50 INFO - TEST-UNEXPECTED-FAIL | leakcheck | tab missing output line for total leaks!
33236 07:04:13 ERROR - Return code: 1
33237 07:04:13 ERROR - Got 1 unexpected crashes
33282 [taskcluster:error] exit status 1

You need to log in before you can comment on or make changes to this bug.