Closed
Bug 1155423
Opened 9 years ago
Closed 8 years ago
Intermittent 632781-verybig.html | application crashed [@ mozilla::ipc::MessageChannel::AssertWorkerThread() const]
Categories
(Core :: Graphics: Layers, defect)
Tracking
()
RESOLVED
WORKSFORME
Tracking | Status | |
---|---|---|
e10s | + | --- |
People
(Reporter: KWierso, Assigned: mchang)
References
(Blocks 1 open bug)
Details
(Keywords: crash, intermittent-failure, Whiteboard: [gfx-noted])
14:17:19 INFO - REFTEST TEST-START | file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/bugs/632781-verybig.html 14:17:19 INFO - REFTEST TEST-LOAD | file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/bugs/632781-verybig.html | 4112 / 6039 (68%) 14:17:19 INFO - [Child 1916] WARNING: NS_ENSURE_TRUE(node) failed: file /builds/slave/m-in-l64-d-0000000000000000000/build/src/layout/base/nsDocumentViewer.cpp, line 3530 14:17:19 INFO - [Child 1916] WARNING: NS_ENSURE_TRUE(node) failed: file /builds/slave/m-in-l64-d-0000000000000000000/build/src/layout/base/nsDocumentViewer.cpp, line 3530 14:17:19 INFO - [Child 1916] WARNING: NS_ENSURE_TRUE(node) failed: file /builds/slave/m-in-l64-d-0000000000000000000/build/src/layout/base/nsDocumentViewer.cpp, line 3530 14:17:19 INFO - ++DOMWINDOW == 62 (0x7eff96de4400) [pid = 1916] [serial = 13789] [outer = 0x7eff9d190400] 14:17:20 INFO - --DOMWINDOW == 61 (0x7eff932e9400) [pid = 1916] [serial = 13757] [outer = (nil)] [url = http://localhost:58885/1429216986095/165/bugs/621253-helper-externalFilter.xhtml] 14:17:20 INFO - --DOMWINDOW == 60 (0x7eff96cb7000) [pid = 1916] [serial = 13748] [outer = (nil)] [url = data:text/html;charset=UTF-8,%3C%21%2D%2DCLEAR%2D%2D%3E] 14:17:20 INFO - --DOMWINDOW == 59 (0x7eff96bbe800) [pid = 1916] [serial = 13745] [outer = (nil)] [url = http://localhost:58885/1429216986086/164/bugs/621253-helper-externalFilter.xhtml] 14:17:20 INFO - --DOMWINDOW == 58 (0x7eff96e45800) [pid = 1916] [serial = 13751] [outer = (nil)] [url = file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/bugs/621253-helper-internalFilter.xhtml] 14:17:20 INFO - --DOMWINDOW == 57 (0x7eff97645400) [pid = 1916] [serial = 13743] [outer = (nil)] [url = http://localhost:58885/1429216986086/164/bugs/621253-1-externalFilter.html] 14:17:20 INFO - --DOMWINDOW == 56 (0x7eff96bb2800) [pid = 1916] [serial = 13729] [outer = (nil)] [url = http://localhost:58885/1429216986066/162/bugs/615121-2-notref.html] 14:17:20 INFO - --DOMWINDOW == 55 (0x7eff96d98800) [pid = 1916] [serial = 13730] [outer = (nil)] [url = data:text/html;charset=UTF-8,%3C%21%2D%2DCLEAR%2D%2D%3E] 14:17:20 INFO - --DOMWINDOW == 54 (0x7eff96da0c00) [pid = 1916] [serial = 13731] [outer = (nil)] [url = file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/bugs/617242-1.html] 14:17:20 INFO - --DOMWINDOW == 53 (0x7eff932e8800) [pid = 1916] [serial = 13732] [outer = (nil)] [url = file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/bugs/617242-1-ref.html] 14:17:20 INFO - --DOMWINDOW == 52 (0x7eff96bb6000) [pid = 1916] [serial = 13733] [outer = (nil)] [url = data:text/html;charset=UTF-8,%3C%21%2D%2DCLEAR%2D%2D%3E] 14:17:20 INFO - --DOMWINDOW == 51 (0x7eff96dadc00) [pid = 1916] [serial = 13728] [outer = (nil)] [url = http://localhost:58885/1429216986066/162/bugs/615121-2.html] 14:17:20 INFO - --DOMWINDOW == 50 (0x7eff96bbe400) [pid = 1916] [serial = 13746] [outer = (nil)] [url = http://localhost:58885/1429216986086/164/bugs/621253-helper-externalFilter.xhtml] 14:17:20 INFO - --DOMWINDOW == 49 (0x7eff96bbf800) [pid = 1916] [serial = 13736] [outer = (nil)] [url = data:text/html;charset=UTF-8,%3C%21%2D%2DCLEAR%2D%2D%3E] 14:17:20 INFO - --DOMWINDOW == 48 (0x7eff96cbd800) [pid = 1916] [serial = 13740] [outer = (nil)] [url = http://localhost:58885/1429216986080/163/bugs/619511-1.html] 14:17:20 INFO - --DOMWINDOW == 47 (0x7eff96da0800) [pid = 1916] [serial = 13735] [outer = (nil)] [url = file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/bugs/618071-notref.html] 14:17:20 INFO - --DOMWINDOW == 46 (0x7eff96caf400) [pid = 1916] [serial = 13741] [outer = (nil)] [url = http://localhost:58885/1429216986080/163/bugs/619511-1-ref.html] 14:17:20 INFO - --DOMWINDOW == 45 (0x7eff96eb6800) [pid = 1916] [serial = 13747] [outer = (nil)] [url = http://localhost:58885/1429216986086/164/bugs/621253-1-ref.html] 14:17:20 INFO - --DOMWINDOW == 44 (0x7eff96e4a800) [pid = 1916] [serial = 13739] [outer = (nil)] [url = data:text/html;charset=UTF-8,%3C%21%2D%2DCLEAR%2D%2D%3E] 14:17:20 INFO - --DOMWINDOW == 43 (0x7eff96eab800) [pid = 1916] [serial = 13734] [outer = (nil)] [url = file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/bugs/618071.html] 14:17:20 INFO - --DOMWINDOW == 42 (0x7eff96cb2800) [pid = 1916] [serial = 13742] [outer = (nil)] [url = data:text/html;charset=UTF-8,%3C%21%2D%2DCLEAR%2D%2D%3E] 14:17:20 INFO - --DOMWINDOW == 41 (0x7eff96bb5800) [pid = 1916] [serial = 13738] [outer = (nil)] [url = file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/bugs/619117-1-ref.html] 14:17:20 INFO - --DOMWINDOW == 40 (0x7eff96bbd400) [pid = 1916] [serial = 13737] [outer = (nil)] [url = file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/bugs/619117-1.html] 14:17:20 INFO - [Child 1916] WARNING: NS_ENSURE_TRUE(node) failed: file /builds/slave/m-in-l64-d-0000000000000000000/build/src/layout/base/nsDocumentViewer.cpp, line 3530 14:17:20 INFO - [Child 1916] WARNING: NS_ENSURE_TRUE(node) failed: file /builds/slave/m-in-l64-d-0000000000000000000/build/src/layout/base/nsDocumentViewer.cpp, line 3530 14:17:20 INFO - [Child 1916] WARNING: NS_ENSURE_TRUE(node) failed: file /builds/slave/m-in-l64-d-0000000000000000000/build/src/layout/base/nsDocumentViewer.cpp, line 3530 14:17:20 INFO - [Child 1916] WARNING: Most platforms still need an optimized way to share GL cross process.: file /builds/slave/m-in-l64-d-0000000000000000000/build/src/gfx/layers/client/CanvasClient.cpp, line 42 14:17:33 INFO - firefox: Fatal IO error 11 (Resource temporarily unavailable) on X server :0. 14:17:33 INFO - plugin-container: Fatal IO error 11 (Resource temporarily unavailable) on X server :0. 14:17:33 INFO - nsStringStats 14:17:33 INFO - => mAllocCount: 143 14:17:33 INFO - => mReallocCount: 1 14:17:33 INFO - => mFreeCount: 136 -- LEAKED 7 !!! 14:17:33 INFO - => mShareCount: 297 14:17:33 INFO - => mAdoptCount: 0 14:17:33 INFO - => mAdoptFreeCount: 0 14:17:33 INFO - => Process ID: 1990, Thread ID: 140472436787648 14:17:33 INFO - plugin-container: Fatal IO error 12 (Cannot allocate memory) on X server :0. 14:17:33 INFO - nsStringStats 14:17:33 INFO - => mAllocCount: 3765567 14:17:33 INFO - => mReallocCount: 251630 14:17:33 INFO - => mFreeCount: 3754832 -- LEAKED 10735 !!! 14:17:33 INFO - => mShareCount: 6665733 14:17:33 INFO - => mAdoptCount: 463234 14:17:33 INFO - => mAdoptFreeCount: 463232 -- LEAKED 2 !!! 14:17:33 INFO - => Process ID: 1916, Thread ID: 139636700649920 14:17:33 INFO - Assertion failure: mIOLoop == MessageLoop::current() (not on I/O thread!), at ../../dist/include/mozilla/ipc/MessageLink.h:147 14:17:34 INFO - Hit MOZ_CRASH(Layer not thread-safe) at /builds/slave/m-in-l64-d-0000000000000000000/build/src/gfx/layers/Layers.h:738 14:17:34 INFO - g_dbus_connection_real_closed: Remote peer vanished with error: Underlying GIOStream returned 0 bytes on an async read (g-io-error-quark, 0). Exiting. 14:17:34 WARNING - TEST-UNEXPECTED-FAIL | file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/bugs/632781-verybig.html | application terminated with exit code -15 14:17:44 INFO - mozcrash INFO | Saved minidump as /builds/slave/test/build/blobber_upload_dir/1783d515-4f1e-330c-54f31b09-11abb6b2.dmp 14:17:44 WARNING - PROCESS-CRASH | file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/bugs/632781-verybig.html | application crashed [@ mozilla::ipc::MessageChannel::AssertWorkerThread() const] 14:17:44 INFO - Crash dump filename: /tmp/tmpwHz2ON.mozrunner/minidumps/1783d515-4f1e-330c-54f31b09-11abb6b2.dmp 14:17:44 INFO - Operating system: Linux 14:17:44 INFO - 0.0.0 Linux 3.2.0-76-generic #111-Ubuntu SMP Tue Jan 13 22:16:09 UTC 2015 x86_64 14:17:44 INFO - CPU: amd64 14:17:44 INFO - family 6 model 62 stepping 4 14:17:44 INFO - 1 CPU 14:17:44 INFO - Crash reason: SIGSEGV 14:17:44 INFO - Crash address: 0xc 14:17:44 INFO - Thread 23 (crashed) 14:17:44 INFO - 0 libxul.so!mozilla::ipc::MessageChannel::AssertWorkerThread() const [MessageChannel.h:c75413a564b2 : 417 + 0x5] 14:17:44 INFO - rbx = 0x000000000000000e r12 = 0x00007fe53b2747f0 14:17:44 INFO - r13 = 0x0000000000000001 r14 = 0x0000000000000701 14:17:44 INFO - r15 = 0x0000000000000000 rip = 0x00007fe564ef6b99 14:17:44 INFO - rsp = 0x00007fe54a6feb50 rbp = 0x00007fe54a6feb60 14:17:44 INFO - Found by: given as instruction pointer in context 14:17:44 INFO - 1 libxul.so!mozilla::ipc::MessageChannel::CxxStackFrame::CxxStackFrame(mozilla::ipc::MessageChannel&, mozilla::ipc::Direction, IPC::Message const*) [MessageChannel.cpp:c75413a564b2 : 219 + 0x7] 14:17:44 INFO - rbx = 0x00007fe54a6febc8 r12 = 0x00007fe53b2747f0 14:17:44 INFO - r13 = 0x0000000000000001 r14 = 0x0000000000000701 14:17:44 INFO - r15 = 0x0000000000000000 rip = 0x00007fe564efd009 14:17:44 INFO - rsp = 0x00007fe54a6feb70 rbp = 0x00007fe54a6febb0 14:17:44 INFO - Found by: call frame info 14:17:44 INFO - 2 libxul.so!mozilla::ipc::MessageChannel::Send(IPC::Message*) [MessageChannel.cpp:c75413a564b2 : 536 + 0x4] 14:17:44 INFO - rbx = 0x00007fe5408bac68 r12 = 0x00007fe53b2747f0 14:17:44 INFO - r13 = 0x00007fe54a6fec58 r14 = 0x0000000000000701 14:17:44 INFO - r15 = 0x0000000000000000 rip = 0x00007fe564f04381 14:17:44 INFO - rsp = 0x00007fe54a6febc0 rbp = 0x00007fe54a6febf0 14:17:44 INFO - Found by: call frame info 14:17:44 INFO - 3 libxul.so!mozilla::layout::PVsyncParent::SendNotify(mozilla::TimeStamp const&) [PVsyncParent.cpp:c75413a564b2 : 64 + 0xb] 14:17:44 INFO - rbx = 0x00007fe5409de0e0 r12 = 0x00007fe53b2747f0 14:17:44 INFO - r13 = 0x00007fe54a6fec58 r14 = 0x0000000000000701 14:17:44 INFO - r15 = 0x0000000000000000 rip = 0x00007fe5650410d4 14:17:44 INFO - rsp = 0x00007fe54a6fec00 rbp = 0x00007fe54a6fec40 14:17:44 INFO - Found by: call frame info 14:17:44 INFO - 4 libxul.so!mozilla::layout::VsyncParent::DispatchVsyncEvent(mozilla::TimeStamp) [VsyncParent.cpp:c75413a564b2 : 71 + 0x8] 14:17:44 INFO - rbx = 0x00007fe54aa87990 r12 = 0x00007fe55267c701 14:17:44 INFO - r13 = 0x00007fe54a6fed0f r14 = 0x0000000000000701 14:17:44 INFO - r15 = 0x0000000000000000 rip = 0x00007fe5665a11da 14:17:44 INFO - rsp = 0x00007fe54a6fec50 rbp = 0x00007fe54a6fec60 14:17:44 INFO - Found by: call frame info 14:17:44 INFO - 5 libxul.so!nsRunnableMethodImpl<void (mozilla::layout::VsyncParent::*)(mozilla::TimeStamp), true, mozilla::TimeStamp>::Run() [nsThreadUtils.h:c75413a564b2 : 615 + 0x16] 14:17:44 INFO - rbx = 0x00007fe54aa87990 r12 = 0x00007fe55267c701 14:17:44 INFO - r13 = 0x00007fe54a6fed0f r14 = 0x0000000000000701 14:17:44 INFO - r15 = 0x0000000000000000 rip = 0x00007fe5665a1208 14:17:44 INFO - rsp = 0x00007fe54a6fec70 rbp = 0x00007fe54a6fec70 14:17:44 INFO - Found by: call frame info 14:17:44 INFO - 6 libxul.so!nsThread::ProcessNextEvent(bool, bool*) [nsThread.cpp:c75413a564b2 : 866 + 0x11] 14:17:44 INFO - rbx = 0x00007fe54aa87990 r12 = 0x00007fe55267c701 14:17:44 INFO - r13 = 0x00007fe54a6fed0f r14 = 0x0000000000000701 14:17:44 INFO - r15 = 0x0000000000000000 rip = 0x00007fe564c811cd 14:17:44 INFO - rsp = 0x00007fe54a6fec80 rbp = 0x00007fe54a6fecf0 14:17:44 INFO - Found by: call frame info 14:17:44 INFO - 7 libxul.so!NS_ProcessNextEvent(nsIThread*, bool) [nsThreadUtils.cpp:c75413a564b2 : 265 + 0xc] 14:17:44 INFO - rbx = 0x00007fe54a7aef01 r12 = 0x00007fe55267c760 14:17:44 INFO - r13 = 0x00007fe54a7aefe8 r14 = 0x0000000000000700 14:17:44 INFO - r15 = 0x00007fe54a7aefe0 rip = 0x00007fe564ca006e 14:17:44 INFO - rsp = 0x00007fe54a6fed00 rbp = 0x00007fe54a6fed20 14:17:44 INFO - Found by: call frame info 14:17:44 INFO - 8 libxul.so!mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) [MessagePump.cpp:c75413a564b2 : 368 + 0xd] 14:17:44 INFO - rbx = 0x00007fe54a7aefc0 r12 = 0x00007fe55267c760 14:17:44 INFO - r13 = 0x00007fe54a7aefe8 r14 = 0x0000000000000700 14:17:44 INFO - r15 = 0x00007fe54a7aefe0 rip = 0x00007fe564efe29c 14:17:44 INFO - rsp = 0x00007fe54a6fed30 rbp = 0x00007fe54a6fed80 14:17:44 INFO - Found by: call frame info 14:17:44 INFO - 9 libxul.so!MessageLoop::RunInternal() [message_loop.cc:c75413a564b2 : 233 + 0x16] 14:17:44 INFO - rbx = 0x00007fe55267c760 r12 = 0x00007fe55267c760 14:17:44 INFO - r13 = 0x00007fe54aa879b0 r14 = 0x0000000000000773 14:17:44 INFO - r15 = 0x0000000000000001 rip = 0x00007fe564ee0901 14:17:44 INFO - rsp = 0x00007fe54a6fed90 rbp = 0x00007fe54a6fedc0 14:17:44 INFO - Found by: call frame info 14:17:44 INFO - 10 libxul.so!MessageLoop::Run() [message_loop.cc:c75413a564b2 : 226 + 0x7] 14:17:44 INFO - rbx = 0x00007fe55267c760 r12 = 0x00007fe55267c760 14:17:44 INFO - r13 = 0x00007fe54aa879b0 r14 = 0x0000000000000773 14:17:44 INFO - r15 = 0x0000000000000001 rip = 0x00007fe564ee0928 14:17:44 INFO - rsp = 0x00007fe54a6fedd0 rbp = 0x00007fe54a6fee00 14:17:44 INFO - Found by: call frame info 14:17:44 INFO - 11 libxul.so!nsThread::ThreadFunc(void*) [nsThread.cpp:c75413a564b2 : 364 + 0x7] 14:17:44 INFO - rbx = 0x00007fe54aa87990 r12 = 0x00007fe55267c760 14:17:44 INFO - r13 = 0x00007fe54aa879b0 r14 = 0x0000000000000773 14:17:44 INFO - r15 = 0x0000000000000001 rip = 0x00007fe564c81aaf 14:17:44 INFO - rsp = 0x00007fe54a6fee10 rbp = 0x00007fe54a6fee50 14:17:44 INFO - Found by: call frame info 14:17:44 INFO - 12 libnspr4.so!_pt_root [ptthread.c:c75413a564b2 : 212 + 0x6] 14:17:44 INFO - rbx = 0x00007fe54aaf56a0 r12 = 0x0000000000000000 14:17:44 INFO - r13 = 0x0000000000000773 r14 = 0x0000000000000773 14:17:44 INFO - r15 = 0x0000000000000001 rip = 0x00007fe56aae6ad8 14:17:44 INFO - rsp = 0x00007fe54a6fee60 rbp = 0x00007fe54a6feea0 14:17:44 INFO - Found by: call frame info 14:17:44 INFO - 13 libpthread-2.15.so + 0x7e99 14:17:44 INFO - rbx = 0x0000000000000000 r12 = 0x00007fff3c4e2cb0 14:17:44 INFO - r13 = 0x00007fe54a6ff9c0 r14 = 0x0000000000000000 14:17:44 INFO - r15 = 0x0000000000000003 rip = 0x00007fe56bf2de9a 14:17:44 INFO - rsp = 0x00007fe54a6feeb0 rbp = 0x0000000000000000 14:17:44 INFO - Found by: call frame info 14:17:44 INFO - Thread 0
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Updated•9 years ago
|
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Updated•9 years ago
|
tracking-e10s:
--- → ?
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Updated•9 years ago
|
Blocks: e10s-tests
Updated•9 years ago
|
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Could you take a look at this Mason? It seems to be some sort of vsync issue.
Component: IPC → Graphics: Layers
Flags: needinfo?(mchang)
Assignee | ||
Updated•9 years ago
|
Assignee: nobody → mchang
Status: NEW → ASSIGNED
Flags: needinfo?(mchang)
Assignee | ||
Comment 22•9 years ago
|
||
I'm not sure this is actually a vsync issue versus that's what the crash dump caught in the parent process. Before the assertion failure, I see this stack trace from the log in [1]. 20:26:10 INFO - Assertion failure: mIOLoop == MessageLoop::current() (not on I/O thread!), at ../../dist/include/mozilla/ipc/MessageLink.h:147 20:26:10 INFO - Hit MOZ_CRASH(Layer not thread-safe) at /builds/slave/b2g-in-l64-d-00000000000000000/build/src/gfx/layers/Layers.h:732 20:26:10 INFO - #01: mozilla::ipc::ProcessLink::OnChannelError() [ipc/glue/MessageLink.cpp:401] 20:26:10 INFO - #01: std::_Rb_tree<unsigned long, std::pair<unsigned long const, mozilla::layers::CompositorParent::LayerTreeState>, std::_Select1st<std::pair<unsigned long const, mozilla::layers::CompositorParent::LayerTreeState> >, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, mozilla::layers::CompositorParent::LayerTreeState> > >::_M_erase(std::_Rb_tree_node<std::pair<unsigned long const, mozilla::layers::CompositorParent::LayerTreeState> >*) [memory/mozalloc/mozalloc.h:210] 20:26:10 INFO - #02: std::_Rb_tree<unsigned long, std::pair<unsigned long const, mozilla::layers::CompositorParent::LayerTreeState>, std::_Select1st<std::pair<unsigned long const, mozilla::layers::CompositorParent::LayerTreeState> >, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, mozilla::layers::CompositorParent::LayerTreeState> > >::_M_erase(std::_Rb_tree_node<std::pair<unsigned long const, mozilla::layers::CompositorParent::LayerTreeState> >*) [gfx/layers/ipc/LayerTransactionChild.cpp:141] Layers are not thread safe ref counted[2]. I wonder if the child process is crashing and then we send a vsync message and get this failure from the parent process. That would make more sense for a couple of reasons. First, the VsyncParent is on the background thread of the parent process and asserts that it's on the background thread [3]. This test is also the longest running test in the suite, and takes 21 seconds to run according to the log, which means we've been sending vsync messages for 21 seconds already. In addition, this reftest is in the middle of the suite, so it's not a start up / shutdown issue as the other reftests have been using vsync. Unless the background thread changes it's message loop or channel halfway between, we shouldn't have switched message loop threads. Also, looking at the log, I checked out thread 0, the main thread, to see the stack trace and get this: 20:26:23 INFO - 13 libxul.so!mozilla::layers::Layer::Release() [Layers.h:cd229a55ed1c : 732 + 0x17] 20:26:23 INFO - rip = 0x00007f9af87022fa rsp = 0x00007fffa3f95eb0 20:26:23 INFO - rbp = 0x00007fffa3f96270 20:26:23 INFO - Found by: stack scanning 20:26:23 INFO - 14 libxul.so!std::_Rb_tree<unsigned long, std::pair<unsigned long const, mozilla::layers::CompositorParent::LayerTreeState>, std::_Select1st<std::pair<unsigned long const, mozilla::layers::CompositorParent::LayerTreeState> >, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, mozilla::layers::CompositorParent::LayerTreeState> > >::_M_erase(std::_Rb_tree_node<std::pair<unsigned long const, mozilla::layers::CompositorParent::LayerTreeState> >*) [stl_pair.h:cd229a55ed1c : 88 + 0x8] 20:26:23 INFO - rbx = 0x00007f9add9c08a0 r12 = 0x00007f9afccb5520 20:26:23 INFO - rip = 0x00007f9af877071f rsp = 0x00007fffa3f96280 20:26:23 INFO - rbp = 0x00007fffa3f962a0 20:26:23 INFO - Found by: call frame info 20:26:23 INFO - 15 libxul.so!std::_Rb_tree<unsigned long, std::pair<unsigned long const, mozilla::layers::CompositorParent::LayerTreeState>, std::_Select1st<std::pair<unsigned long const, mozilla::layers::CompositorParent::LayerTreeState> >, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, mozilla::layers::CompositorParent::LayerTreeState> > >::_M_erase(std::_Rb_tree_node<std::pair<unsigned long const, mozilla::layers::CompositorParent::LayerTreeState> >*) [stl_tree.h:cd229a55ed1c : 1082 + 0xb] 20:26:23 INFO - rbx = 0x00007f9add9c0790 r12 = 0x00007f9afccb5520 We're releasing a layer on the main thread. The only thing that has a map<unsigned long, LayerTreeState> is sIndirectLayerTrees [4] and that usually releases a layer on the compositor thread[5]. All the other times we delete sIndirectLayerTree[id].mRoot, which is a layer, is on the compositor thread. The CrossProcessCompositorParent can still erase a layer at [6] which I'm not sure is the main thread or not. Have a quick test on try to make sure we always erase on the compositor thread: https://treeherder.mozilla.org/#/jobs?repo=try&revision=4c4d9e2fd16a [1] http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/b2g-inbound-linux64-debug/1431050459/b2g-inbound_ubuntu64_vm-debug_test-reftest-e10s-1-bm117-tests1-linux64-build5.txt.gz [2] https://dxr.mozilla.org/mozilla-central/source/gfx/layers/Layers.h#732 [3] https://dxr.mozilla.org/mozilla-central/source/layout/ipc/VsyncParent.cpp?from=VsyncParent.cpp&case=true#63 [4] https://dxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/CompositorParent.cpp#101 [5] https://dxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/CompositorParent.cpp#717 [6] https://dxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/CompositorParent.cpp#1889
The logs certainly are pretty confusing. I just noticed that, right before we assert, these messages are printed: 20:26:10 INFO - plugin-container: Fatal IO error 11 (Resource temporarily unavailable) on X server :0. 20:26:10 INFO - firefox: Fatal IO error 11 (Resource temporarily unavailable) on X server :0. I wonder if there's some sort of X-level failure that we're not prepared for and it causes all sorts of badness down the line. Given that the test is called 632781-verybig.html, it wouldn't be too surprising that we would get some sort of allocation failure.
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 25•9 years ago
|
||
Hmm, so looks like on try, reftests-e10s does not run on debug builds explicitly :/. @Ryan - Who can we talk to if we want to run reftest-e10s on try? Thanks. @Nical - Does comment 22 make sense to you? Is there anything we can check on Linux for comment 23? I remember you saying there was an issue where we can take down the X server if something was too big on Linux during the gfx dailies. Where can we check that? Thanks!
Flags: needinfo?(ryanvm)
Flags: needinfo?(nical.bugzilla)
Comment 26•9 years ago
|
||
(In reply to Mason Chang [:mchang] from comment #25) > @Ryan - Who can we talk to if we want to run reftest-e10s on try? Thanks. File a Release Engineering::General Automation bug for it.
Flags: needinfo?(ryanvm)
A workaround is to use |-u all|. That will run e10s debug reftests.
Assignee | ||
Comment 28•9 years ago
|
||
(In reply to Bill McCloskey (:billm) from comment #23) > The logs certainly are pretty confusing. I just noticed that, right before > we assert, these messages are printed: > > 20:26:10 INFO - plugin-container: Fatal IO error 11 (Resource > temporarily unavailable) on X server :0. > 20:26:10 INFO - firefox: Fatal IO error 11 (Resource temporarily > unavailable) on X server :0. > > I wonder if there's some sort of X-level failure that we're not prepared for > and it causes all sorts of badness down the line. Given that the test is > called 632781-verybig.html, it wouldn't be too surprising that we would get > some sort of allocation failure. https://treeherder.mozilla.org/#/jobs?repo=try&revision=f0471b19c66d That seems to be the case. The assertion I put in to assert that the layer state is removed on the compositor thread never failed, but in the log we got this: 19:27:44 INFO - firefox: Fatal IO error 11 (Resource temporarily unavailable) on X server :0. 19:27:44 INFO - plugin-container: Fatal IO error 11 (Resource temporarily unavailable) on X server :0. .... .... 19:27:45 INFO - plugin-container: Fatal IO error 12 (Cannot allocate memory) on X server :0. http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mchang@mozilla.com-f0471b19c66d/try-linux64-debug/try_ubuntu64_vm-debug_test-reftest-e10s-1-bm115-tests1-linux64-build62.txt.gz
Comment 29•9 years ago
|
||
(In reply to Mason Chang [:mchang] from comment #25) > Hmm, so looks like on try, reftests-e10s does not run on debug builds > explicitly :/. > > @Ryan - Who can we talk to if we want to run reftest-e10s on try? Thanks. > > @Nical - Does comment 22 make sense to you? Yup > Is there anything we can check > on Linux for comment 23? I remember you saying there was an issue where we > can take down the X server if something was too big on Linux during the gfx > dailies. Where can we check that? Thanks! My proposition was to throttle the size of canvases to some sane limit, but that's mostly band aid. In the dailies the answer I got was, let's just keep working on removing our dependency to xrender. As far as I understand we don't have a way to handle this specific allocation failure which happens asynchronously in the x server but I could be wrong (I hope I am wrong). I haven't been able to reproduce this one locally (my x server gets killed by the LMK when I force it to allocate too many big surfaces which is also a lot of fun).
Flags: needinfo?(nical.bugzilla)
Assignee | ||
Comment 30•9 years ago
|
||
Looking at the test, this looks like a giant 10000x10000 canvas - https://dxr.mozilla.org/mozilla-central/source/layout/reftests/bugs/632781-verybig.html Is there a specific place we can assert that we got an allocation failure for this limit? I just want to double check that we're explicitly hitting this allocation failure and not something else. Thanks!
Flags: needinfo?(nical.bugzilla)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 33•9 years ago
|
||
(In reply to Mason Chang [:mchang] from comment #30) > Looking at the test, this looks like a giant 10000x10000 canvas - > https://dxr.mozilla.org/mozilla-central/source/layout/reftests/bugs/632781- > verybig.html > > Is there a specific place we can assert that we got an allocation failure > for this limit? I just want to double check that we're explicitly hitting > this allocation failure and not something else. Thanks! I'd look at all of the places where we allocate the surface in DrawTargetCairo.cpp, as well as whether we return null in TextureClient::CreateForDrawing. But it could still be a failure that happens asynchronously in the x server and the firefox process gets killed without getting a chance to handle the error (not sure).
Flags: needinfo?(nical.bugzilla)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 58•8 years ago
|
||
Is this failure still happening?
Flags: needinfo?(mchang)
Whiteboard: [gfx-noted]
Assignee | ||
Comment 59•8 years ago
|
||
IIRC we don't use xrender anymore either.
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Flags: needinfo?(mchang)
Resolution: --- → WORKSFORME
You need to log in
before you can comment on or make changes to this bug.
Description
•