Closed Bug 1253154 Opened 8 years ago Closed 7 years ago

Intermittent e10s test_fullscreen-api.html | application timed out after 330 seconds with no output

Categories

(Core Graveyard :: Plug-ins, defect, P3)

defect

Tracking

(firefox47 affected)

RESOLVED DUPLICATE of bug 1237853
Tracking Status
firefox47 --- affected

People

(Reporter: philor, Unassigned)

References

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell unknown])

Bulk assigning P3 to all open intermittent bugs without a priority set in Firefox components per bug 1298978.
Priority: -- → P3
:xidorn -- Do you have an idea of what is going wrong here?
Flags: needinfo?(xidorn+moz)
Looks like most recent timeouts happen in the plugin-related test. The test file is: https://dxr.mozilla.org/mozilla-central/source/dom/html/test/file_fullscreen-plugins.html

Probably an issue since removal of NPAPI support? Benjamin, could you have a look?
Flags: needinfo?(xidorn+moz) → needinfo?(benjamin)
I don't know exactly what's going on, but the browser was trying to tear down a plugin instance at the end of the test, and the plugin process appears to be locked up. We don't have testplugin symbols in the log, but here's the stack we do have at the time of the hang (presumably some kind of deadlock):

https://public-artifacts.taskcluster.net/e6udmTARTVW7vyGnSOcLXQ/0/public/logs/live_backing.log

[task 2016-10-14T00:42:53.735329Z] 00:42:53     INFO -  Thread 0 (crashed)
[task 2016-10-14T00:42:53.736787Z] 00:42:53     INFO -   0  libnptest.so + 0x70b4
[task 2016-10-14T00:42:53.759292Z] 00:42:53     INFO -   1  libgtk-x11-2.0.so.0.2400.10 + 0x136e78
[task 2016-10-14T00:42:53.763947Z] 00:42:53     INFO -   2  libgtk-x11-2.0.so.0.2400.10 + 0x136de0
[task 2016-10-14T00:42:53.768741Z] 00:42:53     INFO -   3  libgobject-2.0.so.0.3200.4!g_closure_invoke [gclosure.c : 777 + 0x11]
[task 2016-10-14T00:42:53.778678Z] 00:42:53     INFO -   4  libgobject-2.0.so.0.3200.4!signal_emit_unlocked_R [gsignal.c : 3551 + 0x5]
[task 2016-10-14T00:42:53.788061Z] 00:42:53     INFO -   5  libgobject-2.0.so.0.3200.4!g_signal_emit_valist [gsignal.c : 3310 + 0x20]
[task 2016-10-14T00:42:53.801054Z] 00:42:53     INFO -   6  libgobject-2.0.so.0.3200.4!g_signal_emit [gsignal.c : 3356 + 0x5]
[task 2016-10-14T00:42:53.811110Z] 00:42:53     INFO -   7  libgtk-x11-2.0.so.0.2400.10 + 0x251231
[task 2016-10-14T00:42:53.824198Z] 00:42:53     INFO -   8  libgtk-x11-2.0.so.0.2400.10 + 0x16a147
[task 2016-10-14T00:42:53.828817Z] 00:42:53     INFO -   9  libgtk-x11-2.0.so.0.2400.10 + 0x2c92fa
[task 2016-10-14T00:42:53.833476Z] 00:42:53     INFO -  10  libgdk-x11-2.0.so.0.2400.10 + 0x597c1
[task 2016-10-14T00:42:53.840339Z] 00:42:53     INFO -  11  libgdk-x11-2.0.so.0.2400.10 + 0x5a2aa
[task 2016-10-14T00:42:53.845529Z] 00:42:53     INFO -  12  plugin-container!je_free [jemalloc.c:609a153be0a5 : 4615 + 0x8]
[task 2016-10-14T00:42:53.851260Z] 00:42:53     INFO -  13  libxcb.so.1.1.0!xcb_wait_for_event + 0x1ed
[task 2016-10-14T00:42:53.862451Z] 00:42:53     INFO -  14  0x18
[task 2016-10-14T00:42:53.871908Z] 00:42:53     INFO -  15  libglib-2.0.so.0.3200.4!g_private_get [gthread-posix.c : 1003 + 0x5]
[task 2016-10-14T00:42:53.876466Z] 00:42:53     INFO -  16  libglib-2.0.so.0.3200.4!g_slice_alloc [gslice.c : 500 + 0x5]
[task 2016-10-14T00:42:53.881212Z] 00:42:53     INFO -  17  0x7f2eb76910a0
[task 2016-10-14T00:42:53.893214Z] 00:42:53     INFO -  18  libgdk-x11-2.0.so.0.2400.10 + 0x5bd96
[task 2016-10-14T00:42:53.898480Z] 00:42:53     INFO -  19  libglib-2.0.so.0.3200.4 + 0x54f00
[task 2016-10-14T00:42:53.903719Z] 00:42:53     INFO -  20  libgdk-x11-2.0.so.0.2400.10 + 0x5be3e
[task 2016-10-14T00:42:53.912165Z] 00:42:53     INFO -  21  libglib-2.0.so.0.3200.4!g_main_context_dispatch [gmain.c : 2539 + 0x4]
[task 2016-10-14T00:42:53.916923Z] 00:42:53     INFO -  22  libglib-2.0.so.0.3200.4!g_main_context_iterate [gmain.c : 3146 + 0x8]
[task 2016-10-14T00:42:53.926941Z] 00:42:53     INFO -  23  libglib-2.0.so.0.3200.4!g_main_context_iteration [gmain.c : 3207 + 0x14]
[task 2016-10-14T00:42:53.940995Z] 00:42:53     INFO -  24  libxul.so!base::MessagePumpForUI::RunWithDispatcher [message_pump_glib.cc:609a153be0a5 : 195 + 0x9]
[task 2016-10-14T00:42:53.950424Z] 00:42:53     INFO -  25  libxul.so!MessageLoop::Run [message_loop.cc:609a153be0a5 : 232 + 0x6]
[task 2016-10-14T00:42:53.970072Z] 00:42:53     INFO -  26  libxul.so!XRE_InitChildProcess [nsEmbedFunctions.cpp:609a153be0a5 : 701 + 0x8]
[task 2016-10-14T00:42:53.979452Z] 00:42:53     INFO -  27  plugin-container!content_process_main [plugin-container.cpp:609a153be0a5 : 197 + 0xf]

This is probably a broken stack; there's definitely some stack scanning going on to generate this stack trace.

Since this is Linux, reproducing and debugging this (with RR even!) should be easier than the typical windows deadlock, but I still don't have anyone to work on this soon. And it might just go away in 53 when we stop supporting windowed-mode plugins altogether.
Component: DOM → Plug-ins
Flags: needinfo?(benjamin)
this is still a problem and reproducing frequently enough that it falls into our high frequency failure bucket.  This looks to be linux64 and e10s primarily if that helps.  

:bsmedberg, can you help find someone to work on this?
Flags: needinfo?(benjamin)
Diagnostics based on the log at https://archive.mozilla.org/pub/firefox/tinderbox-builds/autoland-linux64-pgo/1483367431/autoland_ubuntu64_vm_test_pgo-mochitest-e10s-2-bm122-tests1-linux64-build9.txt.gz

There is this buried near the beginning of the log for this test:
08:25:53     INFO - [NPAPI 4121] ###!!! ASSERTION: plug removed: 'glib assertion', file /builds/slave/autoland-l64-pgo-0000000000000/build/src/toolkit/xre/nsSigHandlers.cpp, line 140

chrome process seems relatively unstuck, is processing a nsAutoFocusEvent::Run (signaled SIGABRT from the harness)
a content process is blocked calling NPP_Destroy -> RPC (signaled SIGABRT from the harness)
the plugin process is calling g_logv. It says it's getting a SIGTRAP and not the normal SIGABRT we'd get from a harness abort, so it's possible that it's actually crashing and not hanging as I might have expected:

08:34:02     INFO - Crash reason:  SIGTRAP
08:34:02     INFO - Crash address: 0x0
08:34:02     INFO - Process uptime: not available
08:34:02     INFO - 
08:34:02     INFO - Thread 0 (crashed)
08:34:02     INFO -  0  libglib-2.0.so.0.3200.1!g_logv [gmessages.c : 765 + 0x0]
08:34:02     INFO -  1  libglib-2.0.so.0.3200.1!g_log [gmessages.c : 792 + 0x5]
08:34:02     INFO -  2  libnptest.so + 0x70f4
08:34:02     INFO -     Found by: call frame info
08:34:02     INFO -  3  libgtk-x11-2.0.so.0.2400.10!_gtk_marshal_BOOLEAN__BOXED [gtkmarshalers.c : 86 + 0xa]
08:34:02     INFO -     Found by: stack scanning
08:34:02     INFO -  4  libgobject-2.0.so.0.3200.1!g_closure_invoke [gclosure.c : 777 + 0x11]
08:34:02     INFO -  5  libgobject-2.0.so.0.3200.1!signal_emit_unlocked_R [gsignal.c : 3547 + 0x5]
08:34:02     INFO -  6  libgobject-2.0.so.0.3200.1!g_signal_emit_valist [gsignal.c : 3306 + 0x20]
08:34:02     INFO -  7  libgobject-2.0.so.0.3200.1!g_signal_emit [gsignal.c : 3352 + 0x5]
08:34:02     INFO -  8  libgtk-x11-2.0.so.0.2400.10!gtk_widget_event_internal [gtkwidget.c : 4992 + 0x21]
08:34:02     INFO -  9  libgtk-x11-2.0.so.0.2400.10!_gtk_plug_send_delete_event [gtkplug.c : 382 + 0xb]
08:34:02     INFO - 10  libgtk-x11-2.0.so.0.2400.10!_gtk_plug_windowing_filter_func [gtkplug-x11.c : 280 + 0x8]
08:34:02     INFO - 11  libgdk-x11-2.0.so.0.2400.10!gdk_event_translate [gdkevents-x11.c : 356 + 0x9]
08:34:02     INFO - 12  libgdk-x11-2.0.so.0.2400.10!_gdk_events_queue [gdkevents-x11.c : 2310 + 0x5]
08:34:02     INFO - 13  libgdk-x11-2.0.so.0.2400.10!gdk_event_dispatch [gdkevents-x11.c : 2371 + 0x8]
08:34:02     INFO - 14  libglib-2.0.so.0.3200.1!g_main_context_dispatch [gmain.c : 2515 + 0x4]
08:34:03     INFO - 15  libglib-2.0.so.0.3200.1!g_main_context_iterate [gmain.c : 3123 + 0x8]
08:34:03     INFO - 16  libglib-2.0.so.0.3200.1!g_main_context_iteration [gmain.c : 3184 + 0x14]
08:34:03     INFO - 17  libxul.so!base::MessagePumpForUI::RunWithDispatcher [message_pump_glib.cc:bafb20dd3221 : 195 + 0x8]
08:34:03     INFO - 18  libxul.so!MessageLoop::Run() + 0x47
08:34:03     INFO - 19  libxul.so!XRE_InitChildProcess [nsEmbedFunctions.cpp:bafb20dd3221 : 756 + 0xc]
08:34:03     INFO - 20  plugin-container!content_process_main [plugin-container.cpp:bafb20dd3221 : 115 + 0xe]

Because of bug 1297387 we don't have symbols for the test plugin, so I don't know for certain, but reading through nptest_gtk2.cpp I rather expect this is the abort at http://searchfox.org/mozilla-central/rev/8144fcc62054e278fe5db9666815cc13188c96fa/dom/plugins/test/testplugin/nptest_gtk2.cpp#264

jimm, what do you think about forcing Flash to windowless mode on Linux so we can just get rid of all the windowed-mode plug stuff there? Adobe doesn't support stage3d on Linux anyway, so I really doubt there would be functional regressions to worry about.

karlt, what's the possibility that e10s+plugins has screwed up the ordering of GTK plug teardown for windowed plugins?
Flags: needinfo?(karlt)
Flags: needinfo?(jmathies)
Flags: needinfo?(benjamin)
(In reply to Benjamin Smedberg [:bsmedberg] from comment #49)
> karlt, what's the possibility that e10s+plugins has screwed up the ordering
> of GTK plug teardown for windowed plugins?

Yes, that was suspected in https://bugzilla.mozilla.org/show_bug.cgi?id=1237853#c5 and Jim's subsequent comment acknowledged that that can happen.
Depends on: 1237853
Flags: needinfo?(karlt)
Summary: Intermittent test_fullscreen-api.html | application timed out after 330 seconds with no output → Intermittent e10s test_fullscreen-api.html | application timed out after 330 seconds with no output
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → DUPLICATE
Flags: needinfo?(jmathies)
We can reopen this bug, or we can change the summary of the other one to include the test filename and the intermittent-failure keyword, or we can disable the test. Having the bug for this frequent a failure be closed is not one of the choices.
Nevermind, we already refiled it.
Whiteboard: [stockwell unknown]
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.