[e10s][3.8 <= gtk version < 3.14.8] Firefox unresponsive spinning cpu when attempting to open select dropdown

RESOLVED FIXED in Firefox 46

Status

defect
RESOLVED FIXED
4 years ago
3 years ago

People

(Reporter: sylvestre, Assigned: karlt)

Tracking

(Blocks 1 bug)

Trunk
mozilla47
All
Linux

Firefox Tracking Flags

(e10sm9+, firefox43 wontfix, firefox44 wontfix, firefox45 disabled, firefox46+ fixed, firefox47+ fixed)

Details

Attachments

(2 attachments)

Reporter

Description

4 years ago
Running Firefox 42 in aurora on Debian with gtk 3.14.5, when using some select dropdowns (example: tracking flags in Bugzilla), Firefox freezes.

I haven't been able to find any STR. I just disabled e10s to find out whether it is caused by gtk3 or e10s.
Flags: needinfo?(twalker)
Keywords: steps-wanted
I'll try to figure out how to get gtk3 installed on my Ubuntu VM.  

In the meantime, Sylvestre, let us know what you figure out with e10s. Also, please provide reliable STR if you're able to observe it happening regularly.
Flags: needinfo?(sledru)
Reporter

Comment 2

4 years ago
OK, I haven't been able to reproduce it yet... :/
Flags: needinfo?(sledru)
I can often reproduce this, but without any clear STR. When that happens, maybe I can collect some information for you (I can use gdb or any other tool), if you tell me what you need. For what it's worth, I'm using a nightly optimized version on GNU/Linux Mint (latest release).
Benjamin,

Even just the site and select dropdown you have seen this with is a start.
Flags: needinfo?(benj)
It happened several times on Bugzilla: on a bug's page (like this one), clicking the Component dropdown caused the freezing.

Next time this happens to me, I'll carefully note down the website. This has happened to me a few times already, not only with Bugzilla, but also with other websites (can't remember which, though).
Flags: needinfo?(benj)
I am unable to reproduce this on Ubuntu 14.04.2 (VM) with gtk3.  Selects don't cause any browser hanging at all.  Perhaps there are distro variations that are involved here?  Sylvestre reports it against Debian and Benjamin against Mint.
Flags: needinfo?(twalker)
Reporter

Comment 7

4 years ago
I didn't get it for the last week on aurora... sorry for the lack of str.
Got it twice on Nightly yesterday, both times on fastmail.com (calendar section, create a new event, choose a reminder in the dropdown list), but the third time, it disappeared. Clearly an intermittent issue...
Hey bbouvier, are you willing to help us get a stack for this hang?

You'd need to get a debug build of Nightly:

https://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2015-09-15-mozilla-central-debug/firefox-43.0a1.en-US.debug-linux-x86_64.tar.bz2

Run it (it's going to be kinda slow, since it's a debug build) and make note of the process ID (you might need to use top to find it, or ps).

Reproduce the hang, and when you've hit it, attach a debugger in the terminal like this:

sudo gdb
[enter your user password]

And then in gdb:

attach [process id]

(no square brackets)

and then once it has attached, type in:

bt

To get a backtrace. That should spew out a bunch of stuff that lets us know where Firefox is stuck. Can you copy and paste that stuff into this bug?
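
The same backtrace can also be captured without an interactive session. The following is a minimal Python sketch, not part of the original instructions: the helper names `gdb_backtrace_command` and `capture_backtrace` are hypothetical, and it assumes gdb is installed and that the caller is allowed to attach to the process (this may require sudo). It relies only on gdb's `-p`, `-batch`, and `-ex` options and prints the stack of every thread rather than only the current one.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that attaches to pid, prints all thread
    stacks, and then exits (which detaches from the process)."""
    return [
        "gdb",
        "-p", str(pid),                # attach to the running process
        "-batch",                      # exit after running the -ex commands
        "-ex", "thread apply all bt",  # backtrace for every thread
    ]


def capture_backtrace(pid):
    """Run gdb and return its textual output (hypothetical helper)."""
    result = subprocess.run(
        gdb_backtrace_command(pid), capture_output=True, text=True
    )
    return result.stdout
```

For example, `print(capture_backtrace(12345))` would dump the stacks of process 12345, where the process ID is a placeholder.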
Flags: needinfo?(benj)
I could reproduce the bug on Fastmail after a lot of retries, but the build you've provided doesn't contain debug symbols... I will build my own debug build later this afternoon, I have to test against one anyways.

main process ('firefox') backtrace:
#0  0x00007ff42562961d in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#1  0x00007ff425629979 in g_mutex_lock () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#2  0x00007ff4255e7f99 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#3  0x00007ff4255e80ec in g_main_context_iteration () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#4  0x00007ff41d9ddbc5 in ?? () from /home/ben/Downloads/aaa/firefox/libxul.so
#5  0x00007ff41d9b01d2 in ?? () from /home/ben/Downloads/aaa/firefox/libxul.so
#6  0x00007ff41d9b043e in ?? () from /home/ben/Downloads/aaa/firefox/libxul.so
#7  0x00007ff41d9b059c in ?? () from /home/ben/Downloads/aaa/firefox/libxul.so
#8  0x00007ff41bc92831 in ?? () from /home/ben/Downloads/aaa/firefox/libxul.so
#9  0x00007ff41bcc0e87 in ?? () from /home/ben/Downloads/aaa/firefox/libxul.so
#10 0x00007ff41bf9eab8 in ?? () from /home/ben/Downloads/aaa/firefox/libxul.so
#11 0x00007ff41bf7d5a8 in ?? () from /home/ben/Downloads/aaa/firefox/libxul.so
#12 0x00007ff41d9b021d in ?? () from /home/ben/Downloads/aaa/firefox/libxul.so
#13 0x00007ff41e240a28 in ?? () from /home/ben/Downloads/aaa/firefox/libxul.so
#14 0x00007ff41e2a8e67 in ?? () from /home/ben/Downloads/aaa/firefox/libxul.so
#15 0x00007ff41e2a9511 in ?? () from /home/ben/Downloads/aaa/firefox/libxul.so
#16 0x00007ff41e2a9a75 in XRE_main () from /home/ben/Downloads/aaa/firefox/libxul.so
#17 0x0000000000405e8e in _start ()


plugin-container:
#0  0x00007f13a160812d in poll () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f13a4c95b74 in ?? () from /home/ben/Downloads/aaa/firefox/libxul.so
#2  0x00007f139e78ffe4 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#3  0x00007f139e7900ec in g_main_context_iteration () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#4  0x00007f13a4c95bc5 in ?? () from /home/ben/Downloads/aaa/firefox/libxul.so
#5  0x00007f13a4c681d2 in ?? () from /home/ben/Downloads/aaa/firefox/libxul.so
#6  0x00007f13a4c684a4 in ?? () from /home/ben/Downloads/aaa/firefox/libxul.so
#7  0x00007f13a4c6859c in ?? () from /home/ben/Downloads/aaa/firefox/libxul.so
#8  0x00007f13a2f4a831 in ?? () from /home/ben/Downloads/aaa/firefox/libxul.so
#9  0x00007f13a2f78e87 in ?? () from /home/ben/Downloads/aaa/firefox/libxul.so
#10 0x00007f13a3256ab8 in ?? () from /home/ben/Downloads/aaa/firefox/libxul.so
#11 0x00007f13a32355a8 in ?? () from /home/ben/Downloads/aaa/firefox/libxul.so
#12 0x00007f13a4c6821d in ?? () from /home/ben/Downloads/aaa/firefox/libxul.so
#13 0x00007f13a5564404 in XRE_RunAppShell () from /home/ben/Downloads/aaa/firefox/libxul.so
#14 0x00007f13a3257096 in ?? () from /home/ben/Downloads/aaa/firefox/libxul.so
#15 0x00007f13a32355a8 in ?? () from /home/ben/Downloads/aaa/firefox/libxul.so
#16 0x00007f13a5563e8e in XRE_InitChildProcess () from /home/ben/Downloads/aaa/firefox/libxul.so
#17 0x00000000004096b7 in ?? ()
#18 0x00007f13a153cec5 in __libc_start_main (main=0x409735, argc=10, argv=0x7fff16973938, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fff16973928) at libc-start.c:287
#19 0x0000000000409229 in _start ()
Local debug build, based off inbound 262766:2590668bd232 inbound

firefox:
#0  0x00007fcf8fde435d in write () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fcf809c9fef in nsAppShell::ScheduleNativeEventCallback (this=0x7fcf75aa2bb0) at /home/ben/code/moz/repo/widget/gtk/nsAppShell.cpp:168
#2  0x00007fcf809788ee in nsBaseAppShell::OnDispatchedEvent (this=0x7fcf75aa2bb0, thr=0x7fcf8eaea300) at /home/ben/code/moz/repo/widget/nsBaseAppShell.cpp:227
#3  0x00007fcf8097892f in non-virtual thunk to nsBaseAppShell::OnDispatchedEvent(nsIThreadInternal*) () at Unified_cpp_widget1.cpp:229
#4  0x00007fcf7cee3242 in nsThread::PutEvent(already_AddRefed<nsIRunnable>&&, nsThread::nsNestedEventTarget*) (this=0x7fcf8eaea300, 
    aEvent=<unknown type in /home/ben/code/moz/builds/browser-d64/dist/bin/libxul.so, CU 0x27b2d7, DIE 0x2b303e>, aTarget=0x0)
    at /home/ben/code/moz/repo/xpcom/threads/nsThread.cpp:567
#5  0x00007fcf7cee3564 in nsThread::DispatchInternal(already_AddRefed<nsIRunnable>&&, unsigned int, nsThread::nsNestedEventTarget*) (this=0x7fcf8eaea300, 
    aEvent=<unknown type in /home/ben/code/moz/builds/browser-d64/dist/bin/libxul.so, CU 0x27b2d7, DIE 0x2b3112>, aFlags=0, aTarget=0x0)
    at /home/ben/code/moz/repo/xpcom/threads/nsThread.cpp:619
#6  0x00007fcf7cee37d0 in nsThread::Dispatch(already_AddRefed<nsIRunnable>&&, unsigned int) (this=0x7fcf8eaea300, 
    aEvent=<unknown type in /home/ben/code/moz/builds/browser-d64/dist/bin/libxul.so, CU 0x27b2d7, DIE 0x2b3253>, aFlags=0)
    at /home/ben/code/moz/repo/xpcom/threads/nsThread.cpp:637
#7  0x00007fcf7ce9936e in nsIEventTarget::Dispatch (this=0x7fcf8eaea300, aEvent=0x7fcf4f820000, aFlags=0) at ../../dist/include/nsIEventTarget.h:37
#8  0x00007fcf80978c8d in nsBaseAppShell::DispatchDummyEvent (this=0x7fcf75aa2bb0, aTarget=0x7fcf8eaea300) at /home/ben/code/moz/repo/widget/nsBaseAppShell.cpp:313
#9  0x00007fcf80978b8e in nsBaseAppShell::OnProcessNextEvent (this=0x7fcf75aa2bb0, thr=0x7fcf8eaea300, mayWait=true)
    at /home/ben/code/moz/repo/widget/nsBaseAppShell.cpp:299
#10 0x00007fcf80978d0e in non-virtual thunk to nsBaseAppShell::OnProcessNextEvent(nsIThreadInternal*, bool) () at Unified_cpp_widget1.cpp:303
#11 0x00007fcf7cee45b6 in nsThread::ProcessNextEvent (this=0x7fcf8eaea300, aMayWait=true, aResult=0x7fff5a94702e)
    at /home/ben/code/moz/repo/xpcom/threads/nsThread.cpp:922
#12 0x00007fcf7cf5bfb7 in NS_ProcessNextEvent (aThread=0x7fcf8eaea300, aMayWait=true) at /home/ben/code/moz/repo/xpcom/glue/nsThreadUtils.cpp:277
#13 0x00007fcf7d583419 in mozilla::ipc::MessagePump::Run (this=0x7fcf8eae5bc0, aDelegate=0x7fcf8ea4d880) at /home/ben/code/moz/repo/ipc/glue/MessagePump.cpp:127
#14 0x00007fcf7d507e35 in MessageLoop::RunInternal (this=0x7fcf8ea4d880) at /home/ben/code/moz/repo/ipc/chromium/src/base/message_loop.cc:234
#15 0x00007fcf7d507d65 in MessageLoop::RunHandler (this=0x7fcf8ea4d880) at /home/ben/code/moz/repo/ipc/chromium/src/base/message_loop.cc:227
#16 0x00007fcf7d507d3d in MessageLoop::Run (this=0x7fcf8ea4d880) at /home/ben/code/moz/repo/ipc/chromium/src/base/message_loop.cc:201
#17 0x00007fcf80978653 in nsBaseAppShell::Run (this=0x7fcf75aa2bb0) at /home/ben/code/moz/repo/widget/nsBaseAppShell.cpp:156
#18 0x00007fcf81938227 in nsAppStartup::Run (this=0x7fcf75b1d300) at /home/ben/code/moz/repo/toolkit/components/startup/nsAppStartup.cpp:281
#19 0x00007fcf81a0de45 in XREMain::XRE_mainRun (this=0x7fff5a9478d0) at /home/ben/code/moz/repo/toolkit/xre/nsAppRunner.cpp:4291
#20 0x00007fcf81a0e759 in XREMain::XRE_main (this=0x7fff5a9478d0, argc=4, argv=0x7fff5a948eb8, aAppData=0x7fff5a947b70)
    at /home/ben/code/moz/repo/toolkit/xre/nsAppRunner.cpp:4384
#21 0x00007fcf81a0efe4 in XRE_main (argc=4, argv=0x7fff5a948eb8, aAppData=0x7fff5a947b70, aFlags=0) at /home/ben/code/moz/repo/toolkit/xre/nsAppRunner.cpp:4486
#22 0x0000000000406200 in do_main (argc=4, argv=0x7fff5a948eb8, xreDirectory=0x7fcf8ea39340) at /home/ben/code/moz/repo/browser/app/nsBrowserApp.cpp:212
#23 0x000000000040593c in main (argc=4, argv=0x7fff5a948eb8) at /home/ben/code/moz/repo/browser/app/nsBrowserApp.cpp:399


plugin-container:
#0  0x00007f69b899312d in poll () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f69be023f9a in PollWrapper (ufds=0x7f69a2ddf710, nfsd=5, timeout_=-1) at /home/ben/code/moz/repo/widget/gtk/nsAppShell.cpp:42
#2  0x00007f69b61f4fe4 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#3  0x00007f69b61f50ec in g_main_context_iteration () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#4  0x00007f69be02403e in nsAppShell::ProcessNextNativeEvent (this=0x7f69af844dd0, mayWait=true) at /home/ben/code/moz/repo/widget/gtk/nsAppShell.cpp:174
#5  0x00007f69bdfd2596 in nsBaseAppShell::DoProcessNextNativeEvent (this=0x7f69af844dd0, mayWait=true) at /home/ben/code/moz/repo/widget/nsBaseAppShell.cpp:138
#6  0x00007f69bdfd2b25 in nsBaseAppShell::OnProcessNextEvent (this=0x7f69af844dd0, thr=0x7f69af8c5000, mayWait=true)
    at /home/ben/code/moz/repo/widget/nsBaseAppShell.cpp:289
#7  0x00007f69bdfd2d0e in non-virtual thunk to nsBaseAppShell::OnProcessNextEvent(nsIThreadInternal*, bool) () at Unified_cpp_widget1.cpp:303
#8  0x00007f69ba53e5b6 in nsThread::ProcessNextEvent (this=0x7f69af8c5000, aMayWait=true, aResult=0x7fff30a5b25e)
    at /home/ben/code/moz/repo/xpcom/threads/nsThread.cpp:922
#9  0x00007f69ba5b5fb7 in NS_ProcessNextEvent (aThread=0x7f69af8c5000, aMayWait=true) at /home/ben/code/moz/repo/xpcom/glue/nsThreadUtils.cpp:277
#10 0x00007f69babdd419 in mozilla::ipc::MessagePump::Run (this=0x7f69af824d70, aDelegate=0x7fff30a5b6a0) at /home/ben/code/moz/repo/ipc/glue/MessagePump.cpp:127
#11 0x00007f69babde1ab in mozilla::ipc::MessagePumpForChildProcess::Run (this=0x7f69af824d70, aDelegate=0x7fff30a5b6a0)
    at /home/ben/code/moz/repo/ipc/glue/MessagePump.cpp:289
#12 0x00007f69bab61e35 in MessageLoop::RunInternal (this=0x7fff30a5b6a0) at /home/ben/code/moz/repo/ipc/chromium/src/base/message_loop.cc:234
#13 0x00007f69bab61d65 in MessageLoop::RunHandler (this=0x7fff30a5b6a0) at /home/ben/code/moz/repo/ipc/chromium/src/base/message_loop.cc:227
#14 0x00007f69bab61d3d in MessageLoop::Run (this=0x7fff30a5b6a0) at /home/ben/code/moz/repo/ipc/chromium/src/base/message_loop.cc:201
#15 0x00007f69bdfd2653 in nsBaseAppShell::Run (this=0x7f69af844dd0) at /home/ben/code/moz/repo/widget/nsBaseAppShell.cpp:156
#16 0x00007f69bf06f343 in XRE_RunAppShell () at /home/ben/code/moz/repo/toolkit/xre/nsEmbedFunctions.cpp:785
#17 0x00007f69babddfc6 in mozilla::ipc::MessagePumpForChildProcess::Run (this=0x7f69af824d70, aDelegate=0x7fff30a5b6a0)
    at /home/ben/code/moz/repo/ipc/glue/MessagePump.cpp:259
#18 0x00007f69bab61e35 in MessageLoop::RunInternal (this=0x7fff30a5b6a0) at /home/ben/code/moz/repo/ipc/chromium/src/base/message_loop.cc:234
#19 0x00007f69bab61d65 in MessageLoop::RunHandler (this=0x7fff30a5b6a0) at /home/ben/code/moz/repo/ipc/chromium/src/base/message_loop.cc:227
#20 0x00007f69bab61d3d in MessageLoop::Run (this=0x7fff30a5b6a0) at /home/ben/code/moz/repo/ipc/chromium/src/base/message_loop.cc:201
#21 0x00007f69bf06eb9c in XRE_InitChildProcess (aArgc=3, aArgv=0x7fff30a5cc88, aGMPLoader=0x0) at /home/ben/code/moz/repo/toolkit/xre/nsEmbedFunctions.cpp:621
#22 0x00000000004182d3 in content_process_main (argc=5, argv=0x7fff30a5cc88) at /home/ben/code/moz/repo/ipc/app/../contentproc/plugin-container.cpp:237
#23 0x00000000004183c2 in main (argc=6, argv=0x7fff30a5cc88) at /home/ben/code/moz/repo/ipc/app/MozillaRuntimeMain.cpp:11
Flags: needinfo?(benj) → needinfo?(mconley)
Seems low volume based on our less than reliable data. If we can get STR, please renom
tracking-e10s: --- → +
(In reply to Brad Lassey [:blassey] (use needinfo?) from comment #14)
> Seems low volume based on our less than reliable data. If we can get STR,
> please renom

For what it's worth, the STR in comment 8 reproduces quite easily for me (maybe with a 3% rate, so I can reproduce in less than 2 minutes usually). If anybody were willing to submit a patch, I could test for some time and confirm whether it fixes the issue on my machine or not.
Reporter

Comment 16

3 years ago
Tracking because it seems to affect a bunch of users and I don't think we can release gtk 3 with this bug.
Reporter

Comment 17

3 years ago
Karl, a few people pinged me about this bug and shared their frustrations. Is there anything we could do here? Thanks
Steps are in comment #15.
Flags: needinfo?(karlt)
Keywords: steps-wanted
Assignee

Comment 18

3 years ago
(In reply to Karl Tomlinson (ni?:karlt) from comment #13)
> (In reply to Benjamin Bouvier [:bbouvier] from comment #11)
> > Local debug build, based off inbound 262766:2590668bd232 inbound
> > 
> > firefox:
> > #0  0x00007fcf8fde435d in write () at ../sysdeps/unix/syscall-template.S:81
> > #1  0x00007fcf809c9fef in nsAppShell::ScheduleNativeEventCallback
> 
> If blocking on write, perhaps the pipe is full.
> It is read in the same process, so care needs to be taken not to write too
> much.

There is code to take care of that, so either the pipe is not full (and it may not be blocking on write), or the code is not behaving as intended:

https://dxr.mozilla.org/mozilla-central/rev/4e2224c009dfedfcd95035e2fc67779567c2cdea/widget/nsBaseAppShell.cpp#223
https://dxr.mozilla.org/mozilla-central/rev/4e2224c009dfedfcd95035e2fc67779567c2cdea/widget/nsBaseAppShell.cpp#66

(In reply to Sylvestre Ledru [:sylvestre] from comment #17)
> Karl, a few people pinged me about this bug and shared their frustrations.
> Is there anything we could do here? Thanks
> Steps are in comment #15.

It's best for someone who can reproduce this to debug it.  I don't know whether I can reproduce, but I won't have time to look at this before I'm on vacation.

Catching in rr is the best way to debug.  rr is not as scary as it sounds, but other means can also be used if it is easy enough to reproduce.

The first thing I'd investigate is whether that write is blocking.
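
To illustrate the hypothesis being tested: a pipe has a finite capacity, and a blocking write() to a full pipe stalls until something reads from the other end. The sketch below is a standalone Python illustration, not Firefox code; the function name `fill_pipe` is hypothetical. It makes the write end non-blocking so that a full pipe raises `BlockingIOError` instead of hanging the process.

```python
import os


def fill_pipe(chunk=4096):
    """Write to a pipe that nobody reads until it is full.

    Returns the number of bytes accepted before the kernel refused more.
    With a blocking descriptor the final write would hang instead.
    """
    r, w = os.pipe()
    os.set_blocking(w, False)  # surface "pipe full" as an exception
    written = 0
    try:
        while True:
            written += os.write(w, b"\0" * chunk)
    except BlockingIOError:
        pass  # the pipe is full; a blocking write() would stall here
    finally:
        os.close(r)
        os.close(w)
    return written
```

If the reader and the writer live on the same thread, as in the scenario suspected here, nothing can drain the pipe once a blocking write stalls, which is why the writer must limit how much it writes.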

If there's any CPU use, then running top with capital 'H' (to show threads) would help identify any spinning threads and the process they belong to.
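
Where top is inconvenient, similar per-thread information can be read from /proc on Linux. This is a rough sketch under stated assumptions: the function names are hypothetical, and it relies on the documented layout of /proc/PID/task/TID/stat, where utime and stime are fields 14 and 15. It samples CPU ticks twice and ranks threads by how much they consumed in between.

```python
import os
import time


def parse_task_stat(text):
    """Return (tid, thread name, utime + stime in clock ticks)."""
    lpar = text.index("(")
    rpar = text.rindex(")")            # the name itself may contain ')'
    tid = int(text[:lpar])
    comm = text[lpar + 1:rpar]
    rest = text[rpar + 2:].split()     # rest[0] is field 3 (state)
    return tid, comm, int(rest[11]) + int(rest[12])


def thread_cpu_ticks(pid):
    """Map tid -> (name, ticks) for every thread of pid."""
    ticks = {}
    for tid in os.listdir(f"/proc/{pid}/task"):
        try:
            with open(f"/proc/{pid}/task/{tid}/stat") as f:
                t, comm, total = parse_task_stat(f.read())
        except OSError:
            continue                   # thread exited while we were reading
        ticks[t] = (comm, total)
    return ticks


def busiest_threads(pid, interval=1.0):
    """Return (ticks used, tid, name) tuples, busiest thread first."""
    before = thread_cpu_ticks(pid)
    time.sleep(interval)
    after = thread_cpu_ticks(pid)
    deltas = [
        (total - before.get(tid, (comm, total))[1], tid, comm)
        for tid, (comm, total) in after.items()
    ]
    return sorted(deltas, reverse=True)
```

A thread that uses close to a full interval's worth of ticks is spinning rather than blocking.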

What is observed when the hang happens?
Is the select shown?
Does the mouse still send events to other processes, or does Firefox need to be killed first?

(In reply to Sylvestre Ledru [:sylvestre] from comment #16)
> Tracking because it seems to affect a bunch of users and I don't think we
> can release gtk 3 with this bug.

The only reports I've seen involve e10s.  If there's no evidence of problems without e10s then there is no need to block gtk3 port release.
Component: General → Widget: Gtk
Flags: needinfo?(karlt)
Product: Firefox → Core

Comment 19

3 years ago
When this happens for me the select is not shown.
Re-nomming - this is, apparently, a super-frequent hang that our e10s users with GTK3 are experiencing. At the very least, we probably want :canuckistani weighing in on this.
I run into this often and mconley helped me debug it a bit.  5 times today I hit it- woohoo for spinning the cpu!

after looking in gdb, there wasn't a lot there, but to confirm this is the same issue:
https://pastebin.mozilla.org/8857944

looking at top -H, I see this:
https://pastebin.mozilla.org/8857942

In addition a few other threads wake up:
Web Content, DOM Worker, URL Classifier, MOZStorage #5

this can be reproduced fairly easily, so if you need more information or want me to try a custom build, let me know.
Welcome back from vacation, karlt. :) Is comment 21 enough information, or do you need more?
Flags: needinfo?(karlt)
Assignee

Comment 23

3 years ago
(In reply to Mike Conley (:mconley) - Needinfo me! from comment #22)
> Is comment 21 enough information, or
> do you need more?

That's helpful thanks, but I don't know what is happening.

(In reply to Joel Maher (:jmaher) from comment #21)
>  9587 root  20  0 1247508 299740  66500 R 99.3  1.9  3:37.64 firefox-trunk

Assuming that is the main thread, it seems to be spinning rather than
blocking, ruling out the hypothesis in comment 13.

> #0  0x00007ffff229661a in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
> #1  0x00007ffff2296979 in g_mutex_lock () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
> #2  0x00007ffff2254699 in g_main_context_prepare () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
> #3  0x00007ffff2254f03 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
> #4  0x00007ffff22550ec in g_main_context_iteration () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
> #5  0x00007fffeafe9b11 in nsAppShell::ProcessNextNativeEvent (this=<optimized out>, mayWait=<optimized out>)

And this supports that thanks
(because this is not in the write() of comment 11, which I should have seen from
comment 10).

I don't know the purpose of the DispatchDummyEvent code.  When mFavorPerf is
positive, I wonder what, if anything, prevents that from triggering
mProcessedGeckoEvents to exit the nsBaseAppShell::OnProcessNextEvent loop
again with another dummy event.

Using the gdb command "up" to select a stack frame in
nsBaseAppShell::OnProcessNextEvent(), what does "print mFavorPerf" show?

What I usually do from gdb is "frame 0" and then run the "finish" command to
see whether the currently running function completes.  If it does, then I
repeat up each call in the stack to find which function is looping
indefinitely.

The code should be waiting on a select or poll call somewhere (and process
events as they are received) instead of spinning the cpu.

> Steps are in comment #15.

A fastmail account may be a bit inconvenient.  Bugzilla tracking flags may be
easier to access.  I haven't reproduced with
data:text/html,<select><option>One<option>Two
Flags: needinfo?(karlt)
Summary: [e10s?][gtk3?] Firefox freezes when using some select dropdown → [e10s?][gtk3?] Firefox unresponsive spinning cpu when attempting to open select dropdown
Needinfoing jmaher for karlt's questions in comment 23.
Flags: needinfo?(jmaher)
we did some debugging via irc- karlt indicated it might be best to send me a build with better instrumentation in it and I can reproduce it.
Flags: needinfo?(jmaher)
Assignee

Comment 26

3 years ago
Joel's STR were:

1. load http://alertmanager.allizom.org:8080/alerts.html?rev=b9a803752a2cb143582e6665ed3fb679eebf60b3&showAll=1&testIndex=0&platIndex=0
2. Open 7 "graphurl" links in new tabs.
3. In each of the new tabs, change the select showing "last 7 days" to
   90 days.

It seems that the browser needs to be busy in some way to trigger the infinite
loop.  The steps above reproduce within a few attempts for Joel when e10s is
enabled, but haven't reproduced without e10s.

I haven't been able to reproduce (with e10s enabled).

Joel initially caught this at

#0  0x00007ffff6ec012d in poll () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fffeafe9acb in PollWrapper (ufds=0x7fffd2e804f0, nfsd=5, timeout_=0) at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/widget/gtk/nsAppShell.cpp:42
#2  0x00007ffff2254fe4 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#3  0x00007ffff22550ec in g_main_context_iteration () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#4  0x00007fffeafe9b11 in nsAppShell::ProcessNextNativeEvent (this=<optimized out>, mayWait=<optimized out>)
    at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/widget/gtk/nsAppShell.cpp:212
#5  0x00007fffeafcb0cb in nsBaseAppShell::DoProcessNextNativeEvent (this=this@entry=0x7fffdc312740, mayWait=mayWait@entry=false)
    at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/widget/nsBaseAppShell.cpp:138
#6  0x00007fffeafcc498 in nsBaseAppShell::OnProcessNextEvent (this=0x7fffdc312740, thr=0x7ffff6bb80e0, mayWait=false)
    at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/widget/nsBaseAppShell.cpp:271
#7  0x00007fffe9d38f7e in nsThread::ProcessNextEvent (this=0x7ffff6bb80e0, aMayWait=<optimized out>, aResult=0x7fffffffceff)
    at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/xpcom/threads/nsThread.cpp:964
#8  0x00007fffe9d543de in NS_ProcessNextEvent (aThread=<optimized out>, aMayWait=aMayWait@entry=false)
    at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/xpcom/glue/nsThreadUtils.cpp:297
#9  0x00007fffe9f5366b in mozilla::ipc::MessagePump::Run (this=0x7fffe6b0cc80, aDelegate=0x7ffff6b70200)
    at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/ipc/glue/MessagePump.cpp:95
#10 0x00007fffe9f3e532 in RunHandler (this=0x7ffff6b70200) at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/ipc/chromium/src/base/message_loop.cc:227
#11 MessageLoop::Run (this=0x7ffff6b70200) at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/ipc/chromium/src/base/message_loop.cc:201
#12 0x00007fffeafc937d in nsBaseAppShell::Run (this=0x7fffdc312740) at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/widget/nsBaseAppShell.cpp:156
#13 0x00007fffeb5bcdd9 in nsAppStartup::Run (this=0x7fffdc314060) at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/toolkit/components/startup/nsAppStartup.cpp:281

Note that the poll timeout is zero, and so poll() returns immediately.
mayWait is false in NS_ProcessNextEvent() and OnProcessNextEvent().
mFavorPerf is 0.

DispatchDummyEvent() gets called when NS_ProcessNextEvent() is called again
from a different part of MessagePump::Run(), this time with mayWait true.

Run till exit from #0  mozilla::ipc::MessagePump::Run (this=0x7fffe6b0cc80, aDelegate=0x7ffff6b70200)
    at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/ipc/glue/MessagePump.cpp:96
[New Thread 0x7fffb6dfe700 (LWP 24065)]
[New Thread 0x7fffb90fe700 (LWP 24066)]
[New Thread 0x7fffb88fd700 (LWP 24067)]
[New Thread 0x7fffbb3ff700 (LWP 24068)]
[New Thread 0x7fffba5f1700 (LWP 24070)]
[New Thread 0x7fffb98ff700 (LWP 24071)]
[New Thread 0x7fffb80fc700 (LWP 24072)]
[New Thread 0x7fffb75ff700 (LWP 24073)]
[New Thread 0x7fffb65fd700 (LWP 24074)]
[New Thread 0x7fffaddff700 (LWP 24075)]
 
Breakpoint 1, nsBaseAppShell::OnProcessNextEvent (this=0x7fffdc312740, thr=0x7ffff6bb80e0, mayWait=true)
    at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/widget/nsBaseAppShell.cpp:299
299     /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/widget/nsBaseAppShell.cpp: No such file or directory.
(gdb) bt
#0  nsBaseAppShell::OnProcessNextEvent (this=0x7fffdc312740, thr=0x7ffff6bb80e0, mayWait=true)
    at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/widget/nsBaseAppShell.cpp:299
#1  0x00007fffe9d38f7e in nsThread::ProcessNextEvent (this=0x7ffff6bb80e0, aMayWait=<optimized out>, aResult=0x7fffffffceff)
    at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/xpcom/threads/nsThread.cpp:964
#2  0x00007fffe9d543de in NS_ProcessNextEvent (aThread=<optimized out>, aMayWait=aMayWait@entry=true)
    at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/xpcom/glue/nsThreadUtils.cpp:297
#3  0x00007fffe9f536ba in mozilla::ipc::MessagePump::Run (this=0x7fffe6b0cc80, aDelegate=0x7ffff6b70200)
    at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/ipc/glue/MessagePump.cpp:127
#4  0x00007fffe9f3e532 in RunHandler (this=0x7ffff6b70200) at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/ipc/chromium/src/base/message_loop.cc:227
#5  MessageLoop::Run (this=0x7ffff6b70200) at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/ipc/chromium/src/base/message_loop.cc:201
#6  0x00007fffeafc937d in nsBaseAppShell::Run (this=0x7fffdc312740) at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/widget/nsBaseAppShell.cpp:156
#7  0x00007fffeb5bcdd9 in nsAppStartup::Run (this=0x7fffdc314060) at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/toolkit/components/startup/nsAppStartup.cpp:281
Assignee

Comment 30

3 years ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=df168e63ff93

This build adds some logging.  It's not going to identify the precise cause,
but I'm hoping it might narrow down where we should look.

Run with NSPR_LOG_MODULES=BaseAppShell:5 firefox/firefox.

There will be heaps of output so don't redirect the log to a file.

Instead, I'm hoping that there will be a rapidly repeating pattern when this
bug happens.  Use ^S in the terminal to view the most recent output and look
for a pattern.  ^Q unfreezes the terminal.
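
If a chunk of the frozen terminal output is copied out, a repeating pattern can also be found mechanically. The following Python sketch is only an illustration; the function name `smallest_period` is hypothetical, and it assumes any timestamps or other per-line varying fields have already been stripped so that repeated log lines compare equal.

```python
def smallest_period(lines, max_period=50):
    """Return the smallest p such that every line equals the line p
    positions earlier, or None if no such repetition exists.

    At least two full repetitions must be present in `lines`.
    """
    n = len(lines)
    for p in range(1, min(max_period, n // 2) + 1):
        if all(lines[i] == lines[i - p] for i in range(p, n)):
            return p
    return None
```

If this returns a small number for the tail of the log, those few lines are the loop body worth reading closely.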

This output may help to determine whether we have native or Gecko events
causing the main process to spin, but I suspect Gecko events.
However, it is possible that all this output disrupts timing and so it no
longer reproduces.

I expect we're going to want to look at what is in the Gecko event queue and
being added to the queue.  For the latter, a break point at the top of
nsBaseAppShell::OnDispatchedEvent() and "bt" and "continue" repeatedly from
there will identify stacks adding new events.  I can help you through this if
you have time next week.
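
The repeated "bt" and "continue" at a breakpoint can be automated with gdb's breakpoint command lists. This is a minimal sketch that only generates a gdb command file; the function name `breakpoint_script` and the output file name are hypothetical. It assumes the script is then run with something like `gdb -p PID -x FILE` against a build with symbols.

```python
def breakpoint_script(function):
    """Return gdb commands that print a backtrace on every hit of
    `function` and then resume the process automatically."""
    return "\n".join([
        "set pagination off",  # do not stop to wait for a keypress
        f"break {function}",
        "commands",            # run the following on each breakpoint hit
        "  bt",
        "  continue",
        "end",
        "continue",            # resume the attached process
    ]) + "\n"


if __name__ == "__main__":
    with open("dispatch-trace.gdb", "w") as f:
        f.write(breakpoint_script("nsBaseAppShell::OnDispatchedEvent"))
```

As with the logging build, the extra stops may disturb timing enough that the hang no longer reproduces.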
Assignee

Comment 34

3 years ago
Joel, would you be able to try the steps in the first few paragraphs of comment 30 please with the build at 
http://archive.mozilla.org/pub/firefox/try-builds/ktomlinson@mozilla.com-df168e63ff938ea8ef84af4c851f9197fb6755d6/try-linux64/
Flags: needinfo?(jmaher)
Assignee

Updated

3 years ago
Flags: needinfo?(karlt)
I can easily reproduce this failure with the build provided, but when doing NSPR_LOG_MODULES I am unable to reproduce it.

One thing I noticed while having nspr logging is that when switching between tabs the browser is a lot slower, so I cannot whip through the tabs and watch them load.

:karlt, I am not very available today/tomorrow, could we pick this up next week to debug over irc again?
Flags: needinfo?(jmaher)
Assignee

Comment 36

3 years ago
Thanks for trying the build, Joel.
If we can try again to debug next week, that would be very helpful thanks.
Assignee

Updated

3 years ago
See Also: → 1215170
Tracking for 46 since this sounds like a significant performance problem and we are aiming e10s at 46 release.
Ah, sorry not a performance problem, a firefox freezing completely problem.
Any more progress debugging this one?
Flags: needinfo?(jmaher)
Assignee

Comment 40

3 years ago
Gecko and some system events are being processed, but X events are not being
processed.  poll() is always returning immediately to indicate that the X
server connection is ready for reading.  There are already 6288 events that
have been read and are waiting on the Display.

GDK is choosing not to process events from the Display because its
event_pause_count is non-zero, but there is still an event source telling the
GLib main loop that the X server connection is ready for events to be
processed.
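
The effect can be shown with a simplified analogy in Python. This is not GDK or GLib code, and the name `count_ready_polls` and the `paused` flag are hypothetical stand-ins for the paused event processing. A descriptor with pending data keeps reporting readiness to a zero-timeout poll for as long as nothing consumes that data, so the loop never gets a chance to sleep.

```python
import os
import select


def count_ready_polls(iterations, paused=True):
    """Poll a pipe that holds one pending byte and count how often
    poll() reports it readable. When `paused` is true the byte is never
    consumed, mimicking a source that signals readiness but is not
    allowed to process its events."""
    r, w = os.pipe()
    os.write(w, b"x")                  # one pending "event"
    poller = select.poll()
    poller.register(r, select.POLLIN)
    ready = 0
    try:
        for _ in range(iterations):
            if poller.poll(0):         # timeout 0 never blocks
                ready += 1
                if not paused:
                    os.read(r, 1)      # consuming clears the readiness
    finally:
        os.close(r)
        os.close(w)
    return ready
```

With `paused=True` every iteration finds the descriptor ready, which corresponds to the spinning main thread; with `paused=False` only the first iteration does.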

This code is GTK3 only.

Thanks to Joel for working through this.
Next step is to try to reproduce under rr to find what has left
event_pause_count set.
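
The strace excerpt quoted in this comment can be summarized with a short script. This Python sketch is an illustration only, with hypothetical function names; it assumes that the timeout value 4294967295 is strace printing the signed timeout -1 (block indefinitely) as an unsigned 32-bit number. It counts the timeouts used and which descriptors were reported ready.

```python
import re
from collections import Counter

# Matches lines such as:
#   poll([{fd=5, events=POLLIN}, ...], 5, 0) = 1 ([{fd=4, revents=POLLIN}])
POLL_RE = re.compile(
    r"poll\(\[.*?\], \d+, (?P<timeout>\d+)\) = \d+"
    r"(?: \(\[(?P<ready>.*)\]\))?"
)


def to_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed one."""
    return value - 2**32 if value >= 2**31 else value


def summarize_polls(lines):
    """Return (Counter of timeouts, Counter of ready fds)."""
    timeouts = Counter()
    ready_fds = Counter()
    for line in lines:
        m = POLL_RE.search(line)
        if not m:
            continue                   # e.g. read() and write() lines
        timeouts[to_signed32(int(m.group("timeout")))] += 1
        ready = m.group("ready") or ""
        ready_fds.update(
            int(fd) for fd in re.findall(r"fd=(\d+), revents", ready)
        )
    return timeouts, ready_fds
```

Applied to the excerpt, a summary of this kind makes it easy to see that fd 4 is reported ready on every poll() call, including the ones made with a zero timeout.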

> poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=39, events=POLLIN}, {fd=3, events=POLLIN}, {fd=38, events=POLLIN}], 5, 0) = 1 ([{fd=4, revents=POLLIN}])
> poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=39, events=POLLIN}, {fd=3, events=POLLIN}, {fd=38, events=POLLIN}], 5, 0) = 1 ([{fd=4, revents=POLLIN}])
> poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=39, events=POLLIN}, {fd=3, events=POLLIN}, {fd=38, events=POLLIN}], 5, 0) = 1 ([{fd=4, revents=POLLIN}])
> poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=39, events=POLLIN}, {fd=3, events=POLLIN}, {fd=38, events=POLLIN}], 5, 4294967295) = 1 ([{fd=4, revents=POLLIN}])
> write(40, "\372", 1)                    = 1
> poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=39, events=POLLIN}, {fd=3, events=POLLIN}, {fd=38, events=POLLIN}], 5, 0) = 2 ([{fd=4, revents=POLLIN}, {fd=39, revents=POLLIN}])
> read(39, "\372", 1)                     = 1
> poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=39, events=POLLIN}, {fd=3, events=POLLIN}, {fd=38, events=POLLIN}], 5, 0) = 1 ([{fd=4, revents=POLLIN}])
> poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=39, events=POLLIN}, {fd=3, events=POLLIN}, {fd=38, events=POLLIN}], 5, 0) = 1 ([{fd=4, revents=POLLIN}])
> poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=39, events=POLLIN}, {fd=3, events=POLLIN}, {fd=38, events=POLLIN}], 5, 0) = 1 ([{fd=4, revents=POLLIN}])
> poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=39, events=POLLIN}, {fd=3, events=POLLIN}, {fd=38, events=POLLIN}], 5, 4294967295) = 1 ([{fd=4, revents=POLLIN}])
> write(40, "\372", 1)                    = 1


(gdb) p ((GdkEventSource*) source)->display->event_pause_count
$15 = 1
(gdb) 

(gdb) bt
#0  gdk_event_source_check (source=0x7ffff6be1e00) at /build/gtk+3.0-3sSotQ/gtk+3.0-3.10.8/./gdk/x11/gdkeventsource.c:300
#1  0x00007ffff2254a61 in g_main_context_check (context=context@entry=0x7ffff6b54080, max_priority=2147483647, fds=fds@entry=0x7fffced4dd90, n_fds=n_fds@entry=5)
    at /build/buildd/glib2.0-2.40.2/./glib/gmain.c:3575
#2  0x00007ffff2254f7b in g_main_context_iterate (context=context@entry=0x7ffff6b54080, block=block@entry=0, dispatch=dispatch@entry=1, self=<optimized out>)
    at /build/buildd/glib2.0-2.40.2/./glib/gmain.c:3731
#3  0x00007ffff22550ec in g_main_context_iteration (context=0x7ffff6b54080, may_block=0) at /build/buildd/glib2.0-2.40.2/./glib/gmain.c:3795
#4  0x00007fffeafe9b11 in nsAppShell::ProcessNextNativeEvent (this=<optimized out>, mayWait=<optimized out>)
    at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/widget/gtk/nsAppShell.cpp:212
#5  0x00007fffeafcb0cb in nsBaseAppShell::DoProcessNextNativeEvent (this=this@entry=0x7fffd9f2b560, mayWait=mayWait@entry=false)
    at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/widget/nsBaseAppShell.cpp:138
#6  0x00007fffeafcc498 in nsBaseAppShell::OnProcessNextEvent (this=0x7fffd9f2b560, thr=0x7ffff6bbb050, mayWait=false)
    at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/widget/nsBaseAppShell.cpp:271
#7  0x00007fffe9d38f7e in nsThread::ProcessNextEvent (this=0x7ffff6bbb050, aMayWait=<optimized out>, aResult=0x7fffffffc63f)
    at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/xpcom/threads/nsThread.cpp:964
#8  0x00007fffe9d543de in NS_ProcessNextEvent (aThread=<optimized out>, aMayWait=aMayWait@entry=false)
    at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/xpcom/glue/nsThreadUtils.cpp:297
#9  0x00007fffe9f5366b in mozilla::ipc::MessagePump::Run (this=0x7fffe717d580, aDelegate=0x7ffff6b70540)
    at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/ipc/glue/MessagePump.cpp:95
#10 0x00007fffe9f3e532 in RunHandler (this=0x7ffff6b70540) at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/ipc/chromium/src/base/message_loop.cc:227
#11 MessageLoop::Run (this=0x7ffff6b70540) at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/ipc/chromium/src/base/message_loop.cc:201
#12 0x00007fffeafc937d in nsBaseAppShell::Run (this=0x7fffd9f2b560) at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/widget/nsBaseAppShell.cpp:156
#13 0x00007fffeb5bcdd9 in nsAppStartup::Run (this=0x7fffd9f21100) at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/toolkit/components/startup/nsAppStartup.cpp:281
#14 0x00007fffeb5f4c92 in XREMain::XRE_mainRun (this=this@entry=0x7fffffffc8e8) at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/toolkit/xre/nsAppRunner.cpp:4327
#15 0x00007fffeb5f4f56 in XREMain::XRE_main (this=this@entry=0x7fffffffc8e8, argc=argc@entry=1, argv=argv@entry=0x7fffffffddf8, aAppData=aAppData@entry=0x7fffffffcaf8)
    at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/toolkit/xre/nsAppRunner.cpp:4424
#16 0x00007fffeb5f51ae in XRE_main (argc=1, argv=0x7fffffffddf8, aAppData=0x7fffffffcaf8, aFlags=<optimized out>)
    at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/toolkit/xre/nsAppRunner.cpp:4526
#17 0x0000555555559885 in do_main (argc=1, argv=0x7fffffffddf8, xreDirectory=0x7ffff6b53900)
    at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/browser/app/nsBrowserApp.cpp:212
#18 0x0000555555558f6d in main (argc=1, argv=0x7fffffffddf8) at /build/firefox-trunk-OOjmGS/firefox-trunk-47.0~a1~hg20160125r281515/browser/app/nsBrowserApp.cpp:352
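
One plausible reading of the strace and gdb output above: with event_pause_count stuck at 1, the pending data on fd 4 is never consumed, so every non-blocking poll() reports it readable again and the main loop spins. The sketch below is not GDK or Firefox code. It uses an ordinary pipe as a stand-in for that descriptor, and the function name and parameters are hypothetical, chosen only for illustration.

```python
import os
import select


def count_immediate_wakeups(iterations=5, drain=False):
    """Poll a pipe with a zero timeout and count how often it is readable.

    The pipe stands in for the descriptor that keeps reporting POLLIN in
    the trace; this is an illustration, not the real X connection.
    """
    read_fd, write_fd = os.pipe()
    try:
        # One unread byte plays the role of events nobody processes.
        os.write(write_fd, b"\xfa")
        poller = select.poll()
        poller.register(read_fd, select.POLLIN)
        wakeups = 0
        for _ in range(iterations):
            # Timeout 0 means "return immediately", as in the trace.
            if poller.poll(0):
                wakeups += 1
                if drain:
                    # Consuming the data is what clears readiness.
                    os.read(read_fd, 1)
        return wakeups
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

With drain=False every iteration wakes up immediately, which is the busy loop; with drain=True only the first one does.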
Blocks: ship-gtk3
Summary: [e10s?][gtk3?] Firefox unresponsive spinning cpu when attempting to open select dropdown → [e10s][gtk3] Firefox unresponsive spinning cpu when attempting to open select dropdown
Assignee

Updated

3 years ago
Component: Widget → Widget: Gtk
Running the Ubuntu distribution build under rr does not let me reproduce this: rr consumes 200% of the CPU and Firefox is really, really slow :(  I also tried a local debug build with rr, and it seemed to exhibit similar behaviour, although it was so slow that it was hard to tell normal from hanging.  I did wait 2 minutes and nothing happened, so I assume I have a reproduction of this.

:karlt, ping me and we can look into this on Monday.
Flags: needinfo?(jmaher) → needinfo?(karlt)
Assignee

Comment 42

3 years ago
We didn't have any luck with rr, but this looks like
https://bugzilla.gnome.org/show_bug.cgi?id=742636

Fixed in
https://git.gnome.org/browse/gtk+/log/gdk/gdkwindow.c?h=3.14.8
Not in
https://git.gnome.org/browse/gtk+/log/gdk/gdkwindow.c?h=3.14.7

-85956736[7f0ff98386e0]: Button 1 press on 7f0fd1c97000
-85956736[7f0ff98386e0]: nsWindow [7f0fb4f11800]
-85956736[7f0ff98386e0]:         mShell 7f0fb5741210 mContainer 7f0fb4ed6b90 mGdkWindow 7f0fb57bea90 0x4c01a7b
-85956736[7f0ff98386e0]: CaptureRollupEvents 7f0fb4f11800 1
-85956736[7f0ff98386e0]: GrabPointer time=0x4752365d retry=0
-85956736[7f0ff98386e0]: GrabPointer: window not visible
-85956736[7f0ff98386e0]: nsWindow::NativeMoveResize [7f0fb4f11800] 262 206 147 114
-85956736[7f0ff98386e0]: nsWindow::Show [7f0fb4f11800] state 1
-85956736[7f0ff98386e0]: size_allocate [7f0fb4f11800] 0 0 147 114
-85956736[7f0ff98386e0]: GrabPointer time=0x4752365d retry=1
-85956736[7f0ff98386e0]: GrabPointer: window not visible
-85956736[7f0ff98386e0]: nsWindow::OnWindowStateEvent [7f0fb4f11800] changed 129 new_window_state 128
-85956736[7f0ff98386e0]: nsWindow::OnWindowStateEvent [7f0fb4f11800] changed 129 new_window_state 128
-85956736[7f0ff98386e0]: GrabPointer time=0x4752365d retry=1
-85956736[7f0ff98386e0]: configure event [7f0fb4f11800] 262 206 147 114
-85956736[7f0ff98386e0]: GetScreenBounds 262,206 | 147x114
-85956736[7f0ff98386e0]: configure event [7f0fb4f11800] 262 206 147 114
-85956736[7f0ff98386e0]: GetScreenBounds 262,206 | 147x114
-85956736[7f0ff98386e0]: Button 1 release on 7f0fd1c97000
-85956736[7f0ff98386e0]: Button 1 press on 7f0fb4f11800
-85956736[7f0ff98386e0]: Button 1 release on 7f0fb4f11800
-85956736[7f0ff98386e0]: CaptureRollupEvents 7f0fb4f11800 0
-85956736[7f0ff98386e0]: ReleaseGrabs
-85956736[7f0ff98386e0]: nsWindow::Destroy [7f0fb4f11800]
-85956736[7f0ff98386e0]: nsWindow::~nsWindow() [7f0fb4f11800]
-85956736[7f0ff98386e0]: GetScreenBounds 1366,-744 | 1235x548
-85956736[7f0ff98386e0]: GetScreenBounds 1366,-744 | 1235x548
-85956736[7f0ff98386e0]: GetScreenBounds 1366,-744 | 1235x548
-85956736[7f0ff98386e0]: GetScreenBounds 1366,-744 | 1235x548

I wrote a monkey patch at
https://treeherder.mozilla.org/#/jobs?repo=try&revision=3afbad9fbfea

Are you able to test this build for me, please, Joel?
Assignee: nobody → karlt
Status: NEW → ASSIGNED
Flags: needinfo?(karlt) → needinfo?(jmaher)
Summary: [e10s][gtk3] Firefox unresponsive spinning cpu when attempting to open select dropdown → [e10s][gtk3 versions < 3.14.8] Firefox unresponsive spinning cpu when attempting to open select dropdown
Using the build from try, I could reproduce the problem:
root@jmaher-ThinkPad-X230:~/karlt# ./firefox/firefox

(process:22398): GLib-CRITICAL **: g_path_get_basename: assertion 'file_name != NULL' failed
^C[Child 22398] ###!!! ABORT: Aborting on channel error.: file /builds/slave/try-l64-0000000000000000000000/build/src/ipc/glue/MessageChannel.cpp, line 1824
[Child 22398] ###!!! ABORT: Aborting on channel error.: file /builds/slave/try-l64-0000000000000000000000/build/src/ipc/glue/MessageChannel.cpp, line 1824

root@jmaher-ThinkPad-X230:~/karlt# 

-------------------------------------------------
Running this again with |NSPR_LOG_MODULES=Widget:5,WidgetFocus:5 ./firefox/firefox|, I can reproduce it as well, and there is a lot of log data near the end:
https://pastebin.mozilla.org/8860768
Flags: needinfo?(jmaher)
Assignee

Comment 44

3 years ago
Sorry, Joel.  My patch code was running too late to work that way.
This way works better in my testing here.

I've also added some logging to help out if it still doesn't work.
If with NSPR_LOG_MODULES=Widget:4 you see
/GdkFrameClock [0-9a-f]+ disposed with pending resume/
then it means at least most of the patch is working as intended.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=6496543d82c7
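
For anyone checking a captured log against the pattern given above, a small filter along these lines may help. It is only a sketch: the regular expression is the one quoted in the comment, while the function name and the idea of passing the log in as a string are assumptions made for illustration.

```python
import re

# Pattern quoted in the comment above, for NSPR_LOG_MODULES=Widget:4 output.
PENDING_RESUME = re.compile(
    r"GdkFrameClock [0-9a-f]+ disposed with pending resume"
)


def find_pending_resume_lines(log_text):
    """Return the log lines that match the pending-resume message."""
    return [
        line for line in log_text.splitlines()
        if PENDING_RESUME.search(line)
    ]
```

A non-empty result would suggest, per the comment, that at least most of the patch is working as intended.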
I beat on this latest build for a while and couldn't get it to hang. Quite possibly we have a fix for this!
Assignee

Updated

3 years ago
Blocks: 1204406
Assignee

Comment 48

3 years ago
Thanks again, Joel.
Summary: [e10s][gtk3 versions < 3.14.8] Firefox unresponsive spinning cpu when attempting to open select dropdown → [e10s][3.8 <= gtk version < 3.14.8] Firefox unresponsive spinning cpu when attempting to open select dropdown
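
The affected range in the new summary can be written as a simple comparison. This is a minimal sketch, assuming the version is available as a dotted string; the helper names are hypothetical, and how the running GTK version is obtained is deliberately left out.

```python
def parse_version(text):
    """Turn a dotted version such as "3.14.5" into a 3-tuple of ints."""
    parts = [int(p) for p in text.split(".")]
    # Pad short versions such as "3.8" with zeros.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def is_affected(text):
    """True if 3.8 <= version < 3.14.8, the range named in the summary."""
    return (3, 8, 0) <= parse_version(text) < (3, 14, 8)
```

By this check, the 3.14.5 from the original report and the 3.10.8 visible in the backtrace paths both fall inside the range, while 3.14.8 does not.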
Assignee

Comment 49

3 years ago
https://reviewboard.mozilla.org/r/36739/#review33291

This is similar to the original fix presented at
https://bugzilla.gnome.org/show_bug.cgi?id=742636#c1

The fix that eventually landed differed for reasons given in
https://bugzilla.gnome.org/show_bug.cgi?id=742636#c2
but it is the frame clock, not the window, that has public events, so this is easier to fix there.
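
To make the imbalance concrete, here is a toy model of the idea. It is not GTK or Firefox code: the class and method names are invented for illustration, and it assumes that a flush step raises the display's pause count and the matching resume step lowers it again, which is my reading of the linked GNOME bug rather than something verified against the GDK source.

```python
class Display:
    """Stand-in for the display; only tracks the pause counter."""

    def __init__(self):
        self.event_pause_count = 0


class FrameClock:
    """Toy frame clock whose flush and resume steps should be balanced."""

    def __init__(self, display, resume_on_dispose):
        self.display = display
        self.resume_on_dispose = resume_on_dispose
        self.pending_resume = False

    def flush_events(self):
        # Pausing event delivery; a resume is now owed.
        self.display.event_pause_count += 1
        self.pending_resume = True

    def resume_events(self):
        if self.pending_resume:
            self.display.event_pause_count -= 1
            self.pending_resume = False

    def dispose(self):
        # The workaround idea: settle the owed resume before going away.
        if self.resume_on_dispose and self.pending_resume:
            self.resume_events()


def pause_count_after_early_dispose(resume_on_dispose):
    """Dispose the clock between flush and resume; return the counter."""
    display = Display()
    clock = FrameClock(display, resume_on_dispose)
    clock.flush_events()
    clock.dispose()
    return display.event_pause_count
```

Without the resume at dispose, the model's counter is left at 1, the same value gdb printed for event_pause_count earlier in this bug; with it, the counter returns to 0.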
Comment on attachment 8723807 [details]
MozReview Request: bug 1199602 emit resume-events on GdkFrameClock if flush/resume not balanced at dispose r?acomminos

https://reviewboard.mozilla.org/r/36739/#review33293

LGTM!
Attachment #8723807 - Flags: review?(andrew) → review+
Comment on attachment 8723806 [details]
MozReview Request: bug 1199602 give existing wrap_gtk_window_check_resize internal linkage r?acomminos

https://reviewboard.mozilla.org/r/36737/#review33295
Attachment #8723806 - Flags: review?(andrew) → review+

Comment 53

3 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/307fa3883609
https://hg.mozilla.org/mozilla-central/rev/f2f4c8ce77a3
Status: ASSIGNED → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
Assignee

Updated

3 years ago
Duplicate of this bug: 1204406
Assignee

Comment 55

3 years ago
Comment on attachment 8723807 [details]
MozReview Request: bug 1199602 emit resume-events on GdkFrameClock if flush/resume not balanced at dispose r?acomminos

Approval Request Comment
[Feature/regressing bug #]: regression already fixed in GTK3 libraries, but affects Firefox GTK3 builds (bug 1186229) on systems with older libraries.
[User impact if declined]: browser can become unusable.
Bug 1204406 indicates that it also causes problems when e10s is disabled.
[Describe test coverage new/current, TreeHerder]: on m-c, but no new test coverage as test infra doesn't include these versions.
[Risks and why]: Some risk of the unexpected, limited to systems with affected GTK3 libraries.
[String/UUID change made/needed]: none.
Attachment #8723807 - Flags: approval-mozilla-aurora?
Comment on attachment 8723807 [details]
MozReview Request: bug 1199602 emit resume-events on GdkFrameClock if flush/resume not balanced at dispose r?acomminos

Needed for GTK3 with older libraries, please uplift to aurora.
Attachment #8723807 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+