Clicking Save button to close GTK+ 3.x Save dialog probabilistically hangs UI thread

RESOLVED DUPLICATE of bug 1199602

Status

Core
Widget: Gtk
2 years ago
2 years ago

People

(Reporter: Stephan Sokolow, Assigned: karlt)

Tracking

({hang})

42 Branch
Unspecified
Linux
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

2 years ago
User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:42.0) Gecko/20100101 Firefox/42.0
Build ID: 20150819080340

Steps to reproduce:

As part of using the browser normally:
1. Be running Firefox Developer Edition on Linux (a Lubuntu 14.04 install which was switched to kubuntu-desktop, in my case)
2. Use some mixture of "Save Page As..." and "Save Image As..." repeatedly


Actual results:

Since the switch to GTK+ 3.x builds, clicking the Save button in the GTK+ Save As... dialog has a low (maybe about 2% on my machine) probability of freezing the browser's UI immediately after the dialog goes away.

(However, the browser's resource usage patterns appear to be identical to if I'd just taken my hand off the mouse after clicking the button, so I can only assume that it's only the UI thread that's freezing and everything else continues to run normally.)

In case it's relevant, this happens with or without desktop compositing.


Expected results:

In GTK+ 2.x builds, this Just Works™ every time, allowing me to attain Firefox uptimes on the order of a week or two before something forces me to reboot my computer. (I just turn off my monitors when it's time for bed, leaving my work as-is for the next morning.)
(Reporter)

Comment 1

2 years ago
I've got some new information. It's not a GUI thread hang... it's just that the GUI stops processing input events.

(It happened with a file large enough that I was able to observe that the download progress indicators update normally while the UI is completely unresponsive to input and the KWin-provided close button ends up displaying the "Do you want to kill this hung application?" dialog.)
(Reporter)

Comment 2

2 years ago
Oh, I also forgot to mention that I managed to trigger it via the Cancel button, so it's Firefox becoming nonresponsive to input on the dismissal of a GTK+ Save dialog, regardless of how it was dismissed.

Updated

2 years ago
Component: Untriaged → Widget: Gtk
Keywords: hang
OS: Unspecified → Linux
Product: Firefox → Core
(Assignee)

Updated

2 years ago
Blocks: 627699
(Reporter)

Comment 3

2 years ago
Another detail discovered: while frozen like this, Firefox doesn't visibly respond to remote commands (e.g. opening a URL). It also doesn't display the profile manager within a minute, as it normally does on my system when the browser is so bogged down that the request to open a new URL times out (which can easily happen when I middle-click a lot of Amazon.ca links in rapid succession).

Instead, the appearance of the profile manager is deferred until Firefox is killed, no matter how long that may take. (I had to switch to Chrome to do other things and wait 13 minutes for a large download to complete before I could kill Firefox to recover access to my existing tabs. The profile manager waited until then to pop up in response to the remote command.)
Status: UNCONFIRMED → NEW
Ever confirmed: true
(Reporter)

Comment 4

2 years ago
Ooh, interesting. It happened while compositing was turned off and that revealed that, not only is it non-responsive to input events, it's also non-responsive to X11 damage notifications, which, I suspect, means that it's non-responsive to X11 events in general.

(i.e. despite updating internally-sourced stuff like GIF favicons and the download progress normally, it's possible to smear another window over the static portions of the window when compositing is turned off.)
(Assignee)

Comment 5

2 years ago
Is e10s enabled?
i.e. is "Enable multi-process" in Preferences -> General ticked?
If so, does disabling that avoid the problem?
(Reporter)

Comment 6

2 years ago
No, I already have e10s disabled.

In fact, I disabled it specifically to see if doing so helped this issue (it doesn't) and kept it that way because various extensions I run (eg. NoScript) weren't fully compatible with e10s.
(Assignee)

Comment 7

2 years ago
(In reply to Stephan Sokolow from comment #0)
> (However, the browser's resource usage patterns appear to be identical to if
> I'd just taken my hand off the mouse after clicking the button, so I can
> only assume that it's only the UI thread that's freezing and everything else
> continues to run normally.)

Is the CPU busy?

Does "top -H" identify any busy threads?

Can you run "strace -p <firefox-pid>" please and look for a pattern?

What GTK3 version is installed?
"ls -l /usr/lib64/libgtk-3.so.0" or similar if in a different location will provide this info.
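For anyone who would rather script the version check than read "ls -l" output, here is a minimal sketch in Python. It only follows the symlink and splits the numeric suffix of the file name; the helper names are made up for illustration, and the library path is an assumption that varies between distributions.

```python
import os
import re


def versioned_library_name(path):
    """Follow symlinks and return the file name of the real library."""
    return os.path.basename(os.path.realpath(path))


def soname_suffix(name):
    """Return the numeric fields after '.so.' as a tuple of ints, or None."""
    match = re.search(r"\.so\.(\d+(?:\.\d+)*)$", name)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


if __name__ == "__main__":
    # Assumed location; adjust for the distribution in use.
    candidate = "/usr/lib64/libgtk-3.so.0"
    real_name = versioned_library_name(candidate)
    print(real_name, soname_suffix(real_name))
```

The tuple is just the raw suffix of the file name; mapping it to a GTK+ release number is left to the reader.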
(Reporter)

Comment 8

2 years ago
I just checked and it does seem to be busy-waiting, since it has a tendency to sit at over 95% CPU while frozen (top -H still just calls the thread in question "firefox")... but it'll sometimes drop as low as 3% CPU and stay that way for 5-10 seconds.

As for a pattern, it seems to repeat this block of calls initially, bounded only by the speed limit imposed by writing to the terminal...

poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=33, events=POLLIN}, {fd=3, events=POLLIN}, {fd=32, events=POLLIN}, {fd=79, events=POLLIN}, {fd=68, events=POLLIN}], 7, 0) = 0 (Timeout)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=33, events=POLLIN}, {fd=3, events=POLLIN}, {fd=32, events=POLLIN}, {fd=79, events=POLLIN}, {fd=68, events=POLLIN}], 7, 0) = 0 (Timeout)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=33, events=POLLIN}, {fd=3, events=POLLIN}, {fd=32, events=POLLIN}, {fd=79, events=POLLIN}, {fd=68, events=POLLIN}], 7, 0) = 0 (Timeout)
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=33, events=POLLIN}, {fd=3, events=POLLIN}, {fd=32, events=POLLIN}, {fd=79, events=POLLIN}, {fd=68, events=POLLIN}], 7, 4294967295) = 1 ([{fd=33, revents=POLLIN}])
read(33, "\372", 1)                     = 1

...and then, possibly in part because urxvt and GNU screen are now consuming 80% of the CPU, it eventually switches to this instead:

poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=33, events=POLLIN}, {fd=3, events=POLLIN}, {fd=32, events=POLLIN}, {fd=79, events=POLLIN}, {fd=68, events=POLLIN}], 7, 0) = 1 ([{fd=4, revents=POLLIN}])
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=33, events=POLLIN}, {fd=3, events=POLLIN}, {fd=32, events=POLLIN}, {fd=79, events=POLLIN}, {fd=68, events=POLLIN}], 7, 0) = 1 ([{fd=4, revents=POLLIN}])
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=33, events=POLLIN}, {fd=3, events=POLLIN}, {fd=32, events=POLLIN}, {fd=79, events=POLLIN}, {fd=68, events=POLLIN}], 7, 0) = 1 ([{fd=4, revents=POLLIN}])
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=33, events=POLLIN}, {fd=3, events=POLLIN}, {fd=32, events=POLLIN}, {fd=79, events=POLLIN}, {fd=68, events=POLLIN}], 7, 4294967295) = 1 ([{fd=4, revents=POLLIN}])
write(34, "\372", 1)                    = 1
poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=33, events=POLLIN}, {fd=3, events=POLLIN}, {fd=32, events=POLLIN}, {fd=79, events=POLLIN}, {fd=68, events=POLLIN}], 7, 0) = 2 ([{fd=4, revents=POLLIN}, {fd=33, revents=POLLIN}])
read(33, "\372", 1)                     = 1
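A trace like the one above can also be summarised mechanically. The following is a rough sketch, not part of the original report, that tallies which descriptors poll() reports as ready and which ones are subsequently read. The function names are hypothetical, and it assumes each strace record sits on a single line in the format shown here.

```python
import re
from collections import Counter

# Matches entries in the result part of a poll() line, e.g. {fd=4, revents=POLLIN}.
# Request entries use "events=" directly after the comma, so they do not match.
READY_FD = re.compile(r"\{fd=(\d+), revents=")
READ_CALL = re.compile(r"^read\((\d+),")


def tally(lines):
    """Count how often each fd is reported ready by poll() and how often it is read."""
    ready = Counter()
    reads = Counter()
    for line in lines:
        line = line.strip()
        if line.startswith("poll("):
            ready.update(int(fd) for fd in READY_FD.findall(line))
        else:
            match = READ_CALL.match(line)
            if match:
                reads[int(match.group(1))] += 1
    return ready, reads


def ready_but_never_read(lines):
    """Return the fds that poll() flagged as readable but that no read() touched."""
    ready, reads = tally(lines)
    return sorted(fd for fd in ready if reads[fd] == 0)
```

Feeding it the lines of a saved strace log (for example via open(path).readlines()) returns the descriptors that were flagged as readable but never read within that log.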

As for the GTK+ 3.x version, it's "libgtk-3.so.0.1000.8". (If it helps, I'm running an up-to-date Lubuntu 14.04 LTS install.)
(Assignee)

Comment 9

2 years ago
(In reply to Stephan Sokolow from comment #8)
> poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=33, events=POLLIN},
> {fd=3, events=POLLIN}, {fd=32, events=POLLIN}, {fd=79, events=POLLIN},
> {fd=68, events=POLLIN}], 7, 0) = 1 ([{fd=4, revents=POLLIN}])
> poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=33, events=POLLIN},
> {fd=3, events=POLLIN}, {fd=32, events=POLLIN}, {fd=79, events=POLLIN},
> {fd=68, events=POLLIN}], 7, 0) = 1 ([{fd=4, revents=POLLIN}])
> poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=33, events=POLLIN},
> {fd=3, events=POLLIN}, {fd=32, events=POLLIN}, {fd=79, events=POLLIN},
> {fd=68, events=POLLIN}], 7, 0) = 1 ([{fd=4, revents=POLLIN}])
> poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=33, events=POLLIN},
> {fd=3, events=POLLIN}, {fd=32, events=POLLIN}, {fd=79, events=POLLIN},
> {fd=68, events=POLLIN}], 7, 4294967295) = 1 ([{fd=4, revents=POLLIN}])
> write(34, "\372", 1)                    = 1
> poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}, {fd=33, events=POLLIN},
> {fd=3, events=POLLIN}, {fd=32, events=POLLIN}, {fd=79, events=POLLIN},
> {fd=68, events=POLLIN}], 7, 0) = 2 ([{fd=4, revents=POLLIN}, {fd=33,
> revents=POLLIN}])
> read(33, "\372", 1)                     = 1

Thanks.  This poll returning fd 4 ready for read, but no read happening is consistent with https://bugzilla.gnome.org/show_bug.cgi?id=742636
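To illustrate why a descriptor that is ready but never read leads to the high CPU usage reported above: poll() is level-triggered, so it keeps returning immediately for as long as unread data is pending. The sketch below demonstrates only that general mechanism with an ordinary pipe. It is not GTK+ or Firefox code, the function name is made up, and it assumes a platform where Python's select.poll is available (e.g. Linux).

```python
import os
import select


def count_immediate_wakeups(iterations=1000, drain=False):
    """Poll a pipe holding one unread byte and count how often it is reported ready.

    With drain=False the byte is never read, so every poll() call reports the
    descriptor as readable. With drain=True the byte is consumed on the first
    wakeup and later polls report nothing.
    """
    read_fd, write_fd = os.pipe()
    try:
        os.write(write_fd, b"\xfa")  # one pending byte, like the "\372" in the trace
        poller = select.poll()
        poller.register(read_fd, select.POLLIN)
        wakeups = 0
        for _ in range(iterations):
            events = poller.poll(0)  # timeout of 0 ms: return immediately
            if events:
                wakeups += 1
                if drain:
                    os.read(read_fd, 1)
        return wakeups
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

An event loop that treats every such wakeup as work to do, without ever consuming the data, will spin instead of sleeping in poll().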

> As for the GTK+ 3.x version, it's "libgtk-3.so.0.1000.8".

As is this version.

You may like to try the build here, but it is based on Nightly and so a separate profile is best:
http://archive.mozilla.org/pub/firefox/try-builds/ktomlinson@mozilla.com-6496543d82c74648215d98fbe8fa606e89668837/try-linux64/
Assignee: nobody → karlt
Depends on: 1199602
(Reporter)

Comment 10

2 years ago
> You may like to try the build here, but it is based on Nightly and so a
> separate profile is best:
> http://archive.mozilla.org/pub/firefox/try-builds/ktomlinson@mozilla.com-6496543d82c74648215d98fbe8fa606e89668837/try-linux64/

Thanks, but my schedule's a mess right now, so I'm not sure if or when I'll have time to install and run a profile purely for testing that. (All of the testing I've been doing so far has just been examining the aftermath when it freezes up under normal use.)

However, knowing where the problem lies, maybe I can find a backport of a newer GTK+ 3.x. Failing that, we are coming up on *buntu 16.04 which, if I remember correctly, is the next LTS release.
(Assignee)

Comment 11

2 years ago
OK.  A workaround landed in bug 1199602, and I'll try to get that on affected branches.  I'll dupe this there as it seems to be the same cause.
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1199602