Closed Bug 692545 Opened 13 years ago Closed 13 years ago

Firefox hangs trying to add inotify watch to disconnected autofs mount

Categories

(Toolkit :: Downloads API, defect)

7 Branch
x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED INVALID

People

(Reporter: bugzilla.moz, Unassigned)

Details

Attachments

(5 files)

After saving a file to a network mount which is controlled by autofs, if that mount is then unmounted and unable to be mounted again (network disconnect), firefox will hang when it tries to add an inotify watch to the directory which the file was saved. The hang will only end when autofs returns from the mount attempt.

If the browser is restarted, it does not attempt to add inotify watches and operates normally.


Exact steps I used to duplicate:
Brand new firefox profile (no plugins, history, anything).
Created autofs mount "/mnt/test -fstype=cifs,credentials=/root/.cifs.creds ://192.168.0.3/share" and reloaded autofs
In firefox I saved the "Google" logo image from the "Mozilla Firefox Start Page" to /mnt/test
I then unmounted /mnt/test and added a firewall rule to drop all traffic to 192.168.0.3 to simulate network disconnect.
Within a few seconds, firefox hung trying to add an inotify watch (there were no open files on this mount)


Attached is an strace from immediately after the unmount and firewall block til when firefox hung
Also forgot to add, running firefox 7.0.1
Component: General → Download Manager
Product: Firefox → Toolkit
QA Contact: general → download.manager
Attached file automount.log
i think i oughta at least initially latch on to this one with my own autofs gripe, since they might be related fix wise.

my problem is if you do anything in firefox that involves an autofs-controlled mountpoint (download a file to, upload a file from, etc), that mountpoint will no longer stay expired, but instead firefox makes autofs immediately remount it all the time, even though there is no actual user activity happening.

key lines from automount.log:

Oct 10 19:25:08 travelmate automount[10535]: handle_packet_missing_indirect: token 1882, name server, request pid 7509

$ ps aux | grep 7509
leho      7509  4.4 14.4 986448 551760 ?       Sl   03:22  42:30    /usr/bin/firefox -P firefox7

comments?
@Leho Kraav
Yes, the issues are related. In your case the mount is still accessible, so firefox doesnt hang, it just causes it to be remounted.

I'm curious why firefox even needs inotify watches on directories in which a file has been downloaded to. This behavior started for me after I upgraded from firefox 3, so its something new. But downloads worked just fine without whatever this inotify thing is doing. Plus since the mount point is a network share, inotify is kinda useless on it anyway. :-/
Hmm. The only hit I get for "inotify_add_watch" in our source tree is from the Chromium IPC code we use in Electrolysis, which  don't think should be involved here (unless plugins are somehow involved in this?). Probably some intermediate library calling it.

A call stack (gdb bt) of Firefox when hung would be useful. You'll probably need to do this with a debug build to get useful symbols. EG one of the builds from:

 http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2011-10-10-mozilla-central-debug/
Here's the backtrace. I dont have glib compiled with debugging symbols, so some of the locations are unknown. I can recompile it if needed.


#0  0x00007fba05540557 in inotify_add_watch () from /lib64/libc.so.6
#1  0x00007fb9fdc7962c in ?? () from /usr/lib64/libgio-2.0.so.0
#2  0x00007fb9fdc7a4c2 in ?? () from /usr/lib64/libgio-2.0.so.0
#3  0x00007fb9fdc7aa59 in ?? () from /usr/lib64/libgio-2.0.so.0
#4  0x00007fb9fe7509fb in ?? () from /usr/lib64/libglib-2.0.so.0
#5  0x00007fb9fe74ffa1 in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0
#6  0x00007fb9fe754638 in ?? () from /usr/lib64/libglib-2.0.so.0
#7  0x00007fb9fe7547ec in g_main_context_iteration () from /usr/lib64/libglib-2.0.so.0
#8  0x00007fba03b9a327 in nsBaseAppShell::DoProcessNextNativeEvent (this=0x28, mayWait=-669521248)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/widget/src/xpwidgets/nsBaseAppShell.cpp:171
#9  0x00007fba03b9a567 in nsBaseAppShell::OnProcessNextEvent (this=0x7fb9eae2b2b0, 
    thr=0x7fb9f0b303d0, mayWait=1, recursionDepth=<value optimized out>)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/widget/src/xpwidgets/nsBaseAppShell.cpp:324
#10 0x00007fba03cb98ce in nsThread::ProcessNextEvent (this=0x7fb9f0b303d0, mayWait=1, 
    result=0x7fffdb323eac)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/xpcom/threads/nsThread.cpp:581
#11 0x00007fba03c8cb77 in NS_ProcessNextEvent_P (thread=0x28, mayWait=-669521248)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/obj-x86_64-unknown-linux-gnu/xpcom/build/nsThreadUtils.cpp:245
#12 0x00007fba03c28b32 in mozilla::ipc::MessagePump::Run (this=0x7fb9f0b19e00, 
    aDelegate=0x7fb9f0b1b240)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/ipc/glue/MessagePump.cpp:134
#13 0x00007fba03cd49c8 in RunHandler (this=0x28)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/ipc/chromium/src/base/message_loop.cc:202
#14 MessageLoop::Run (this=0x28)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/ipc/chromium/src/base/message_loop.cc:176
#15 0x00007fba03b9a683 in nsBaseAppShell::Run (this=0x7fb9eae2b2b0)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/widget/src/xpwidgets/nsBaseAppShell.cpp:189
#16 0x00007fba03a5f72a in nsAppStartup::Run (this=0x7fb9e9710d40)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/toolkit/components/startup/nsAppStartup.cpp:222
#17 0x00007fba032d04dd in XRE_main (argc=<value optimized out>, argv=<value optimized out>, 
    aAppData=<value optimized out>)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/toolkit/xre/nsAppRunner.cpp:3570
#18 0x00000000004020c3 in do_main (argc=3, argv=0x7fffdb3269b8)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/browser/app/nsBrowserApp.cpp:198
#19 main (argc=3, argv=0x7fffdb3269b8)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/browser/app/nsBrowserApp.cpp:281
Hmm. Not really sure what's going on. We're processing a native event, and something in glib/gio decided it wants to call inotify. Perhaps symbols for the top 5 frames (via -devel libs?) would give a clue?

CC'ing in a few people who might have some idea about this. I've no idea what might cause this -- best wild guess would be us trying to set an icon (moz-icon:) in the download manager window?
The hang sounds like an issue somewhere between GIO and autofs rather than a Firefox-specific issue.
ps -o wchan -p <firefox-pid> should give a hint of which kernel function is blocking.

I don't know why Firefox might watch the download directory.
A stack trace with libgio symbols may provide some clues.
(In reply to Karl Tomlinson (:karlt) from comment #7)
Well its not firefox itself hanging directly, but rather glib yes. But the question is what is firefox doing that makes glib want to put an inotify watch on a directory a file was downloaded to?

Here's the backtrace with glib debugging symbols

#0  0x00007f7611067557 in inotify_add_watch () from /lib64/libc.so.6
#1  0x00007f760977968c in _ik_watch (path=<value optimized out>, mask=<value optimized out>, 
    err=0x7fff0ff2f5cc) at inotify-kernel.c:298
#2  0x00007f760977a522 in _ip_start_watching (sub=0x7f75e191e970) at inotify-path.c:159
#3  0x00007f760977aab9 in im_scan_missing (user_data=<value optimized out>) at inotify-missing.c:118
#4  0x00007f760a25061b in g_timeout_dispatch (source=0x7f75e1178320, callback=0x7f75e15e8d80, 
    user_data=0x1002fce) at gmain.c:3895
#5  0x00007f760a24fbc1 in g_main_dispatch (context=0x7f761092d6e0) at gmain.c:2441
#6  g_main_context_dispatch (context=0x7f761092d6e0) at gmain.c:3014
#7  0x00007f760a254258 in g_main_context_iterate (context=0x7f761092d6e0, block=<value optimized out>, 
    dispatch=<value optimized out>, self=<value optimized out>) at gmain.c:3092
#8  0x00007f760a25440c in g_main_context_iteration (context=0x7f761092d6e0, may_block=1) at gmain.c:3155
#9  0x00007f760f69a327 in nsBaseAppShell::DoProcessNextNativeEvent (this=0x4, mayWait=-513897088)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/widget/src/xpwidgets/nsBaseAppShell.cpp:171
#10 0x00007f760f69a567 in nsBaseAppShell::OnProcessNextEvent (this=0x7f75f692b2b0, thr=0x7f75fc6303d0, 
    mayWait=1, recursionDepth=<value optimized out>)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/widget/src/xpwidgets/nsBaseAppShell.cpp:324
#11 0x00007f760f7b98ce in nsThread::ProcessNextEvent (this=0x7f75fc6303d0, mayWait=1, result=0x7fff0ff2f89c)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/xpcom/threads/nsThread.cpp:581
#12 0x00007f760f78cb77 in NS_ProcessNextEvent_P (thread=0x4, mayWait=-513897088)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/obj-x86_64-unknown-linux-gnu/xpcom/build/nsThreadUtils.cpp:245
#13 0x00007f760f728b32 in mozilla::ipc::MessagePump::Run (this=0x7f75fc619e00, aDelegate=0x7f75fc61b240)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/ipc/glue/MessagePump.cpp:134
#14 0x00007f760f7d49c8 in RunHandler (this=0x4)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/ipc/chromium/src/base/message_loop.cc:202
#15 MessageLoop::Run (this=0x4)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/ipc/chromium/src/base/message_loop.cc:176
#16 0x00007f760f69a683 in nsBaseAppShell::Run (this=0x7f75f692b2b0)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/widget/src/xpwidgets/nsBaseAppShell.cpp:189
#17 0x00007f760f55f72a in nsAppStartup::Run (this=0x7f75f520fd40)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/toolkit/components/startup/nsAppStartup.cpp:222
#18 0x00007f760edd04dd in XRE_main (argc=<value optimized out>, argv=<value optimized out>, 
    aAppData=<value optimized out>)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/toolkit/xre/nsAppRunner.cpp:3570
#19 0x00000000004020c3 in do_main (argc=3, argv=0x7fff0ff323a8)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/browser/app/nsBrowserApp.cpp:198
#20 main (argc=3, argv=0x7fff0ff323a8)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/browser/app/nsBrowserApp.cpp:281
Better stack trace. Rebuilt glib without optimization flags.


#0  0x00007ffff7416557 in inotify_add_watch () from /lib64/libc.so.6
#1  0x00007fffefb33724 in _ik_watch (path=0x7fffcaa3fa80 "/mnt/test", mask=16789454, err=0x7fffffffb530)
    at inotify-kernel.c:298
#2  0x00007fffefb349ab in _ip_start_watching (sub=0x7fffcbef0580) at inotify-path.c:159
#3  0x00007fffefb35653 in im_scan_missing (user_data=0x0) at inotify-missing.c:118
#4  0x00007ffff0626d8f in g_timeout_dispatch (source=0x7fffca82d0b0, 
    callback=0x7fffefb354df <im_scan_missing>, user_data=0x0) at gmain.c:3895
#5  0x00007ffff0623554 in g_main_dispatch (context=0x7ffff6d2d600) at gmain.c:2441
#6  0x00007ffff0624c4b in g_main_context_dispatch (context=0x7ffff6d2d600) at gmain.c:3014
#7  0x00007ffff0625163 in g_main_context_iterate (context=0x7ffff6d2d600, block=1, dispatch=1, 
    self=0x7ffff6d309c0) at gmain.c:3092
#8  0x00007ffff0625381 in g_main_context_iteration (context=0x7ffff6d2d600, may_block=1) at gmain.c:3155
#9  0x00007ffff5a9a327 in nsBaseAppShell::DoProcessNextNativeEvent (this=0x3d, mayWait=-895223168)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/widget/src/xpwidgets/nsBaseAppShell.cpp:171
#10 0x00007ffff5a9a567 in nsBaseAppShell::OnProcessNextEvent (this=0x7fffdcd2b2b0, thr=0x7fffe2a2c480, 
    mayWait=1, recursionDepth=<value optimized out>)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/widget/src/xpwidgets/nsBaseAppShell.cpp:324
#11 0x00007ffff5bb98ce in nsThread::ProcessNextEvent (this=0x7fffe2a2c480, mayWait=1, result=0x7fffffffb81c)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/xpcom/threads/nsThread.cpp:581
#12 0x00007ffff5b8cb77 in NS_ProcessNextEvent_P (thread=0x3d, mayWait=-895223168)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/obj-x86_64-unknown-linux-gnu/xpcom/build/nsThreadUtils.cpp:245
#13 0x00007ffff5b28b32 in mozilla::ipc::MessagePump::Run (this=0x7fffe2a19e00, aDelegate=0x7fffe2a1b240)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/ipc/glue/MessagePump.cpp:134
#14 0x00007ffff5bd49c8 in RunHandler (this=0x3d)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/ipc/chromium/src/base/message_loop.cc:202
#15 MessageLoop::Run (this=0x3d)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/ipc/chromium/src/base/message_loop.cc:176
#16 0x00007ffff5a9a683 in nsBaseAppShell::Run (this=0x7fffdcd2b2b0)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/widget/src/xpwidgets/nsBaseAppShell.cpp:189
#17 0x00007ffff595f72a in nsAppStartup::Run (this=0x7fffdb60fd40)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/toolkit/components/startup/nsAppStartup.cpp:222
#18 0x00007ffff51d04dd in XRE_main (argc=<value optimized out>, argv=<value optimized out>, 
    aAppData=<value optimized out>)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/toolkit/xre/nsAppRunner.cpp:3570
#19 0x00000000004020c3 in do_main (argc=3, argv=0x7fffffffe328)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/browser/app/nsBrowserApp.cpp:198
#20 main (argc=3, argv=0x7fffffffe328)
    at /var/tmp/portage/www-client/firefox-7.0.1-r1/work/mozilla-release/browser/app/nsBrowserApp.cpp:281
Would you be able to set a breakpoint on ip_watched_dir_new and get a stack trace there when path == "/mnt/test", please?

Running "set im_debug_enabled = 1" from gdb can start gio/inotify logging debug info.
Ok, have 2 attachments for you. The first is gdb output from a break on ip_watched_dir_new when I first save the file in firefox.

The second is also a break on ip_watched_dir_new but this one occurs when /mnt/test is unmounted. Those GIO-WARNING messages appear almost exactly when the filesystem is unmounted, and then a few seconds later the break point triggers.
Looks like the g_file_monitor_directory is called from gtk_folder_constructed,
and the GtkFolder is created from somewhere in gtkfilechooserentry.c.
I expect that is so that the directory list can be updated as files are
added or removed.

There is then the question of why gio/inotify still has this folder in
missing_sub_list after the file chooser dialog has closed.

I doubt it is a Firefox issue and I doubt there's anything we can do from
Firefox because gtk_file_chooser_dialog_finalize is getting called as
expected.

Have you tried doing similar operations using the gtk file chooser in other
apps?  Perhaps try saving a copy from evince.
(In reply to Karl Tomlinson (:karlt) from comment #13)
You appear to be correct. I tried saving a copy of a document in evince to the mount point, verified there were no open files on the mount, unmounted, and then firewalled it off, and evince hung. So it would appear this issue is not firefox's doing.
Do you know whether the bug would be in gtk, or in glib? Just want to know where I should be taking this bug report to.

And thanks for your efforts.
I suspect the hang is an issue in autofs or something kernel-side as it should be fine to call inotify_add_watch.

However, there may be an issue with a leak on gio or gtk.  It's also possible that the relevant objects are released on a timer - I don't know.  If there's a leak there, I'd guess GIO first.  The people looking at GIO bugs probably know about GTK too, so they can redirect if appropriate.
Well I think autofs is behaving as expected. Something tries to operate on the mount point, so autofs goes and tries to mount it. Though maybe it is debatable that adding a watch shouldnt trigger a mount.

I'll go take this bug elsewhere, thanks again for the help.
Status: UNCONFIRMED → RESOLVED
Closed: 13 years ago
Resolution: --- → INVALID
Pardon me for not noticing the developments here before. Comment 2 is still a bugger for me in the newest TB + FF 11. One day I realized that the remounting is probably caused by the fact that after you commit an operation on an autofs mount point, the file chooser dialog retains that path as "last used" and that causes the endless remounts to happen.

Problem is I have hooked notify-send from syslog to notify me about mounts. 2 repeating notifications every minute is not cool.

I'm not entirely sure how autofs is able to "safely" unmount, but I guess then it's the periodic check of inotify that leaves a pause where autofs can unmount?

Either way, Patrick did you take this bug anywhere else? I'd really like to comfortably use autofs mountpoints directly from FF/TB (or I guess all apps) and would like to explore what I can do about it. I wonder if this has been in improved in GTK3?
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: