Closed Bug 566208 Opened 10 years ago Closed 8 years ago

Dying Firefox processes sometimes get stuck [@ free | nsProfileLock::Unlock]

Categories

(Core :: General, defect, critical)

x86_64
Linux

Tracking


RESOLVED DUPLICATE of bug 522332
Tracking Status
blocking2.0 --- -
status2.0 --- wanted

People

(Reporter: jruderman, Unassigned)

References

Details

(Keywords: hang)

Attachments

(1 file)

Attached file stack trace from gdb
My fuzzing processes on Linux often get stuck after automation.py sends them a SIGABRT.  System Monitor reports the process as being in the "futex_wait_queue_me" waiting channel, and gdb gives this stack trace if I attach to the firefox-bin process.  They stay stuck forever (at least 6 hours).

I'm using a debug build based on mozilla-central on Ubuntu 10.04 (64-bit).  I have breakpad enabled.

Is this due to bug 522332?
(In reply to comment #0)
> Is this due to bug 522332?

Yes.

#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00007f45f4de760f in _L_lock_1172 () from /lib/libpthread.so.0
#2  0x00007f45f4de755a in __pthread_mutex_lock (mutex=0x7f45f50ea048) at pthread_mutex_lock.c:101
#3  0x0000000000402954 in malloc_mutex_lock (mutex=0x7f45f50ea048) at /home/jruderman/mozilla-central/memory/jemalloc/jemalloc.c:1368
#4  0x00000000004149e8 in arena_dalloc (arena=0x7f45f50ea040, chunk=0x7f45e3300000, ptr=0x7f45e332bea0)
    at /home/jruderman/mozilla-central/memory/jemalloc/jemalloc.c:4226
#5  0x0000000000414b08 in idalloc (ptr=0x7f45e332bea0) at /home/jruderman/mozilla-central/memory/jemalloc/jemalloc.c:4243
#6  0x0000000000418101 in free (ptr=0x7f45e332bea0) at /home/jruderman/mozilla-central/memory/jemalloc/jemalloc.c:6017
#7  0x00007f45f19d1004 in moz_free (ptr=0x7f45e332bea0) at /home/jruderman/mozilla-central/memory/mozalloc/mozalloc.cpp:81

The signal handler is trying to acquire the same arena->lock ...

#8  0x00007f45f28cb3e3 in nsProfileLock::Unlock (this=0x7f45e89fd9d0) at nsProfileLock.cpp:678
#9  0x00007f45f28ca397 in nsProfileLock::RemovePidLockFiles () at nsProfileLock.cpp:150
#10 0x00007f45f28ca3d7 in nsProfileLock::FatalSignalHandler (signo=6, info=0x7fff68ee92b0, context=0x7fff68ee9180) at nsProfileLock.cpp:166
#11 <signal handler called>
#12 0x0000000000414185 in arena_dalloc_small (arena=0x7f45f50ea040, chunk=0x7f45d2400000, ptr=0x7f45d24406c0, mapelm=0x7f45d2400620)
    at /home/jruderman/mozilla-central/memory/jemalloc/jemalloc.c:4097
#13 0x0000000000414a03 in arena_dalloc (arena=0x7f45f50ea040, chunk=0x7f45d2400000, ptr=0x7f45d24406c0)
    at /home/jruderman/mozilla-central/memory/jemalloc/jemalloc.c:4227

... that is already held by this same thread (frame #12 shows the interrupted free in progress), and it will never be released because the thread is stuck inside the signal handler waiting for it.

Signal handlers must not wait for locks that might already be held by the same
thread. The specific fix here is to avoid freeing memory from within the signal handler.
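For illustration only (this is not the actual nsProfileLock code, and the path, names, and signal set are hypothetical), a minimal sketch in C of a fatal-signal handler that sticks to async-signal-safe calls: it removes a lock file with unlink() and never calls malloc/free, so it cannot self-deadlock on an allocator lock held by the interrupted thread.

/* Illustrative sketch only -- not the nsProfileLock implementation.
 * The path buffer and function names below are hypothetical. */
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Precompute anything that would otherwise need allocation, so the
 * handler itself never has to touch the heap. */
static char gLockFilePath[256];

static void FatalSignalHandler(int signo)
{
    /* unlink(), signal() and raise() are async-signal-safe;
     * free()/malloc() are not and may block forever on an allocator
     * lock already held by the interrupted thread. */
    if (gLockFilePath[0] != '\0')
        unlink(gLockFilePath);

    /* Restore the default disposition and re-raise so the process
     * still dies with the original signal. */
    signal(signo, SIG_DFL);
    raise(signo);
}

static void InstallFatalSignalHandler(const char *lockFilePath)
{
    strncpy(gLockFilePath, lockFilePath, sizeof(gLockFilePath) - 1);
    gLockFilePath[sizeof(gLockFilePath) - 1] = '\0';

    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = FatalSignalHandler;
    sigaction(SIGABRT, &sa, NULL);
    sigaction(SIGSEGV, &sa, NULL);
}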
This bug forces me to babysit my Linux fuzzing machine.  I'd be able to fuzz more effectively if this bug were fixed.
blocking2.0: --- → ?
While it is modestly reassuring that bug 522332 is the only known cause here, it would make sense for automation.py to send SIGKILL if the process hasn't terminated within some interval after the SIGABRT was sent.
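A sketch of that escalation logic (automation.py itself is Python; the function name and the timeout value here are made up, and shown in C only to match the other example): send SIGABRT, poll for exit, and fall back to SIGKILL if the child is still alive when the timeout expires.

/* Illustrative sketch of "SIGKILL after a timeout"; the name and the
 * timeout are arbitrary, and automation.py would do this in Python. */
#include <signal.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the child exited on its own, 0 if it had to be killed. */
static int abort_then_kill(pid_t child, int timeout_seconds)
{
    kill(child, SIGABRT);

    for (int i = 0; i < timeout_seconds; i++) {
        if (waitpid(child, NULL, WNOHANG) == child)
            return 1;               /* child terminated on its own */
        sleep(1);
    }

    /* Child is wedged (e.g. deadlocked inside its SIGABRT handler);
     * SIGKILL cannot be caught or blocked, so this always works. */
    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    return 0;
}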
I don't see us blocking, but def wanted 1.9.3+
blocking2.0: ? → -
status2.0: --- → wanted
Duplicate of this bug: 582952
This should be fixed now that bug 522332 has been fixed.
If we want automation.py to send SIGKILL, then that can be a separate bug.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 522332