Closed Bug 974004 Opened 12 years ago Closed 7 years ago

Investigate spike in Linux mochitest shutdown hangs beginning ~10-Feb

Categories

(Core :: Audio/Video: Playback, defect)

All
Linux
defect
Not set
normal

Tracking

()

RESOLVED INACTIVE

People

(Reporter: RyanVM, Unassigned)

References

Details

Beginning around 10-February, we've seen a spike in Linux mochitest shutdown hangs. Not sure if the media stuff on the stack is relevant. Logs look like this: https://tbpl.mozilla.org/php/getParsedLog.php?id=34506820&tree=Mozilla-Inbound Ubuntu VM 12.04 x64 mozilla-inbound debug test mochitest-1 on 2014-02-11 11:32:30 PST for push b6200f26d4fc slave: tst-linux64-ec2-029 12:49:04 WARNING - TEST-UNEXPECTED-FAIL | Shutdown | application timed out after 330 seconds with no output 12:49:05 INFO - Xlib: extension "RANDR" missing on display ":0". 12:49:07 WARNING - TEST-UNEXPECTED-FAIL | Shutdown | application terminated with exit code 6 12:49:07 INFO - INFO | runtests.py | Application ran for: 1:12:43.060513 12:49:07 INFO - INFO | zombiecheck | Reading PID log: /tmp/tmpO5MeSUpidlog 12:49:07 INFO - ==> process 2362 launched child process 2465 12:49:07 INFO - ==> process 2362 launched child process 2474 12:49:07 INFO - ==> process 2362 launched child process 3811 12:49:07 INFO - INFO | zombiecheck | Checking for orphan process with PID: 2465 12:49:07 INFO - INFO | zombiecheck | Checking for orphan process with PID: 2474 12:49:07 INFO - INFO | zombiecheck | Checking for orphan process with PID: 3811 12:49:17 WARNING - PROCESS-CRASH | Shutdown | application crashed [@ libpthread-2.15.so + 0xbd84] 12:49:17 INFO - Crash dump filename: /tmp/tmph4y85_/minidumps/6fe47f9b-a3b8-f8de-2f4202cc-66156387.dmp 12:49:17 INFO - Operating system: Linux 12:49:17 INFO - 0.0.0 Linux 3.2.0-23-generic #36-Ubuntu SMP Tue Apr 10 20:39:51 UTC 2012 x86_64 12:49:17 INFO - CPU: amd64 12:49:17 INFO - family 6 model 62 stepping 4 12:49:17 INFO - 1 CPU 12:49:17 INFO - Crash reason: SIGABRT 12:49:17 INFO - Crash address: 0x3e800000910 12:49:17 INFO - Thread 0 (crashed) 12:49:17 INFO - 0 libpthread-2.15.so + 0xbd84 12:49:17 INFO - rbx = 0x0000000001c62bd0 r12 = 0x0000000001c62be0 12:49:17 INFO - r13 = 0x0000000001c62c10 r14 = 0x00000000ffffffff 12:49:17 INFO - r15 = 0x00007f8757047740 rip = 0x00007f8756c33d84 12:49:17 INFO - rsp = 0x00007fff689ac470 rbp = 0x00007fff689ac4f0 12:49:17 INFO - Found by: given as instruction pointer in context 12:49:17 INFO - 1 libnspr4.so!PR_Wait [ptsynch.c:b6200f26d4fc : 689 + 0xb] 12:49:17 INFO - rip = 0x00007f8755a2dabe rsp = 0x00007fff689ac4a0 12:49:17 INFO - rbp = 0x00007fff689ac4f0 12:49:17 INFO - Found by: stack scanning 12:49:17 INFO - 2 libxul.so!mozilla::ReentrantMonitor::Wait(unsigned int) [BlockingResourceBase.cpp:b6200f26d4fc : 309 + 0x4] 12:49:17 INFO - rbx = 0x0000000001c628e0 r12 = 0x0000000000000000 12:49:17 INFO - r13 = 0x0000000001bce9f0 r14 = 0x0000000000000001 12:49:17 INFO - r15 = 0x0000000000000001 rip = 0x00007f874fda4764 12:49:17 INFO - rsp = 0x00007fff689ac500 rbp = 0x00007fff689ac530 12:49:17 INFO - Found by: call frame info 12:49:17 INFO - 3 libxul.so!nsEventQueue::GetEvent(bool, nsIRunnable**) [ReentrantMonitor.h:b6200f26d4fc : 192 + 0xb] 12:49:17 INFO - rbx = 0x0000000001c628e0 r12 = 0x00007fff689ac598 12:49:17 INFO - r13 = 0x00007fff689ac601 r14 = 0x00000000048d7f01 12:49:17 INFO - r15 = 0x0000000000000001 rip = 0x00007f874fe09e27 12:49:17 INFO - rsp = 0x00007fff689ac540 rbp = 0x00007fff689ac570 12:49:17 INFO - Found by: call frame info 12:49:17 INFO - 4 libxul.so!nsThread::ProcessNextEvent(bool, bool*) [nsThread.h:b6200f26d4fc : 97 + 0x14] 12:49:17 INFO - rbx = 0x0000000001c62860 r12 = 0x00000000080f7b01 12:49:17 INFO - r13 = 0x00007fff689ac60f r14 = 0x00000000048d7f01 12:49:17 INFO - r15 = 0x0000000000000001 rip = 0x00007f874fe0efcb 12:49:17 INFO - rsp = 0x00007fff689ac580 rbp = 0x00007fff689ac5f0 12:49:17 INFO - Found by: call frame info 12:49:17 INFO - 5 libxul.so!mozilla::StateMachineThread::SpinUntilShutdownComplete() [MediaShutdownManager.cpp:b6200f26d4fc : 72 + 0x16] 12:49:17 INFO - rbx = 0x0000000008b27420 r12 = 0x00000000080f7b70 12:49:17 INFO - r13 = 0x00007f8750ca65f2 r14 = 0x00000000048d7ff8 12:49:17 INFO - r15 = 0x0000000000000010 rip = 0x00007f8750ca94ce 12:49:17 INFO - rsp = 0x00007fff689ac600 rbp = 0x00007fff689ac620 12:49:17 INFO - Found by: call frame info 12:49:17 INFO - 6 libxul.so!mozilla::JoinStateMachineThreads [MediaShutdownManager.cpp:b6200f26d4fc : 201 + 0x7] 12:49:17 INFO - rbx = 0x0000000008b27420 r12 = 0x00000000080f7b70 12:49:17 INFO - r13 = 0x00007f8750ca65f2 r14 = 0x00000000048d7ff8 12:49:17 INFO - r15 = 0x0000000000000010 rip = 0x00007f8750caa68c 12:49:17 INFO - rsp = 0x00007fff689ac630 rbp = 0x00007fff689ac640 12:49:17 INFO - Found by: call frame info 12:49:17 INFO - 7 libxul.so!PL_DHashTableEnumerate [pldhash.cpp:b6200f26d4fc : 632 + 0x16] 12:49:17 INFO - rbx = 0x0000000000000010 r12 = 0x00000000080f7b70 12:49:17 INFO - r13 = 0x00007f8750ca65f2 r14 = 0x00000000048d7ff8 12:49:17 INFO - r15 = 0x0000000000000010 rip = 0x00007f874fdacbc3 12:49:17 INFO - rsp = 0x00007fff689ac650 rbp = 0x00007fff689ac6a0 12:49:17 INFO - Found by: call frame info 12:49:17 INFO - 8 libxul.so!mozilla::MediaShutdownManager::Shutdown() [nsTHashtable.h:b6200f26d4fc : 230 + 0x4] 12:49:17 INFO - rbx = 0x00000000048d7fb0 r12 = 0x00007f8751dfa383 12:49:17 INFO - r13 = 0x0000000001c8c258 r14 = 0x00007f8751dfa383 12:49:17 INFO - r15 = 0x0000000000000000 rip = 0x00007f8750cb3892 12:49:17 INFO - rsp = 0x00007fff689ac6b0 rbp = 0x00007fff689ac6d0 12:49:17 INFO - Found by: call frame info 12:49:17 INFO - 9 libxul.so!mozilla::MediaShutdownManager::Observe(nsISupports*, char const*, char16_t const*) [MediaShutdownManager.cpp:b6200f26d4fc : 160 + 0x7] 12:49:17 INFO - rbx = 0x00000000048d7fb0 r12 = 0x00007f8751dfa383 12:49:17 INFO - r13 = 0x0000000001c8c258 r14 = 0x00007f8751dfa383 12:49:17 INFO - r15 = 0x0000000000000000 rip = 0x00007f8750cb3915 12:49:17 INFO - rsp = 0x00007fff689ac6e0 rbp = 0x00007fff689ac6f0 12:49:17 INFO - Found by: call frame info 12:49:17 INFO - 10 libxul.so!nsObserverList::NotifyObservers(nsISupports*, char const*, char16_t const*) [nsObserverList.cpp:b6200f26d4fc : 96 + 0xe] 12:49:17 INFO - rbx = 0x0000000000000024 r12 = 0x00007fff689ac708 12:49:17 INFO - r13 = 0x0000000001c8c258 r14 = 0x00007f8751dfa383 12:49:17 INFO - r15 = 0x0000000000000000 rip = 0x00007f874fde6888 12:49:17 INFO - rsp = 0x00007fff689ac700 rbp = 0x00007fff689ac740 12:49:17 INFO - Found by: call frame info
These are currently the #1 orange on trunk.
Roc/et al: This appears to be a hang in mozilla::MediaShutdownManager::Shutdown(). Can you take a look and/or correlate against media checkins that affect threading/locking in that timeframe? Thanks
Flags: needinfo?(roc)
Related to bug 961197 (just a guess)?
Maybe fixed by bug 949346, bug 973379, or bug 973139. Or something else. There's active work in those bugs right now to fix issues like this. Ryan: When you see shutdown waiting in MediaShutdownManager::Shutdown(), it means we're waiting for something else in the media pipeline to shutdown. So we can gleam more information by looking at the other active threads' stacks. For example in the full log at: https://tbpl.mozilla.org/php/getParsedLog.php?id=34506820&tree=Mozilla-Inbound You can see a thread (31) blocked in StopDecodeThread(), and another thread (32) inside MediaDecoderStateMachine::DecodeThreadRun() that is blocked in GStreamerReader::ReadMetadata(). Which indicates Andrew's guess is on the money. ;) (Just an FYI, so you can more effectively correlate the failures that are happening).
These seem to have fallen way down the list again. Any idea why?
Today's Windows nightly hung for me with MediaShutdownManager::Shutdown() spinning and another thread blocked in AudioStream::Write(). Different bug or would you like the stacks here?
Flags: needinfo?(cpearce)
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #6) > Today's Windows nightly hung for me with MediaShutdownManager::Shutdown() > spinning and another thread blocked in AudioStream::Write(). Different bug > or would you like the stacks here? Windows and Linux have different AudioStream implementations, the Linux and Windows hanging cases are probably different, and should be different bugs. Incidentally, I've seen this hang sporadically on Windows, but I don't know how to reliably reproduce it.
Flags: needinfo?(cpearce)
Component: Audio/Video → Audio/Video: Playback
Mass closing do to inactivity. Feel free to re-open if still needed.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → INACTIVE
You need to log in before you can comment on or make changes to this bug.