Closed Bug 908462 Opened 11 years ago Closed 11 years ago

long CC delays with DeadlockDetector during webaudio mochitests

Categories

(Core :: Web Audio, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla26

People

(Reporter: karlt, Assigned: khuey)

References

Details

(Whiteboard: [blocking-webaudio-])

Attachments

(1 file)

These are timings for some webaudio mochitests with a debug build on my desktop Linux system.
Note variability of 30 seconds or so.
I haven't looked at tinderbox timings yet.

/tests/content/media/webaudio/test/test_pannerNode_equalPower.html
2275ms
25008ms
30080ms
19739ms
30282ms
3127ms
2815ms

/tests/content/media/webaudio/test/test_scriptProcessorNodeChannelCount.html
37544ms
1477ms
418ms
/tests/content/media/webaudio/test/test_scriptProcessorNodeZeroInputOutput.html

32745ms
414ms
376ms

When the tests take longer, the main thread is busy occupying one CPU.

Attaching a debugger usually finds the thread in DestroyMediaStream in cycle collection.  nsCycleCollector::CollectWhite() takes several seconds to finish.
CCing the CC folks.  :-)

We suspect that this bug is the reason why we had to disable these tests on Firefox OS, which is very unfortunate.
Blocks: 908306
CollectWhite means that there is tons of garbage to collect.
Well, also some unusual/slow Unlink implementation might cause slowness.
If you uncomment #define COLLECT_TIME_DEBUG in xpcom/base/nsCycleCollector.cpp then you'll get more detailed timing information dumped to printf, though I'm not sure if that is viewable in a useful way on B2G.
Though if the time is mostly being spent in DestroyMediaStream, that hardly seems like the cycle collector's fault. ;)
Indeed. AudioNode ends up calling DestroyMediaStream in Unlink.
DestroyMediaStream has MutexAutoLock so it may need to wait for the lock...
(In reply to comment #6)
> Indeed. AudioNode ends up calling DestroyMediaStream in Unlink.
> DestroyMediaStream has MutexAutoLock so it may need to wait for the lock...

Do we actually end up grabbing the lock?  If yes, for which node types?
I've seen stacks with GainNode and AudioBufferSourceNode taking the lock during Unlink(), but they are in BlockingResourceBase::CheckAcquire(), not waiting for the lock, and the business of the main thread suggests there are no long waits for the lock.  I have only a small sample.  I need to disable DeadlockDetector to check whether that is slowing things down.  Finishing each DeadlockDetector function seems fast, but if there are many objects then the costs may be adding up.

I assume DeadlockDetector is not enabled on b2g opt builds.

e.g.
#0  0x00007f799cadc775 in nsTArray_base<nsTArrayInfallibleAllocator, nsTArray_CopyElements<PLHashEntry*> >::Length (this=0x7f78c57b7e18)
    at /home/karl/moz/dev/xpcom/build/../glue/nsTArray.h:371
#1  0x00007f799cadcb73 in nsTArray_Impl<PLHashEntry*, nsTArrayInfallibleAllocator>::BinaryIndexOf<PLHashEntry const*, nsDefaultComparator<PLHashEntry*, PLHashEntry const*> > (this=0x7f78c57b7e18, item=@0x7fffe1abe698: 0x4bad7e0, 
    comp=...) at /home/karl/moz/dev/xpcom/build/../glue/nsTArray.h:1019
#2  0x00007f799cadc763 in nsTArray_Impl<PLHashEntry*, nsTArrayInfallibleAllocator>::BinaryIndexOf<PLHashEntry const*> (this=0x7f78c57b7e18, 
    item=@0x7fffe1abe698: 0x4bad7e0)
    at /home/karl/moz/dev/xpcom/build/../glue/nsTArray.h:1039
#3  0x00007f799cadc0d3 in mozilla::DeadlockDetector<mozilla::BlockingResourceBase::DeadlockDetectorEntry>::IsOrdered (this=0x894920, aFirst=0x7f78c4391a80, 
    aSecond=0x4bad7e0) at ../../dist/include/mozilla/DeadlockDetector.h:273
#4  0x00007f799cadbd85 in mozilla::DeadlockDetector<mozilla::BlockingResourceBase::DeadlockDetectorEntry>::InTransitiveClosure (this=0x894920, 
    aStart=0x7f78c4391a80, aTarget=0x4bad7e0)
    at ../../dist/include/mozilla/DeadlockDetector.h:460
#5  0x00007f799cadbde0 in mozilla::DeadlockDetector<mozilla::BlockingResourceBase::DeadlockDetectorEntry>::InTransitiveClosure (this=0x894920, 
    aStart=0x7f78c4016840, aTarget=0x4bad7e0)
    at ../../dist/include/mozilla/DeadlockDetector.h:467
#6  0x00007f799cadbde0 in mozilla::DeadlockDetector<mozilla::BlockingResourceBase::DeadlockDetectorEntry>::InTransitiveClosure (this=0x894920, 
    aStart=0x419a240, aTarget=0x4bad7e0)
    at ../../dist/include/mozilla/DeadlockDetector.h:467
#7  0x00007f799cadbde0 in mozilla::DeadlockDetector<mozilla::BlockingResourceBase::DeadlockDetectorEntry>::InTransitiveClosure (this=0x894920, 
    aStart=0x92fbc0, aTarget=0x4bad7e0)
    at ../../dist/include/mozilla/DeadlockDetector.h:467
#8  0x00007f799cadb861 in mozilla::DeadlockDetector<mozilla::BlockingResourceBase::DeadlockDetectorEntry>::CheckAcquisition (this=0x894920, aLast=0x92fb80, 
    aProposed=0x39e8150, aCallContext=...)
    at ../../dist/include/mozilla/DeadlockDetector.h:428
#9  0x00007f799cadab1a in mozilla::BlockingResourceBase::CheckAcquire (
    this=0x39e8100, aCallContext=...)
    at /mnt/sda11/karl/obj/xpcom/build/BlockingResourceBase.cpp:104
#10 0x00007f799cadb106 in mozilla::OffTheBooksMutex::Lock (this=0x39e8100)
    at /mnt/sda11/karl/obj/xpcom/build/BlockingResourceBase.cpp:224
#11 0x00007f799a8b8d92 in mozilla::BaseAutoLock<mozilla::Mutex>::BaseAutoLock
    (this=0x7fffe1abe9f0, aLock=..., _notifier=...)
    at ../../dist/include/mozilla/Mutex.h:173
#12 0x00007f799b5cb388 in mozilla::dom::AudioNode::DestroyMediaStream (
    this=0x517b1e0)
    at /home/karl/moz/dev/content/media/webaudio/AudioNode.cpp:316
#13 0x00007f799b5ca9c5 in mozilla::dom::AudioNode::DisconnectFromGraph (
    this=0x517b1e0)
    at /home/karl/moz/dev/content/media/webaudio/AudioNode.cpp:136
#14 0x00007f799b5ca1d3 in mozilla::dom::AudioNode::cycleCollection::Unlink (
    this=0x7f799f46e540 <mozilla::dom::GainNode::_cycleCollectorGlobal>, 
    p=0x517b1e0) at /home/karl/moz/dev/content/media/webaudio/AudioNode.cpp:21
#15 0x00007f799b5dc773 in mozilla::dom::GainNode::cycleCollection::Unlink (
    this=0x7f799f46e540 <mozilla::dom::GainNode::_cycleCollectorGlobal>, 
    p=0x517b1e0) at /home/karl/moz/dev/content/media/webaudio/GainNode.cpp:17
#16 0x00007f799cb78135 in nsCycleCollector::CollectWhite (this=0x927a20, 
    aListener=0x0) at /home/karl/moz/dev/xpcom/base/nsCycleCollector.cpp:2568
(In reply to comment #8)
> I've seen stacks with GainNode and AudioBufferSourceNode taking the lock during
> Unlink(), but they are in BlockingResourceBase::CheckAcquire(), not waiting for
> the lock, and the business of the main thread suggests there are no long waits
> for the lock.  I have only a small sample.  I need to disable DeadlockDetector
> to check whether that is slowing things down.  Finishing each DeadlockDetector
> function seems fast, but if there are many objects then the costs may be adding
> up.

It _could_ be the deadlock detector which is what's expensive here.

> I assume DeadlockDetector is not enabled on b2g opt builds.

Let's hope not!
Whiteboard: [blocking-webaudio-]
With the DeadlockDetector disabled in this way, I no longer observed the delays.
But since https://hg.mozilla.org/mozilla-central/rev/30ef08a0a1bc I no longer see these delays even with the DeadlockDetector enabled.

Did Bug 901630 do something more than remove unused code?
Status: NEW → RESOLVED
Closed: 11 years ago
Depends on: 901630
Flags: needinfo?(khuey)
Resolution: --- → WORKSFORME
Attachment #797153 - Attachment description: disable DeadlockDetector exteriment → disable DeadlockDetector experiment
But this bug clearly exists now, I don't see how bug 901630 might have fixed this.
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
I don't see any way bug 901630 would affect this.

How big is the array we're searching here.  Perhaps there is some sort of pathological behavior in the webaudio code that causes it to get huge.
Flags: needinfo?(khuey)
I'm confident 30ef08a0a1bc made a difference here.  I tested with it backed out and reapplied.  If there is no used code that changed, then that is probably more concerning as it suggests the change in memory layout changed the behaviour.
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #13)
> How big is the array we're searching here.  Perhaps there is some sort of
> pathological behavior in the webaudio code that causes it to get huge.

Going from not entirely reliable memory here, there was an array or set of length about 1000 somewhere.  It wasn't the array at the bottom of the stack in comment 8, but it was below DestroyMediaStream().

This is the kind of output from COLLECT_TIME_DEBUG

cc: ForgetSkippable() took 25ms
cc: PrepareForCollection() took 11ms
cc: mJSRuntime->BeginCycleCollection() took 1ms
cc: SelectPurple() took 3ms
cc: MarkRoots() took 121ms
cc: ScanRoots() took 7ms
cc: CollectWhite::Root took 16ms
cc: CollectWhite::Unlink took 55126ms
cc: CollectWhite::Unroot took 40ms
cc: CollectWhite() took 55183ms
cc: total cycle collector time was 55330ms
cc: visited 12291 ref counted and 10407 GCed objects, freed 11867 ref counted and 10276 GCed objects.
So, it is actually conceivable that bug 901630 affected something.

We were holding the (long since obsoleted) lock at https://hg.mozilla.org/mozilla-central/rev/30ef08a0a1bc#l9.280 across unlinking.  If webaudio creates lots of locks it's possible that the deadlock detector had a rather long array of seen orderings each of the form CC Lock -> Web Audio Lock for every Web Audio Lock that ever existed.
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #12)
> But this bug clearly exists now.

Comment 16 sounds like a plausible explanation for what I was seeing in comment 0, thank you.  Unfortunately bug 908306 proves that there are other issues, but it looks like the analysis in this bug is not going to help there.  We still have bug 908306 open, or we can open new bugs for new STR when we find them.
Status: REOPENED → RESOLVED
Closed: 11 years ago11 years ago
Resolution: --- → FIXED
Summary: long CC delays during webaudio mochitests → long CC delays with DeadlockDetector during webaudio mochitests
Ah, right, I didn't realize that the CCRunner held a lock even in single threaded mode!
Well to be fair that is pretty ridiculous ;-)
Assignee: nobody → khuey
Target Milestone: --- → mozilla26
Depends on: 1027921
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: