Closed Bug 852821 Opened 11 years ago Closed 8 years ago

Intermittent dom/media/test/crashtests/691096-1.html | application timed out after 330 seconds with no output

Categories

(Core :: Audio/Video: Playback, defect, P3)

ARM
Gonk (Firefox OS)
defect

RESOLVED INCOMPLETE

People

(Reporter: RyanVM, Unassigned)

Details

(Keywords: intermittent-failure, Whiteboard: [test disabled on B2G][leave open])

This regressed at some point within the last two days and is occurring quite frequently at the moment (though it's been accidentally starred as bug 818103 enough times now that it's hard to tell exactly when it started).

https://tbpl.mozilla.org/php/getParsedLog.php?id=20845392&tree=Firefox

b2g_ics_armv7a_gecko_emulator mozilla-central opt test crashtest-1 on 2013-03-19 14:33:24 PDT for push 453ccf5b5d29
slave: talos-r3-fed-086

14:54:34     INFO -  REFTEST TEST-START | http://10.0.2.2:8888/tests/content/media/test/crashtests/691096-1.html | 470 / 780 (60%)
14:54:34  WARNING -  TEST-UNEXPECTED-FAIL | http://10.0.2.2:8888/tests/content/media/test/crashtests/691096-1.html | application timed out after 330 seconds with no output
14:54:34     INFO -  INFO | automation.py | Application ran for: 0:12:35.846910
14:54:34     INFO -  INFO | automation.py | Reading PID log: /tmp/tmpEiO7E6pidlog
14:54:55     INFO -  WARNING | leakcheck | refcount logging is off, so leaks can't be detected!
14:54:55     INFO -  REFTEST INFO | runreftest.py | Running tests: end.
14:55:03    ERROR - Return code: 1
Probably bug 848581, but that only landed on m-c about 10 hours ago.

Full log shows:
14:55:05     INFO -  E/libOpenSLES(  792): Too many objects
14:55:05     INFO -  W/libOpenSLES(  792): Leaving Engine::CreateAudioPlayer (SL_RESULT_MEMORY_FAILURE)
14:55:05     INFO -  W/libOpenSLES(  792): system/media/wilhelm/src/sles.c:501: pthread 0x84f00 (tid 1090) sees object 0x468b8c00 was locked by pthread 0x84e80 (tid 1088) at system/media/wilhelm/src/sles.c:501
14:55:05     INFO -  W/libOpenSLES(  792): Leaving BufferQueue::Enqueue (SL_RESULT_PARAMETER_INVALID)

...repeated multiple times.
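
To see how often each of these libOpenSLES messages recurs in a full log, something like the following could be used. This is only a rough sketch: the function name `summarize_opensles` and the regular expressions are assumptions based on the four lines quoted above, not part of any existing harness.

```python
import re
from collections import Counter

# Matches logcat-style lines such as
#   "E/libOpenSLES(  792): Too many objects"
# wherever they appear after the harness timestamp/INFO prefix.
OPENSLES_LINE = re.compile(r"[EW]/libOpenSLES\(\s*\d+\):\s*(.*)$")
# Picks out a trailing result code such as "(SL_RESULT_MEMORY_FAILURE)".
RESULT_CODE = re.compile(r"\((SL_RESULT_[A-Z_]+)\)")


def summarize_opensles(log_text):
    """Count libOpenSLES messages, keyed by SL_RESULT code when present,
    otherwise by the message text itself."""
    counts = Counter()
    for line in log_text.splitlines():
        match = OPENSLES_LINE.search(line)
        if not match:
            continue
        message = match.group(1).strip()
        code = RESULT_CODE.search(message)
        counts[code.group(1) if code else message] += 1
    return counts
```

Calling `summarize_opensles(open(path).read())` on a saved log (where `path` is wherever the log was downloaded to) gives a quick count per message, which makes it easier to tell whether a patch changed the error mix.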

This test (attempting to play 250 audio elements at once) has been problematic with other cubeb backends too, and the usual outcome there was a hard limit on the number of cubeb_streams (CUBEB_STREAM_MAX in cubeb_alsa and cubeb_winmm). Perhaps we need the same thing in the cubeb_opensl backend?
Blocks: 848581
I checked the source. It looks like there's a hardcoded limit of 32 audio players per engine. Only one engine is permitted.
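
For illustration, the effect of a hard cap like the one discussed above can be modeled roughly as follows. This is a Python model only, not the cubeb C code: `StreamPool`, `StreamLimitError`, and `try_open_many` are hypothetical names, and the value 32 is taken from the per-engine player limit mentioned in this comment rather than from any CUBEB_STREAM_MAX definition.

```python
import threading

# Assumption: mirrors the 32-player-per-engine limit mentioned above.
MAX_STREAMS = 32


class StreamLimitError(Exception):
    """Raised when the cap is hit, standing in for a failed stream init."""


class StreamPool:
    """Tracks how many streams are open and refuses new ones past the cap."""

    def __init__(self, max_streams=MAX_STREAMS):
        self._max = max_streams
        self._active = 0
        self._lock = threading.Lock()

    def open_stream(self):
        with self._lock:
            if self._active >= self._max:
                raise StreamLimitError("too many streams")
            self._active += 1

    def close_stream(self):
        with self._lock:
            if self._active > 0:
                self._active -= 1


def try_open_many(pool, count):
    """Attempt `count` opens; return (succeeded, failed)."""
    opened = failed = 0
    for _ in range(count):
        try:
            pool.open_stream()
            opened += 1
        except StreamLimitError:
            failed += 1
    return opened, failed
```

With these assumptions, `try_open_many(StreamPool(), 250)` returns `(32, 218)`: the first 32 opens succeed and the rest fail cleanly at init time instead of exhausting the underlying engine.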
IIRC, AudioFlinger's soft mixer has a limit of 32 audio tracks.
(In reply to Sotaro Ikeda [:sotaro] from comment #9)
> IIRC, AudioFlinger's soft mixer has a limit of 32 audio tracks.

http://androidxref.com/4.0.4/xref/frameworks/base/services/audioflinger/AudioMixer.h#43
I put up a patch in bug 853077 which may fix this. The 32 audio track limit is probably unrelated. We fail to initialize the stream when that happens, which is exactly what other backends do when they hit the limit.
A memleak fix for this backend just landed in https://bugzilla.mozilla.org/show_bug.cgi?id=852966 - hopefully this will reduce the number of failures. If not, we'll see if bug 853077 fixes the failure, but that seems less likely.
(In reply to Michael Wu [:mwu] from comment #12)
> A memleak fix for this backend just landed in
> https://bugzilla.mozilla.org/show_bug.cgi?id=852966 - hopefully this will
> reduce the number of failures. If not, we'll see if bug 853077 fixes the
> failure, but that seems less likely.

I'm still seeing this failure quite frequently on inbound. Is there anything else we can do to fix it? This is one of the bugs blocking reftests from being unhidden.
Is there any way to get a stack when it times out?

Since mwu's patches landed, all of the errors I listed in comment 1 have disappeared except the "Too many objects"/"Leaving Engine::CreateAudioPlayer (SL_RESULT_MEMORY_FAILURE)" pair that we expect to see when hitting the internal limits.
I believe we need to implement the B2G equivalent of killAndGetStack from http://mxr.mozilla.org/mozilla-central/source/build/automation.py.in#1017; I'll file a bug for this.
Another useful thing would be to output ps so we can catch things like mediaserver crashing.
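
A minimal sketch of what dumping ps on timeout might look like is below. It assumes the emulator is reachable through `adb shell ps`; the helper names `capture_process_list` and `process_listed` are hypothetical, and how this would be hooked into the B2G automation is left open.

```python
import subprocess


def capture_process_list(cmd=("adb", "shell", "ps"), timeout=30):
    """Run `cmd` and return its stdout, or None if it could not be run
    (missing binary, or no answer within `timeout` seconds)."""
    try:
        result = subprocess.run(
            list(cmd), capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout


def process_listed(ps_output, name):
    """True if the last column of any ps line ends with `name`,
    e.g. "/system/bin/mediaserver" for name "mediaserver"."""
    for line in ps_output.splitlines():
        fields = line.split()
        if fields and fields[-1].endswith(name):
            return True
    return False
```

On a timeout, the harness could log the captured output and flag the run when `process_listed(output, "mediaserver")` is false, which would at least distinguish a mediaserver crash from a hang in the test itself.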
Much as I love stupidly wasting resources while waiting for something perfect that isn't ever going to happen (like any mozillian, that much is of course "a whole lot"), I'm sick of this holding an entire crashtest hunk hostage, so I landed a skip-if in https://hg.mozilla.org/integration/mozilla-inbound/rev/e79da89e3bc6
Whiteboard: [leave open][test disabled on b2g]
(In reply to Phil Ringnalda (:philor) from comment #121)
> Much as I love stupidly wasting resources while waiting for something
> perfect that isn't ever going to happen (like any mozillian, that much is of
> course "a whole lot"), I'm sick of this holding an entire crashtest hunk
> hostage, so I landed a skip-if in
> https://hg.mozilla.org/integration/mozilla-inbound/rev/e79da89e3bc6

Thanks philor.
Summary: Frequent B2G content/media/test/crashtests/691096-1.html | application timed out after 330 seconds with no output → Intermittent content/media/test/crashtests/691096-1.html | application timed out after 330 seconds with no output
Whiteboard: [leave open][test disabled on b2g]
(In reply to Jonathan Griffin (:jgriffin) from comment #93)
> I believe we need to implement the B2G equivalent of killAndGetStack from
> http://mxr.mozilla.org/mozilla-central/source/build/automation.py.in#1017;
> I'll file a bug for this.

What was the bug number for this?
Flags: needinfo?(jgriffin)
Whiteboard: [leave open]
bug 862730
Flags: needinfo?(jgriffin)
Depends on: 862730
I've been banging my head against this over the last few days because it appeared to block bug 1142336 and bug 1145686. But then I noticed that it's actually near-permaorange anyway without my changes. Ugh.

Here are some logs I captured:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b576ce6cad73
https://treeherder.mozilla.org/#/jobs?repo=try&revision=c48e50305a0a

I'm going to disable this test on emulator builds and land my stuff.
Summary: Intermittent content/media/test/crashtests/691096-1.html | application timed out after 330 seconds with no output → Intermittent dom/media/test/crashtests/691096-1.html | application timed out after 330 seconds with no output
Whiteboard: [leave open] → [test disabled on B2G][leave open]
See bug 1048926 comment 3.
I guess we can use the same fix for this bug.
Component: Audio/Video → Audio/Video: Playback
Bulk assigning P3 to all open intermittent bugs without a priority set in Firefox components per bug 1298978.
Priority: -- → P3
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → INCOMPLETE