Closed Bug 752796 Opened 12 years ago Closed 8 years ago

Intermittent test_streams_element_capture.html | Test timed out

Categories

(Core :: Audio/Video: Playback, defect, P5)

x86
Windows 7
defect

Tracking

()

RESOLVED WORKSFORME
Tracking Status
b2g-v2.2 --- disabled
b2g-master --- disabled

People

(Reporter: philor, Assigned: jwwang)

References

Details

(Keywords: intermittent-failure, Whiteboard: [test disabled on B2G Desktop])

Attachments

(1 file, 5 obsolete files)

https://tbpl.mozilla.org/php/getParsedLog.php?id=11554051&tree=Firefox
Rev3 WINNT 6.1 mozilla-central opt test mochitests-1/5 on 2012-05-07 20:16:12 PDT for push fafd873d469c

94398 INFO TEST-PASS | /tests/content/media/test/test_streams_element_capture.html | detodos.opus set output element .src correctly - [object MediaStream] should equal [object MediaStream]
94399 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_streams_element_capture.html | Test timed out.
(screenshot of a test pattern)
94400 INFO TEST-END | /tests/content/media/test/test_streams_element_capture.html | finished in 309138ms
https://tbpl.mozilla.org/php/getParsedLog.php?id=11528440&tree=Mozilla-Inbound was actually the first one, immediately after bug 750258 landed.
Here it looks like we're simply losing "ended" events on the media element that's consuming a stream.
https://tbpl.mozilla.org/php/getParsedLog.php?id=11968390&full=1&branch=mozilla-inbound#error0
We start four test instances. All four decoders reach nsBuiltinDecoder::PlaybackEnded. But it looks like for two of them (seek.webm and r11025_s16_c1.wav in this case) we don't get an ended event on the output media element.
Attached patch fix? (obsolete) — Splinter Review
This might fix it.
Assignee: nobody → roc
Attachment #626387 - Flags: review?
Attachment #626387 - Flags: review?(rjesup) → review+
https://hg.mozilla.org/mozilla-central/rev/0e1c786fd4c2
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla15
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
rjesup: Do you have any ideas about this whilst roc is away? Currently the #7 orange failure on mozilla-inbound.
I'll try to look friday.  In transit tomorrow
Comment on attachment 635188 [details] [diff] [review]
more logging

Review of attachment 635188 [details] [diff] [review]:
-----------------------------------------------------------------

Pretty easy review...  With the obvious proviso that we make sure we yank it later.
Attachment #635188 - Flags: review?(rjesup) → review+
roc, has the extra logging shed any light? :-)
Output with the extra logging:

{
102911 INFO TEST-START | /tests/content/media/test/test_streams_element_capture.html
102912 INFO TEST-INFO | /tests/content/media/test/test_streams_element_capture.html | Started Mon Jul 23 2012 19:27:40 GMT-0700 (Pacific Daylight Time) (1343096860.267s)
102913 INFO TEST-PASS | /tests/content/media/test/test_streams_element_capture.html | [started small-shot.ogg-0] Length of array should match number of running tests - 1 should equal 1
102914 INFO TEST-PASS | /tests/content/media/test/test_streams_element_capture.html | small-shot.ogg stream initial currentTime - 0 should equal 0
102915 INFO TEST-PASS | /tests/content/media/test/test_streams_element_capture.html | small-shot.ogg set output element .src correctly - [object MediaStream] should equal [object MediaStream]
102916 INFO TEST-PASS | /tests/content/media/test/test_streams_element_capture.html | [started r11025_s16_c1.wav-1] Length of array should match number of running tests - 2 should equal 2
102917 INFO TEST-PASS | /tests/content/media/test/test_streams_element_capture.html | r11025_s16_c1.wav stream initial currentTime - 0 should equal 0
102918 INFO TEST-PASS | /tests/content/media/test/test_streams_element_capture.html | r11025_s16_c1.wav set output element .src correctly - [object MediaStream] should equal [object MediaStream]
nsBuiltinDecoderStateMachine::RunStateMachine queuing nsBuiltinDecoder::PlaybackEnded
MediaStreamGraphImpl::FinishStream
nsBuiltinDecoder::PlaybackEnded mPlayState=3
nsBuiltinDecoderStateMachine::RunStateMachine queuing nsBuiltinDecoder::PlaybackEnded
MediaStreamGraphImpl::FinishStream
nsBuiltinDecoder::PlaybackEnded mPlayState=3
102919 INFO TEST-PASS | /tests/content/media/test/test_streams_element_capture.html | small-shot.ogg stream final currentTime - 0.27608489990234375 should equal 0.27608489990234375
102920 INFO TEST-PASS | /tests/content/media/test/test_streams_element_capture.html | small-shot.ogg current time at end: 0.27608489990234375 should be: 0.276
102921 INFO TEST-PASS | /tests/content/media/test/test_streams_element_capture.html | small-shot.ogg checking readyState - 2 should equal 2
102922 INFO TEST-PASS | /tests/content/media/test/test_streams_element_capture.html | small-shot.ogg checking playback has ended
102923 INFO TEST-PASS | /tests/content/media/test/test_streams_element_capture.html | [finished small-shot.ogg-0] Length of array should match number of running tests - 1 should equal 1
102924 INFO TEST-PASS | /tests/content/media/test/test_streams_element_capture.html | [started 320x240.ogv-2] Length of array should match number of running tests - 2 should equal 2
102925 INFO TEST-PASS | /tests/content/media/test/test_streams_element_capture.html | 320x240.ogv stream initial currentTime - 0 should equal 0
102926 INFO TEST-PASS | /tests/content/media/test/test_streams_element_capture.html | 320x240.ogv set output element .src correctly - [object MediaStream] should equal [object MediaStream]
MediaStreamGraphImpl::FinishStream
nsBuiltinDecoderStateMachine::RunStateMachine queuing nsBuiltinDecoder::PlaybackEnded
nsBuiltinDecoder::PlaybackEnded mPlayState=3
102927 INFO TEST-PASS | /tests/content/media/test/test_streams_element_capture.html | 320x240.ogv stream final currentTime - 0.2666635513305664 should equal 0.2666635513305664
102928 INFO TEST-PASS | /tests/content/media/test/test_streams_element_capture.html | 320x240.ogv current time at end: 0.2666635513305664 should be: 0.233
102929 INFO TEST-PASS | /tests/content/media/test/test_streams_element_capture.html | 320x240.ogv checking readyState - 2 should equal 2
102930 INFO TEST-PASS | /tests/content/media/test/test_streams_element_capture.html | 320x240.ogv checking playback has ended
102931 INFO TEST-PASS | /tests/content/media/test/test_streams_element_capture.html | Check video frame pixel has been drawn - 255 should equal 255
102932 INFO TEST-PASS | /tests/content/media/test/test_streams_element_capture.html | [finished 320x240.ogv-2] Length of array should match number of running tests - 1 should equal 1
102933 INFO TEST-PASS | /tests/content/media/test/test_streams_element_capture.html | [started seek.webm-3] Length of array should match number of running tests - 2 should equal 2
102934 INFO TEST-PASS | /tests/content/media/test/test_streams_element_capture.html | seek.webm stream initial currentTime - 0 should equal 0
102935 INFO TEST-PASS | /tests/content/media/test/test_streams_element_capture.html | seek.webm set output element .src correctly - [object MediaStream] should equal [object MediaStream]
nsBuiltinDecoderStateMachine::RunStateMachine queuing nsBuiltinDecoder::PlaybackEnded
nsBuiltinDecoder::PlaybackEnded mPlayState=3
102936 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_streams_element_capture.html | Test timed out.
}
Blocks: 664918
Target Milestone: mozilla15 → ---
Attached patch Disable test on Windows (obsolete) — Splinter Review
600 failures later we've apparently learnt nothing other than everybody is now ignoring this failing test. 

Enough is enough (particularly when we'll be stuck with this for another year thanks to ESR17).
Attachment #653107 - Flags: review?(philringnalda)
Comment on attachment 653107 [details] [diff] [review]
Disable test on Windows

Isn't _reset.html a completely separate test which is admittedly bad about timing out, but that's bug 750769 and not Windows-specific?
Attachment #653107 - Flags: review?(philringnalda) → review?(roc)
(In reply to Phil Ringnalda (:philor) from comment #614)
> Comment on attachment 653107 [details] [diff] [review]
> Disable test on Windows
> 
> Isn't _reset.html a completely separate test which is admittedly bad about
> timing out, but that's bug 750769 and not Windows-specific?

Ah, I had just thought it was the cleanup counter-part to test_streams_element_capture.html, I'm quite likely wrong.
Comment on attachment 653107 [details] [diff] [review]
Disable test on Windows

Both tests are failing multiple times per day on Windows.
Attachment #653107 - Flags: review?(roc) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/1fa9e807f7f8
Summary: Intermittent test_streams_element_capture.html | Test timed out → [disabled on win32] Intermittent test_streams_element_capture.html | Test timed out
Whiteboard: [orange] → [orange][test disabled]
https://hg.mozilla.org/mozilla-central/rev/1fa9e807f7f8
Status: REOPENED → RESOLVED
Closed: 12 years ago12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla18
I forgot to set [leave open].
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Blocks: 801293
Whiteboard: [orange][test disabled] → [test disabled]
Comment on attachment 653107 [details] [diff] [review]
Disable test on Windows

(To get this off of the bugzilla-todos "to check in" list)
Attachment #653107 - Flags: checkin+
Thanks for tracking that down, Edwin.
https://hg.mozilla.org/mozilla-central/rev/b0f89cf3ede6
Status: REOPENED → RESOLVED
Closed: 12 years ago11 years ago
Resolution: --- → FIXED
I'm wanting to keep this on radar for consideration if there is a 25 dot release, because b0f89cf3ede6 fixed a 25 regression in the Construct 2 game engine, identified in bug 932400 comment 2 and 6.
Assignee: roc → edwin
Target Milestone: mozilla18 → mozilla26
This test was re-enabled as part of bug 935343; though is now intermittently failing on Windows 8.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Summary: [disabled on win32] Intermittent test_streams_element_capture.html | Test timed out → Intermittent test_streams_element_capture.html | Test timed out
Whiteboard: [test disabled]
In a very suspicious across-multiple-trees-at-one-instant way.
(In reply to Ed Morley [:edmorley UTC+0] from comment #714)
> This test was re-enabled as part of bug 935343; though is now intermittently
> failing on Windows 8.

I don't see any re-enabling in that patch, and it was also a long time ago.

The recent regression seems likely to be from bug 977314 ,
https://hg.mozilla.org/mozilla-central/rev/26bfe4ef1bc2
Flags: needinfo?(cpearce)
I can repro this locally, usually in gizmo.mp4, but not when my patches for bug 973408 are applied.

It seems we're entering BUFFERING state, and when we switch back to DECODING state we're not awakening the decode thread, so we can't advance playback, and the "ended" event never fires, so the test times out.

If we NotifyAll the decoder monitor when mDecodeThreadWaiting is true in ScheduleDecodeThread() we fix the problem.

Since my patches in bug 973408 remove ScheduleDecodeThread() and change our decoding logic to make us immune to this kind of failure, I suggest I finish cleaning those patches up, and get them reviewed and landed instead of fixing this directly.
Depends on: 973408
Flags: needinfo?(cpearce)
(In reply to TBPL Robot from comment #817)

16:37:47     INFO -  08-11 23:29:52.140 W/libOpenSLES(  754): Leaving BufferQueue::Enqueue (SL_RESULT_BUFFER_INSUFFICIENT)

How could that be possible?
(In reply to TBPL Robot from comment #821)

09:30:37     INFO -  Crash reason:  SIGSEGV
09:30:37     INFO -  Crash address: 0x5a5a5a5a
09:30:37     INFO -  Thread 0 (crashed)
09:30:37     INFO -   0  libc.so!pthread_mutex_lock [pthread.c : 1036 + 0x0]
09:30:37     INFO -       r4 = 0x5a5a5a5a    r5 = 0x00000000    r6 = 0x40202580    r7 = 0x00000000
09:30:37     INFO -       r8 = 0xbebfac7f    r9 = 0x00000001   r10 = 0x00000001    fp = 0xbebfb900
09:30:37     INFO -       sp = 0xbebfaba0    lr = 0x400287bb    pc = 0x4006e248
09:30:37     INFO -      Found by: given as instruction pointer in context
09:30:37     INFO -   1  libmozglue.so!__wrap_pthread_mutex_lock [Nuwa.cpp:7e167c1a6d53 : 1155 + 0x5]
09:30:37     INFO -       r4 = 0x00000000    r5 = 0x00000000    r6 = 0x40202580    r7 = 0x00000000
09:30:37     INFO -       r8 = 0xbebfac7f    r9 = 0x00000001   r10 = 0x00000001    fp = 0xbebfb900
09:30:37     INFO -       sp = 0xbebfabc0    pc = 0x400287bb
09:30:37     INFO -      Found by: call frame info
09:30:37     INFO -   2  libnss3.so!PR_Lock [ptsynch.c:7e167c1a6d53 : 177 + 0x3]
09:30:37     INFO -       r4 = 0x5a5a5a5a    r5 = 0x00000000    r6 = 0x40202580    r7 = 0x00000000
09:30:37     INFO -       r8 = 0xbebfac7f    r9 = 0x00000001   r10 = 0x00000001    fp = 0xbebfb900
09:30:37     INFO -       sp = 0xbebfabe0    pc = 0x406ea891
09:30:37     INFO -      Found by: call frame info
09:30:37     INFO -   3  libxul.so!mozilla::MediaStreamGraphImpl::RunInStableState(bool) [Mutex.h : 69 + 0x5]
09:30:37     INFO -       r4 = 0x442cff00    r5 = 0x00000000    r6 = 0x40202580    r7 = 0x00000000
09:30:37     INFO -       r8 = 0xbebfac7f    r9 = 0x00000001   r10 = 0x00000001    fp = 0xbebfb900
09:30:37     INFO -       sp = 0xbebfabe8    pc = 0x4141f94f
09:30:37     INFO -      Found by: call frame info
09:30:37     INFO -   4  libxul.so!mozilla::::MediaStreamGraphStableStateRunnable::Run [MediaStreamGraph.cpp:7e167c1a6d53 : 1512 + 0x7]
09:30:37     INFO -       r4 = 0x40202550    r5 = 0x00000000    r6 = 0x40202580    r7 = 0x00000000
09:30:37     INFO -       r8 = 0xbebfac7f    r9 = 0x00000001   r10 = 0x00000001    fp = 0xbebfb900
09:30:37     INFO -       sp = 0xbebfac20    pc = 0x4141fbc7
09:30:37     INFO -      Found by: call frame info
09:30:37     INFO -   5  libxul.so!nsThread::ProcessNextEvent(bool, bool*) [nsThread.cpp:7e167c1a6d53 : 770 + 0x5]
09:30:37     INFO -       r4 = 0x40202550    r5 = 0x00000000    r6 = 0x40202580    r7 = 0x00000000
09:30:37     INFO -       r8 = 0xbebfac7f    r9 = 0x00000001   r10 = 0x00000001    fp = 0xbebfb900
09:30:37     INFO -       sp = 0xbebfac28    pc = 0x40ab4ed1
09:30:37     INFO -      Found by: call frame info
09:30:37     INFO -   6  libxul.so!NS_ProcessNextEvent(nsIThread*, bool) [nsThreadUtils.cpp:7e167c1a6d53 : 265 + 0xb]
09:30:37     INFO -       r4 = 0x00000001    r5 = 0x00000000    r6 = 0x431b2b70    r7 = 0x00000000
09:30:37     INFO -       r8 = 0xbebfad07    r9 = 0x00000001   r10 = 0x00000001    fp = 0xbebfb900
09:30:37     INFO -       sp = 0xbebfac78    pc = 0x40abff69
09:30:37     INFO -      Found by: call frame info
09:30:37     INFO -   7  libxul.so!nsThread::Shutdown() [nsThread.cpp:7e167c1a6d53 : 599 + 0x7]
09:30:37     INFO -       r4 = 0x4536e900    r5 = 0x00000000    r6 = 0x431b2b70    r7 = 0x00000000
09:30:37     INFO -       r8 = 0xbebfad07    r9 = 0x00000001   r10 = 0x00000001    fp = 0xbebfb900
09:30:37     INFO -       sp = 0xbebfac88    pc = 0x40ab487f
09:30:37     INFO -      Found by: call frame info

It is suspicious the event is running during thread shutdown and results in use-after-free.
(In reply to JW Wang [:jwwang] from comment #822)
> (In reply to TBPL Robot from comment #821)
> 
> 09:30:37     INFO -  Crash reason:  SIGSEGV
> 09:30:37     INFO -  Crash address: 0x5a5a5a5a

> It is suspicious the event is running during thread shutdown and results in
> use-after-free.

This is probably due to the patch running on Try, and not this bug.  Sorry for the mis-star.  I think we've resolved that one in the MSG refactor which the Try was done for since then.
Let's add some logs for debugging.
Attachment #8554967 - Flags: review?(cpearce)
Hi Edwin,
Can I take this bug to follow up?
Flags: needinfo?(edwin)
(In reply to JW Wang [:jwwang] from comment #1122)
> Hi Edwin,
> Can I take this bug to follow up?

Please do! :)
Flags: needinfo?(edwin)
Assignee: edwin → jwwang
Attachment #8554967 - Flags: review?(cpearce) → review+
Please check in  752796_log_media_element_events.patch only. Thanks.
I think it's time to just disable this on B2G Desktop. Fixing it is obviously going nowhere.
Flags: needinfo?(jwwang)
Truly but sadly... Please disable it on B2G desktop.
Flags: needinfo?(jwwang)
Flags: needinfo?(ryanvm)
https://hg.mozilla.org/integration/mozilla-inbound/rev/ed1ba3752db1
Flags: needinfo?(ryanvm)
Keywords: leave-open
Whiteboard: [test disabled on B2G Desktop][leave open]
Target Milestone: mozilla26 → ---
Seeing comment 1338 ~ comment 1345, since finish() is called multiple times, it looks like the test can't finish in time. Let's see if longer timeout would help.
Attachment #626387 - Attachment is obsolete: true
Attachment #635188 - Attachment is obsolete: true
Attachment #653107 - Attachment is obsolete: true
Attachment #800460 - Attachment is obsolete: true
Attachment #8554967 - Attachment is obsolete: true
Attachment #8597838 - Flags: review?(cpearce)
Attachment #8597838 - Flags: review?(cpearce) → review+
Thanks for the review.
Component: Audio/Video → Audio/Video: Playback
P5
Keywords: leave-open
Priority: -- → P5
Whiteboard: [test disabled on B2G Desktop][leave open] → [test disabled on B2G Desktop]
No timeout for a long time. Should it happen again, open a new bug since code changes a lot since then.
Status: REOPENED → RESOLVED
Closed: 11 years ago8 years ago
Keywords: leave-open
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: