Intermittent timeout of content/media/test/test_seek.html on Linux64

VERIFIED FIXED

Status

()

Core
Audio/Video
VERIFIED FIXED
7 years ago
5 years ago

People

(Reporter: cpearce, Assigned: kinetik)

Tracking

({intermittent-failure})

Trunk
x86_64
Linux
intermittent-failure
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

(Reporter)

Description

7 years ago
test_seek failed on Linux64, 1 test-run after the landing of bug 610570. The failure was in seek5.js on bug516323.indexed.ogv, which relies on playing the media through after a seek and ending the test when we receive an ended event. Since bug 610570 changed the buffering logic, it's *possible* we were stuck in buffering during playback after seek, and so never got the ended event.

Or it could be something else. This is the first time this test has failed in a while, so I'm filing a new bug for it.

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1290981667.1290984079.21600.gz

Build Log (Brief)
Rev3 Fedora 12x64 mozilla-central debug test mochitests-1/5 on 2010/11/28 14:01:07
	
SEEK-TEST: Finished bug516323.indexed.ogv seek test 3
57892 INFO TEST-PASS | /tests/content/media/test/test_seek.html | bug516323.indexed.ogv seek test 4: Video currentTime should be around 1.052: 1.0520000457763672
57893 INFO TEST-PASS | /tests/content/media/test/test_seek.html | bug516323.indexed.ogv seek test 4: Second seek on target: 1.0520000457763672
SEEK-TEST: Finished bug516323.indexed.ogv seek test 4
SEEK-TEST: Started bug516323.indexed.ogv seek test 5
SEEK-TEST: Started bug516323.indexed.ogv seek test 6
WARNING: NS_ENSURE_TRUE(mIsPending) failed: file /builds/slave/mozilla-central-linux64-debug/build/netwerk/protocol/http/nsHttpChannel.cpp, line 3544
WARNING: NS_ENSURE_TRUE(mIsPending) failed: file /builds/slave/mozilla-central-linux64-debug/build/netwerk/protocol/http/nsHttpChannel.cpp, line 3544
WARNING: NS_ENSURE_TRUE(mSuspendCount > 0) failed: file /builds/slave/mozilla-central-linux64-debug/build/netwerk/protocol/http/nsHttpChannel.cpp, line 3561
57894 INFO TEST-PASS | /tests/content/media/test/test_seek.html | bug516323.indexed.ogv seek test 5: Video currentTime should be around 2.104: 2.1040000915527344
57895 INFO TEST-PASS | /tests/content/media/test/test_seek.html | bug516323.indexed.ogv seek test 6: Seek 1
57896 INFO TEST-PASS | /tests/content/media/test/test_seek.html | bug516323.indexed.ogv seek test 6: Seek 2
57897 INFO TEST-PASS | /tests/content/media/test/test_seek.html | bug516323.indexed.ogv seek test 6: Seek 3
Seek test 6 time sum:23.735999822616577
SEEK-TEST: Finished bug516323.indexed.ogv seek test 6
57898 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_seek.html | Test timed out.
Comment hidden (Treeherder Robot)
(Reporter)

Comment 2

7 years ago
I had nthomas spool up 6 extra mochitest1/5 runs on my push, and 1 of the 6 failed with this. So It's failed on Linux x64 in 2 of 8 runs so far. It hasn't failed on any other platforms yet.
Hardware: x86 → x86_64
(Reporter)

Updated

7 years ago
Assignee: nobody → chris
(Reporter)

Comment 3

7 years ago
Two more failures, reported as bug 503623, but actually this bug.

dao%mozilla.com
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1291036660.1291038945.15767.gz
Rev3 Fedora 12x64 mozilla-central debug test mochitests-1/5 on 2010/11/29
05:17:40

s: talos-r3-fed64-034
57911 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_seek.html |
Test timed out.

Comment 283 TinderboxPushlog Robot 2010-11-29 07:58:55 PST

ehsan%mozilla.com
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1291043777.1291046140.21083.gz
Rev3 Fedora 12x64 mozilla-central debug test mochitests-1/5 on 2010/11/29
07:16:17

s: talos-r3-fed64-039
57886 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_seek.html |
Test timed out.
Summary: Intermittent failure of test_seek.html on Linux64 → Intermittent timeout of content/media/test/test_seek.html on Linux64
(Assignee)

Comment 4

7 years ago
With the ALSA backend, there's a data start threshold that must be reached before stream playback begins.  It defaults to ~500ms on my machine (not 375ms as I guessed in bug 615452).

It looks like what's happening here is that after seeking we write a small amount of audio, then enter buffering state, then eventually resume writing audio.  Between writing the first bit of audio and the second, the stream underruns and, post recovery, the amount of audio written before waiting for the clock to advance isn't sufficient to start the stream playing.

It's necessary to write sufficient audio to reach the start threshold when the stream is first opened, and any time the stream suffers an underrun for any reason.  Note that the current design doesn't provide a way to signal underruns back to the playback engine.

This timeout can be avoided by changing AMPLE_AUDIO_MS back to 2000 (it was changed to 1000 in bug 610570, which revealed this problem).  That's not a guaranteed fix, and won't help in any case where we need to recover from an underrun.

An ideal short-term solution would be to set the start threshold to 1, which would provide similar behaviour to the Win32 sydneyaudio backend.  It's possible to reduce the start threshold, but the minimum it can be set to (on my machine, at least) is ~125ms.  Anything lower results in cascading underruns, as the start threshold must be not smaller than the stream's period.  This is also not a guaranteed fix, but it makes recovery more likely in any case where it's necessary.  With this change, I can run 5 tabs of test_seek in parallel for an hour without a hang, where they would previously hang within minutes.  I'll need to test this change on the N900 to make sure it doesn't regress bug 607200.

A further possible improvement would be to expose the start threshold from sydneyaudio, and modify the playback engines to write at least that much audio data before expecting the stream to begin playback.

The post FF4 audio rewrite can deal with this correctly by design, but that doesn't solve the problem now.
Comment hidden (Treeherder Robot)
(Assignee)

Comment 6

7 years ago
Created attachment 494530 [details] [diff] [review]
patch v0

This ran 5 tabs of test_seek running continuously overnight without hanging.  It's not a complete fix, but a complete fix is difficult and will have to wait for the post FF4 audio rewrite.

This patch also removes the underrun recovery from get_position, because it's no longer necessary since we can return last_pos (and would after recovery anyway).  Since this function is called without the audio monitor held and called from the playback thread, recovering from underrun here is dangerous because it can race with whatever the audio thread is doing.

Will request review once I've tested on Maemo5/N900.
Assignee: chris → kinetik
Status: NEW → ASSIGNED
(Assignee)

Comment 7

7 years ago
Comment on attachment 494530 [details] [diff] [review]
patch v0

Tested on the N900 and with direct ALSA (no PA) on my desktop, works fine.  Mochitests pass on desktop.
Attachment #494530 - Flags: review?(chris)
Attachment #494530 - Flags: approval2.0?
(Reporter)

Updated

7 years ago
Attachment #494530 - Flags: review?(chris) → review+
Attachment #494530 - Flags: approval2.0? → approval2.0+
(Assignee)

Updated

7 years ago
Whiteboard: [orange] → [orange][needs landing]
Comment hidden (Treeherder Robot)
(Reporter)

Comment 9

7 years ago
http://hg.mozilla.org/mozilla-central/rev/d748232921a5
Status: ASSIGNED → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → FIXED
Whiteboard: [orange][needs landing] → [orange]
Status: RESOLVED → VERIFIED
Comment hidden (Treeherder Robot)
Keywords: intermittent-failure
Whiteboard: [orange]
You need to log in before you can comment on or make changes to this bug.