Closed Bug 981153 Opened 10 years ago Closed 9 years ago

Intermittent test_seek.html | Test timed out.

Categories

(Core :: Audio/Video, defect)

defect
Not set
normal

Tracking

()

RESOLVED WONTFIX
Tracking Status
firefox31 --- disabled
firefox32 --- disabled
firefox33 --- disabled
firefox-esr24 --- unaffected
b2g-v2.0 --- disabled
b2g-v2.1 --- disabled

People

(Reporter: KWierso, Assigned: jwwang)

References

Details

(Keywords: intermittent-failure, Whiteboard: [test disabled])

Attachments

(2 files)

https://tbpl.mozilla.org/php/getParsedLog.php?id=35814069&tree=Mozilla-Inbound
slave: tst-linux32-spot-147



14:12:23     INFO -  4021 INFO TEST-PASS | /tests/content/media/test/test_seek.html | [finished seek.ogv-27] Length of array should match number of running tests
14:12:23     INFO -  4022 INFO TEST-PASS | /tests/content/media/test/test_seek.html | [started seek.ogv-29] Length of array should match number of running tests
14:12:23     INFO -  4023 INFO TEST-PASS | /tests/content/media/test/test_seek.html | seek.ogv seek test 3: Video currentTime should be around 1.983: 1.983
14:12:23     INFO -  4024 INFO TEST-PASS | /tests/content/media/test/test_seek.html | seek.ogv seek test 4: Video currentTime should be around 1.983: 1.983
14:12:23     INFO -  4025 INFO TEST-PASS | /tests/content/media/test/test_seek.html | seek.ogv seek test 3: Video currentTime should be around 1.983: 1.983
14:12:23     INFO -  4026 INFO TEST-PASS | /tests/content/media/test/test_seek.html | seek.ogv seek test 3: Should have got timeupdate between seeking and seekended
14:12:23     INFO -  4027 INFO TEST-PASS | /tests/content/media/test/test_seek.html | [finished seek.ogv-28] Length of array should match number of running tests
14:12:23     INFO -  4028 INFO TEST-PASS | /tests/content/media/test/test_seek.html | seek.ogv seek test 4: Video currentTime should be around 0.9915: 0.9915
14:12:23     INFO -  4029 INFO TEST-PASS | /tests/content/media/test/test_seek.html | [started seek.ogv-30] Length of array should match number of running tests
14:12:23     INFO -  4030 INFO TEST-PASS | /tests/content/media/test/test_seek.html | seek.ogv seek test 4: seek on target: 0.9915
14:12:23     INFO -  4031 INFO TEST-PASS | /tests/content/media/test/test_seek.html | [finished seek.ogv-29] Length of array should match number of running tests
14:12:23     INFO -  4032 INFO TEST-PASS | /tests/content/media/test/test_seek.html | [started seek.ogv-31] Length of array should match number of running tests
14:12:23     INFO -  4033 INFO TEST-PASS | /tests/content/media/test/test_seek.html | seek.ogv seek test 5: Video currentTime should be around 1.983: 1.983
14:12:23     INFO -  4034 INFO TEST-PASS | /tests/content/media/test/test_seek.html | seek.ogv seek test 6: Seek 1
14:12:23     INFO -  4035 INFO TEST-PASS | /tests/content/media/test/test_seek.html | seek.ogv seek test 6: Seek 2
14:12:24     INFO -  Xlib:  extension "RANDR" missing on display ":0".
14:12:24     INFO -  System JS : ERROR (null):0 - uncaught exception: 2147746065
14:12:24     INFO -  System JS : ERROR (null):0 - uncaught exception: 2147746065
14:12:25     INFO -  4036 INFO TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_seek.html | Test timed out.
14:12:25     INFO -  4037 INFO TEST-INFO | MEMORY STAT vsize after test: 901169152
14:12:25     INFO -  4038 INFO TEST-INFO | MEMORY STAT residentFast after test: 294998016
14:12:25     INFO -  4039 INFO TEST-END | /tests/content/media/test/test_seek.html | finished in 320055ms
14:12:25     INFO -  [Parent 2384] WARNING: Failed to unlock the wakelock.: '!rv.Failed()', file /builds/slave/m-in-lx-d-00000000000000000000/build/content/html/content/src/HTMLVideoElement.cpp, line 301
14:12:25     INFO -  ++DOMWINDOW == 18 (0xee05708) [pid = 2384] [serial = 4796] [outer = 0x99aa2e8]
14:12:25     INFO -  4040 INFO TEST-KNOWN-FAIL | /tests/content/media/test/test_seek.html | Assertion count 1 within expected range 0-5 assertions.
14:12:25     INFO -  4041 INFO TEST-START | /tests/content/media/test/test_seek2.html
Is it different from Bug 857424?
Flags: needinfo?(kwierso)
(In reply to JW Wang[:jwwang] from comment #26)
> Is it different from Bug 857424?

Unsure. It looks slightly different compared with 857424's comment 0 and all of the logs in that bug are expired/deleted.
Flags: needinfo?(kwierso)
(In reply to TBPL Robot from comment #61)
> KWierso
> https://tbpl.mozilla.org/php/getParsedLog.php?id=36840373&tree=Fx-Team
> Ubuntu VM 12.04 x64 fx-team pgo test mochitest-1 on 2014-03-27 16:53:24
> revision: 0df0f2fe3bbf
> slave: tst-linux64-spot-281
> 
> 5036 INFO TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_seek.html |
> Test timed out.


The log contains:

17:21:44     INFO -  5031 INFO TEST-PASS | /tests/content/media/test/test_seek.html | [started gizmo.mp4-116] Length of array should match number of running tests
17:21:44     INFO -  5032 INFO TEST-PASS | /tests/content/media/test/test_seek.html | gizmo.mp4 seek test 12: seeking should default to false
17:21:44     INFO -  5033 INFO TEST-PASS | /tests/content/media/test/test_seek.html | gizmo.mp4 seek test 12: currentTime must report seek target immediately
17:21:44     INFO -  5034 INFO TEST-PASS | /tests/content/media/test/test_seek.html | gizmo.mp4 seek test 12: seeking flag on start should be true
17:21:44     INFO -  5035 INFO TEST-PASS | /tests/content/media/test/test_seek.html | gizmo.mp4 seek test 12: seeking: currentTime must be seekTime
17:21:44     INFO -  Xlib:  extension "RANDR" missing on display ":0".
17:21:45     INFO -  TEST-INFO | screentopng: exit 0
17:21:45     INFO -  5036 INFO TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_seek.html | Test timed out.
17:21:45     INFO -  5037 INFO TEST-INFO | MEMORY STAT vsize after test: 1580392448
17:21:45     INFO -  5038 INFO TEST-INFO | MEMORY STAT residentFast after test: 325599232
17:21:45     INFO -  5039 INFO TEST-INFO | MEMORY STAT heapAllocated after test: 179029896
17:21:45     INFO -  5040 INFO TEST-END | /tests/content/media/test/test_seek.html | finished in 310609ms
17:21:45     INFO -  5041 INFO TEST-START | /tests/content/media/test/test_seek2.html
17:21:45     INFO -  5042 INFO TEST-INFO | MEMORY STAT vsize after test: 1588961280
17:21:45     INFO -  5043 INFO TEST-INFO | MEMORY STAT residentFast after test: 326017024
17:21:45     INFO -  5044 INFO TEST-INFO | MEMORY STAT heapAllocated after test: 179224864
17:21:45     INFO -  5045 INFO TEST-END | /tests/content/media/test/test_seek2.html | finished in 81ms
17:21:45     INFO -  5046 INFO TEST-START | /tests/content/media/test/test_seekLies.html
17:21:45     INFO -  5047 INFO TEST-INFO | MEMORY STAT vsize after test: 1590009856

Specifically:

17:21:44     INFO -  5035 INFO TEST-PASS | /tests/content/media/test/test_seek.html | gizmo.mp4 seek test 12: seeking: currentTime must be seekTime
17:21:44     INFO -  Xlib:  extension "RANDR" missing on display ":0".
17:21:45     INFO -  TEST-INFO | screentopng: exit 0
17:21:45     INFO -  5036 INFO TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_seek.html | Test timed out.

That is saying that the test is producing output at 17:21:44, but times out 1 second later at 17:21:45. Is this test just to long running? What if we split the 13 cases in test_seek up into 13 separate tests?

Jwwang: What do you think?
Flags: needinfo?(jwwang)
17:16:34     INFO -  4933 INFO TEST-START | /tests/content/media/test/test_seek.html <-- test starts!

It could be the test has a really long running time and causes timeout. However, the timestamp printed by SimpleTest.info is not always reliable.

Btw, I think there might be a bug in the seeking code of MediaDecoderStateMachine. I found timeouts in test_bug465498, test_loop, and test_seek which all involves seeking.

I think it will really help to split the test to narrow down the scope to find out the bug.
Flags: needinfo?(jwwang)
This is a try push which increases the timeout for this test by 2x, and the timeouts still happen.  This is probably not due to the length of this test's runtime.

https://tbpl.mozilla.org/?tree=Try&rev=a6717375e338
Assignee: nobody → jwwang
Depends on: 995090
From the logs, it looks like sometimes timer callback with timeout==0 failed to fire on OS X 10.6 Debug and therefore MediaDecoderStateMachine just got stuck. It is more prone to happen especially after seeking. I will test if this strange behavior happens on other platforms as well.
Depends on: 997844
Depends on: 998168
OS: Linux → All
Version: 29 Branch → Trunk
Status: NEW → ASSIGNED
Now the failures mostly happen on Linux. I think it is related to the recent change in our Linux test machines. (https://groups.google.com/forum/#!topic/mozilla.dev.platform/3UeZwbYQTWM)
No longer depends on: 997844
That change happened for mochitest-1 in March of 2013, not in March of 2014. Some single-core related bustage may well have landed in March of this year, but the media tests were only their usual normal crappy selves on single-core AWS slaves for an entire year before they went to hell last month.
https://tbpl.mozilla.org/php/getParsedLog.php?id=38387555&tree=Try&full=1#error3

02:20:23     INFO -  1619 INFO TEST-START | /tests/content/media/test/test_seek.html
02:20:29     INFO -  SEEK-TEST: Started r11025_s16_c1.wav seek test 1
02:20:29     INFO -  SEEK-TEST: Started r11025_s16_c1.wav seek test 2
02:20:31     INFO -  SEEK-TEST: Finished r11025_s16_c1.wav seek test 2
...
02:25:41     INFO -  SEEK-TEST: Finished bug516323.indexed.ogv seek test 11
02:25:44     INFO -  SEEK-TEST: Started bug516323.indexed.ogv seek test 13
02:25:46     INFO -  SEEK-TEST: Finished bug516323.indexed.ogv seek test 13
02:25:48     INFO -  1620 INFO TEST-INFO | dumping last 100 message(s)

This test could take more than 5 mins on B2G to finish and therefore timeout.
jwwang: Looking at comment 760 I see this:

10:19:01     INFO -  SEEK-TEST: Started owl.mp3 seek test 13
10:19:01     INFO -  SEEK-TEST: Finished owl.mp3 seek test 13
10:21:47     INFO -  3992 INFO TEST-INFO | dumping last 100 message(s)
10:21:47     INFO -  3993 INFO TEST-INFO | if you need more context, please use SimpleTest.requestCompleteLog() in your test
10:21:47     INFO -  3994 INFO TEST-PASS | /tests/content/media/test/test_seek.html | detodos.opus seek test 11: Successfully seeked to nonzero
10:21:47     INFO -  3995 INFO TEST-PASS | /tests/content/media/test/test_seek.html | detodos.opus seek test 11: Seek back to zero was successful

Note that owl.mp3 test 13 is the last test in test_seek.html; it appears to finish the tests, but not exit the test.

A green run on osx 10.8 has this:

04:43:35     INFO -  SEEK-TEST: Finished owl.mp3 seek test 12
04:43:36     INFO -  SEEK-TEST: Finished owl.mp3 seek test 13
04:43:37     INFO -  3992 INFO TEST-INFO | MEMORY STAT vsize after test: 3779768320
04:43:37     INFO -  3993 INFO TEST-INFO | MEMORY STAT residentFast after test: 538836992
04:43:37     INFO -  3994 INFO TEST-END | /tests/content/media/test/test_seek.html | finished in 108106ms

The B2G test you listed did finish, but only a couple of seconds before the timeout, so it's unclear if it hung or ran out of time while still working.
Flags: needinfo?(jwwang)
Perhaps we should try increasing the timeout using SimpleTest.requestLongerTimeout?
(In reply to Randell Jesup [:jesup] from comment #762)
> jwwang: Looking at comment 760 I see this:
> 
> 10:19:01     INFO -  SEEK-TEST: Started owl.mp3 seek test 13
> 10:19:01     INFO -  SEEK-TEST: Finished owl.mp3 seek test 13
> 10:21:47     INFO -  3992 INFO TEST-INFO | dumping last 100 message(s)
> 10:21:47     INFO -  3993 INFO TEST-INFO | if you need more context, please
> use SimpleTest.requestCompleteLog() in your test
> 10:21:47     INFO -  3994 INFO TEST-PASS |
> /tests/content/media/test/test_seek.html | detodos.opus seek test 11:
> Successfully seeked to nonzero
> 10:21:47     INFO -  3995 INFO TEST-PASS |
> /tests/content/media/test/test_seek.html | detodos.opus seek test 11: Seek
> back to zero was successful
> 
> Note that owl.mp3 test 13 is the last test in test_seek.html; it appears to
> finish the tests, but not exit the test.


10:16:29     INFO -  SEEK-TEST: Started r11025_s16_c1.wav seek test 1
10:16:29     INFO -  SEEK-TEST: Started r11025_s16_c1.wav seek test 2
10:16:31     INFO -  SEEK-TEST: Finished r11025_s16_c1.wav seek test 2
10:16:32     INFO -  SEEK-TEST: Started r11025_s16_c1.wav seek test 3
10:16:32     INFO -  SEEK-TEST: Finished r11025_s16_c1.wav seek test 1
10:16:32     INFO -  SEEK-TEST: Started r11025_s16_c1.wav seek test 4
10:16:32     INFO -  SEEK-TEST: Finished r11025_s16_c1.wav seek test 3
10:16:33     INFO -  SEEK-TEST: Started r11025_s16_c1.wav seek test 5
10:16:33     INFO -  SEEK-TEST: Finished r11025_s16_c1.wav seek test 4
10:16:34     INFO -  SEEK-TEST: Started r11025_s16_c1.wav seek test 6
10:16:34     INFO -  SEEK-TEST: Finished r11025_s16_c1.wav seek test 6
10:16:35     INFO -  SEEK-TEST: Started r11025_s16_c1.wav seek test 7
10:16:35     INFO -  SEEK-TEST: Finished r11025_s16_c1.wav seek test 7
10:16:35     INFO -  SEEK-TEST: Started r11025_s16_c1.wav seek test 8
10:16:35     INFO -  SEEK-TEST: Finished r11025_s16_c1.wav seek test 8
10:16:36     INFO -  SEEK-TEST: Started r11025_s16_c1.wav seek test 9
10:16:36     INFO -  SEEK-TEST: Finished r11025_s16_c1.wav seek test 9
10:16:37     INFO -  SEEK-TEST: Started r11025_s16_c1.wav seek test 10
10:16:37     INFO -  SEEK-TEST: Finished r11025_s16_c1.wav seek test 10
10:16:38     INFO -  SEEK-TEST: Started r11025_s16_c1.wav seek test 11
10:16:38     INFO -  SEEK-TEST: Finished r11025_s16_c1.wav seek test 11
10:16:38     INFO -  SEEK-TEST: Started r11025_s16_c1.wav seek test 12
10:16:38     INFO -  SEEK-TEST: Finished r11025_s16_c1.wav seek test 12
10:16:39     INFO -  SEEK-TEST: Started r11025_s16_c1.wav seek test 13
10:16:39     INFO -  SEEK-TEST: Finished r11025_s16_c1.wav seek test 13

It is because r11025_s16_c1.wav seek test 5 didn't finish. Since PARALLEL_TESTS is 2, we can run at most 2 concurrent tests at the same time. The test case can still go on one by one while test 5 is stuck.

It appears that we have multiple causes for the timeout...
Flags: needinfo?(jwwang)
Attached patch part1_fix.patch — — Splinter Review
(1) enable test on windows.
(2) longer timeout for it could take over 10 mins on B2G.
(3) release decoder to avoid timeout on waiting for decoding resource.

try: https://tbpl.mozilla.org/?tree=Try&rev=4af929401f9e
linux: 454 runs, no test_seek.html timeouts
windows: 235 runs, no test_seek.html timeouts
android: 176 runs, 2 test_seek.html timeouts on Android 2.2 Opt

This patch seems to stop timeouts from Linux platforms. I will continue investigating the timeouts on B2G and Android.
Attachment #8412346 - Flags: review?(cpearce)
https://tbpl.mozilla.org/php/getParsedLog.php?id=38420010&tree=Try&full=1#error2

10:21:55     INFO -  1564 INFO TEST-START | /tests/content/media/test/test_seek.html
10:22:02     INFO -  SEEK-TEST: Started r11025_s16_c1.wav seek test 1
10:22:02     INFO -  SEEK-TEST: Started r11025_s16_c1.wav seek test 2
10:22:03     INFO -  SEEK-TEST: Finished r11025_s16_c1.wav seek test 2
...
10:31:45     INFO -  SEEK-TEST: Started owl.mp3 seek test 9
10:31:46     INFO -  SEEK-TEST: Finished owl.mp3 seek test 8
10:31:49     INFO -  SEEK-TEST: Started owl.mp3 seek test 10
10:31:50     INFO -  SEEK-TEST: Finished owl.mp3 seek test 9
10:31:53     INFO -  SEEK-TEST: Started owl.mp3 seek test 11
10:31:54     INFO -  SEEK-TEST: Finished owl.mp3 seek test 10

It ran almost 10 mins on B2G. I will investigate if it is because B2G emulator is slow in playing some kinds of media files such as .h264 or it is from some abnormal long-running tasks like this (https://bugzilla.mozilla.org/show_bug.cgi?id=989047#c107).

I was thinking about a tool which can monitor/watch long-running tasks and dump the stack trace to let us know where the thread gets stuck...
Attachment #8412346 - Flags: review?(cpearce) → review+
Keywords: checkin-needed
Whiteboard: [leave open]
Is anything going to happen with this test?  The timeout frequency is really high.
comment 808 timed out for the fixes are not applied.
Most others timed out for crash...
This failure seems to be more frequent with HTTP cache v2 on.  Who should I ask how to get a fix here?  I strongly believe it's not a bug in the new cache but in the test.  Is it possible to rather disable this test?
Flags: needinfo?(jwwang)
There are certainly bugs in the media code, but I remain to be convinced that test_seek.html fails with the new HTTP cache solely due to bugs in the media code. I accept that it's possible however.  ;)

I don't want to disable test_seek, it tests important functionality that we don't want to regress. Turning on the new HTTP cache is also important. jwwang, can you please work with Honza to help figure out why test_seek fails when the new HTTP cache is turned on? It would be much appreciated.

It would be good if someone can make the change suggested by Honza in bug 977314 comment 8 too.
According https://tbpl.mozilla.org/?tree=Gum&rev=b790747b7b08 M(1) on Windows something needs to happen here :(
Blocks: cache2tests
(In reply to Honza Bambas (:mayhemer) from comment #832)
> This failure seems to be more frequent with HTTP cache v2 on.  Who should I
> ask how to get a fix here?  I strongly believe it's not a bug in the new
> cache but in the test.  Is it possible to rather disable this test?

Where can I download the branch with HTTP cache v2 on? I want to add some logs and have a try on the server.
Flags: needinfo?(jwwang) → needinfo?(honzab.moz)
(In reply to JW Wang [:jwwang] from comment #847)
> (In reply to Honza Bambas (:mayhemer) from comment #832)
> > This failure seems to be more frequent with HTTP cache v2 on.  Who should I
> > ask how to get a fix here?  I strongly believe it's not a bug in the new
> > cache but in the test.  Is it possible to rather disable this test?
> 
> Where can I download the branch with HTTP cache v2 on? I want to add some
> logs and have a try on the server.

Just build m-c and switch browser.cache.use_new_backend pref to 1.

Thanks for help!
Flags: needinfo?(honzab.moz)
Part of the reason I'm hesitant to believe that the media code is entirely to blame is that in debug builds on Windows test_seek.html can sometimes take 45 seconds to load gizmo.mp4 from httpd.js. This is a 445KB file. I think this is partially causing our timeouts in this test.

I don't know what's going on here, there could be a bug in necko, media, or httpd.js, or somewhere else. I don't have cycles to look into it, but it's a cause for concern when it takes so long to download such a small file from a local webserver. opt builds seem fine however.
Chris, is the file being loaded from the server and not from the cache?  Do we send out the request and not get an answer?  Any more data or ref to the logs so I can take a look?


Also, bug 1005475 could potentially (tho unlikely) help here (I need to merge gum to include that fix in it and see how the tree looks, tomorrow).
(In reply to Honza Bambas (:mayhemer) from comment #850)
> Chris, is the file being loaded from the server and not from the cache?

How do I tell? I'm necko n00b, unfortunately.

>  Do
> we send out the request and not get an answer?  Any more data or ref to the
> logs so I can take a look?

I can do a run with Necko NSPR logging enabled if you like. What NSPR_LOG_MODULES would be useful?
(In reply to Chris Pearce (:cpearce) from comment #851)
> (In reply to Honza Bambas (:mayhemer) from comment #850)
> > Chris, is the file being loaded from the server and not from the cache?
> 
> How do I tell? I'm necko n00b, unfortunately.

Didn't know.

> 
> >  Do
> > we send out the request and not get an answer?  Any more data or ref to the
> > logs so I can take a look?
> 
> I can do a run with Necko NSPR logging enabled if you like. What
> NSPR_LOG_MODULES would be useful?

timestamp,nsHttp:4,cache2:5 would be the magic.  Thanks!
When browser.cache.use_new_backend is 1, sometimes OnStartRequest is not fired (ChannelMediaResource::Listener::OnStartRequest) and the media decoder reader get stuck in MediaCacheStream::Read for there is no incoming data from the channel.

However, when browser.cache.use_new_backend is 0, OnStartRequest is always fired after calling ChannelMediaResource::RecreateChannel and ChannelMediaResource::OpenChannel.

I checked every return value from nsIHttpChannel/nsIChannel functions and found no error. Should I check nsHttp logs to find out why OnStartRequest not fired?
Flags: needinfo?(honzab.moz)
(In reply to JW Wang [:jwwang] from comment #854)
> When browser.cache.use_new_backend is 1, sometimes OnStartRequest is not
> fired (ChannelMediaResource::Listener::OnStartRequest) and the media decoder
> reader get stuck in MediaCacheStream::Read for there is no incoming data
> from the channel.

That's an interesting find!  Is OnDataAvailable and OnStopRequest called there after?  I assume not, just want to make sure.

Note that OnStartRequest must always be called by the channel, so here it seems like we do something wrong.

> 
> However, when browser.cache.use_new_backend is 0, OnStartRequest is always
> fired after calling ChannelMediaResource::RecreateChannel and
> ChannelMediaResource::OpenChannel.
> 
> I checked every return value from nsIHttpChannel/nsIChannel functions and
> found no error. Should I check nsHttp logs to find out why OnStartRequest
> not fired?

I think the cache entry opening gets stuck here.  So, please create a log with NSPR_LOG_MODULES="timestamp,nsHttp:4,cache2:5" + any of your media logs you can add to0, send it to me and - the most important - *point me to the ChannelMediaResource::Listener in that log* that did not get the proper OnStartRequest call.  I can then search down to the cache code and see what misbehaves.

Thanks!
Flags: needinfo?(honzab.moz)
Attached file test_seek.log.tar.gz —
The log file I got when running test_seek.html on my machine.
command=|./mach mochitest-plain content/media/test/test_seek.html --repeat 10000|

Please search for the lines:
|OpenChannel, aStreamListener=? decoder=??? stream=???|
|OpenChannel, mListener=???|
which are called inside ChannelMediaResource::OpenChannel which will open a channel.

And the line:
|ChannelMediaResource::Listener::OnStartRequest=???|
which is called inside ChannelMediaResource::Listener::OnStartRequest.

And the line:
|OnStartRequest, decoder=??? stream=???|
which is called inside ChannelMediaResource::OnStartRequest.

Go to line 26742:
|OpenChannel, aStreamListener=0 decoder=7ff4bee35060 stream=7ff49a0558b8|
|OpenChannel, mListener=7ff49a2f9f70|
You will see a channel is opened for stream=7ff49a0558b8, listener=7ff49a2f9f70. However ChannelMediaResource::Listener::OnStartRequest is not called for the listener=7ff49a2f9f70.

When it got stuck in MediaCacheStream::Read, I deliberately called MOZ_ASSERT(false) to make the problem obvious to catch.

ps: refer to https://hg.mozilla.org/try/diff/f57576e7a866/content/media/MediaResource.cpp and  https://hg.mozilla.org/try/diff/f57576e7a866/content/media/MediaCache.cpp for logs I added.
See Also: → 1009500
JW, thanks!  This is a perfect example of an inter module cooperation.  It took 4 minutes to find the cause for me thanks your fantastic pointing to right place in the log.

It's a cache2 bug this time (bug 1009500)

Only unfortunate thing is that this will not fix this bug :(  only the cache2 spike.
Cache 2 spikes fixed with bug 1009500.  Unblocking.
No longer blocks: cache2tests
Disabled test_seek.html on all platforms for being a persistent source of intermittent test failures in multiple bugs:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a95eb3950035
Keywords: leave-open
Whiteboard: [leave open] → [test disabled]
Disabling test_seek is unacceptable. It tests functionality that is too important for us to continue working on the overwhelming backlog of features without test coverage.

It looks like test_seek is mostly failing on b2g and android, so I am going to re-enable on all platforms except b2g and android.
Re-enabled test_seek on platforms other than android and b2g. It's mostly orange on android and b2g.

https://hg.mozilla.org/integration/mozilla-inbound/rev/2937a989dbf8
Agreed with Chris. test_seek is important, and it did catch bugs other than the media-related. (see comment 861)

There is a bug about nsITimer (bug 1033121) which could cause timeouts in all media tests on all platforms. However Android seems to hit the bug more often.

I think the bug will be fixed soon, and then we can see if there is a significant decrease in timeout rate on Android.
(In reply to Chris Pearce (:cpearce) from comment #901)
> Disabling test_seek is unacceptable. It tests functionality that is too
> important for us to continue working on the overwhelming backlog of features
> without test coverage.
> 
> It looks like test_seek is mostly failing on b2g and android, so I am going
> to re-enable on all platforms except b2g and android.

I disabled on other platforms since there are !B2G/Android failures in bug 981153, however they are fewer in number, but I can deal with that as a compromise.

Please can we take this as a sign that the ratio of time spent on new features to time spent on fixing the ones that were not reliable (and/or the unreliable tests for them) needs to be moved more towards the latter? Otherwise we're just going to be having this same conversation in another few dozen media test failure bugs yet again in the future...
Anthony mentioned that splitting this test might help things. I opened bug 1036992 for that.
(In reply to Chris Pearce (:cpearce) from comment #849)
> Part of the reason I'm hesitant to believe that the media code is entirely
> to blame is that in debug builds on Windows test_seek.html can sometimes
> take 45 seconds to load gizmo.mp4 from httpd.js. This is a 445KB file. I
> think this is partially causing our timeouts in this test.

I filed bug 1037172 to follow up on this.
test_seek.html has been split into test_seek-*.html. We can close this bug now.
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Keywords: leave-open
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.