786539 - Intermittent test_timeupdate_small_files.html | Test timed out

Reporter

Description

•

13 years ago

+++ This bug was initially created as a clone of Bug #687972 +++ https://tbpl.mozilla.org/php/getParsedLog.php?id=14792926&tree=Firefox Rev3 WINNT 6.1 mozilla-central opt test mochitests-1/5 on 2012-08-28 18:53:57 PDT for push 0b4bc3760d51 slave: talos-r3-w7-024 130056 INFO TEST-INFO | /tests/content/media/test/test_timeupdate_small_files.html | http://mochi.test:8888/tests/content/media/test/detodos.opus got timeupdate nsBuiltinDecoderStateMachine::RunStateMachine queuing nsBuiltinDecoder::PlaybackEnded 130057 INFO TEST-PASS | /tests/content/media/test/test_timeupdate_small_files.html | detodos.opus - shouldn't get timeupdate after ended 130058 INFO TEST-INFO | /tests/content/media/test/test_timeupdate_small_files.html | http://mochi.test:8888/tests/content/media/test/detodos.opus got timeupdate 130059 INFO TEST-INFO | /tests/content/media/test/test_timeupdate_small_files.html | http://mochi.test:8888/tests/content/media/test/detodos.opus got pause 130060 INFO TEST-PASS | /tests/content/media/test/test_timeupdate_small_files.html | detodos.opus - shouldn't get timeupdate after ended 130061 INFO TEST-INFO | /tests/content/media/test/test_timeupdate_small_files.html | http://mochi.test:8888/tests/content/media/test/detodos.opus got timeupdate 130062 INFO TEST-PASS | /tests/content/media/test/test_timeupdate_small_files.html | detodos.opus - should see at least one timeupdate: 2.9135 130063 INFO TEST-INFO | /tests/content/media/test/test_timeupdate_small_files.html | http://mochi.test:8888/tests/content/media/test/detodos.opus got ended 130064 INFO TEST-PASS | /tests/content/media/test/test_timeupdate_small_files.html | [finished detodos.opus-4] Length of array should match number of running tests - 1 should equal 1 130065 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_timeupdate_small_files.html | Test timed out. (screenshot) 130066 INFO TEST-END | /tests/content/media/test/test_timeupdate_small_files.html | finished in 304592ms 130067 INFO TEST-START | /tests/content/media/test/test_too_many_elements.html nsBuiltinDecoderStateMachine::RunStateMachine queuing nsBuiltinDecoder::PlaybackEnded

Ed Morley [:emorley]

Updated

•

13 years ago

Blocks: 761917

Matthew Gregan [:kinetik]

Comment 1

•

13 years ago

Bug 761917 was Linux only, this is a different failure. There are probably a bunch of bugs like this; the existing bugs were a mess of different similar looking failures. I closed all of them ones that bug 761917 fixed a Linux variant of the failure on, so now we'll get new bugs for each of the existing failures masked previously masked by that problem.

No longer blocks: 761917

Comment hidden (Legacy TBPL/Treeherder Robot)

Ed Morley [:emorley]

Comment 307

•

13 years ago

Toporange with several hundred failures not telling us anything new. Disabled for now on Windows, as part of the drive to reduce our OrangeFactor: https://hg.mozilla.org/integration/mozilla-inbound/rev/47ae2d6ba237

Whiteboard: [orange] → [orange][test disabled on Windows][leave open]

Comment hidden (Legacy TBPL/Treeherder Robot)

Ryan VanderMeulen [:RyanVM]

Comment 311

•

13 years ago

https://hg.mozilla.org/mozilla-central/rev/47ae2d6ba237

Comment hidden (Legacy TBPL/Treeherder Robot)

Nobody; OK to take it and work on it

Updated

•

13 years ago

Keywords: intermittent-failure

Comment hidden (Legacy TBPL/Treeherder Robot)

Nobody; OK to take it and work on it

Updated

•

13 years ago

Whiteboard: [orange][test disabled on Windows][leave open] → [test disabled on Windows][leave open]

Comment hidden (Legacy TBPL/Treeherder Robot)

Ed Morley [:emorley]

Comment 339

•

13 years ago

Disabled on beta: https://hg.mozilla.org/releases/mozilla-beta/rev/616c8eef395e

status-firefox18: --- → disabled

status-firefox19: --- → disabled

Comment hidden (Legacy TBPL/Treeherder Robot)

gcp https://tbpl.mozilla.org/php/getParsedLog.php?id=21867697&tree=Mozilla-Inbound Android Tegra 250 mozilla-inbound opt test mochitest-2 on 2013-04-16 08:34:55 slave: tegra-039 27524 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_timeupdate_small_files.html | Test timed out. 27677 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/webaudio/test/test_currentTime.html | AudioContext.currentTime should have increased by now 27866 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/webspeech/recognition/test/test_abort.html | unexpected event: error -- No speech detected (timeout) 27871 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/webspeech/recognition/test/test_abort.html | unexpected event: error -- No speech detected (timeout) 27876 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/webspeech/recognition/test/test_abort.html | unexpected event: error -- No speech detected (timeout) 27881 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/webspeech/recognition/test/test_abort.html | unexpected event: error -- No speech detected (timeout) 27886 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/webspeech/recognition/test/test_abort.html | unexpected event: error -- No speech detected (timeout) 27901 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/webspeech/recognition/test/test_call_start_from_end_handler.html | unexpected event: error -- No speech detected (timeout) 27916 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/webspeech/recognition/test/test_recognition_service_error.html | expect correct error code - got 0, expected 3 27917 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/webspeech/recognition/test/test_recognition_service_error.html | Test timed out. 27924 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/webspeech/recognition/test/test_success_without_recognition_service.html | unexpected event: error -- No speech detected (timeout) 27932 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/webspeech/recognition/test/test_timeout.html | Test timed out. 28099 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/webspeech/synth/test/test_speech_simple.html | Test timed out. 28100 ERROR TEST-UNEXPECTED-FAIL | (SimpleTest/TestRunner.js) | 4 test timeouts, giving up. 28101 ERROR TEST-UNEXPECTED-FAIL | (SimpleTest/TestRunner.js) | Skipping 59 remaining tests. 28102 ERROR TEST-UNEXPECTED-FAIL | (SimpleTest/TestRunner.js) | /tests/content/media/webspeech/synth/test/test_speech_simple.html finished in a non-clean fashion, probably because it didn't call SimpleTest.finish() TEST-UNEXPECTED-FAIL | Shutdown | application ran for longer than allowed maximum time

Comment hidden (Legacy TBPL/Treeherder Robot)

Ryan VanderMeulen [:RyanVM]

Comment 373

•

12 years ago

Any idea why this has spiked so significantly on OSX over the last day?

Flags: needinfo?(paul)

Flags: needinfo?(cpearce)

Comment hidden (Legacy TBPL/Treeherder Robot)

Ryan VanderMeulen [:RyanVM]

Comment 375

•

12 years ago

Paul, I'm wondering if bug 861936 is to blame for the spike.

Comment hidden (Legacy TBPL/Treeherder Robot)

Chris Pearce [:cpearce (Not reading bugmail)]

Comment 386

•

12 years ago

Paul's suggestion seems the most likely.

Flags: needinfo?(cpearce)

Comment hidden (Legacy TBPL/Treeherder Robot)

Ryan VanderMeulen [:RyanVM]

Comment 395

•

12 years ago

Going back on inbound, bug 834835 appears to be responsible for the recent OSX spike.

Comment hidden (Legacy TBPL/Treeherder Robot)

Ryan VanderMeulen [:RyanVM]

Comment 405

•

12 years ago

(In reply to Ryan VanderMeulen [:RyanVM] from comment #395) > Going back on inbound, bug 834835 appears to be responsible for the recent > OSX spike. So I had a backout for this in my patch queue which I inadvertently pushed along with some other patches. No failures on OSX since (even with numerous retriggers).

Comment hidden (Legacy TBPL/Treeherder Robot)

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 407

•

12 years ago

cpearce.

Assignee: nobody → cpearce

Randell Jesup [:jesup] (needinfo me)

Updated

•

12 years ago

Blocks: 834835

Chris Pearce [:cpearce (Not reading bugmail)]

Comment 408

•

12 years ago

Attached patch Patch v1: Fix (obsolete) — Details — Splinter Review

I have been able to reproduce this test reliably locally. The problem seems to be that we're firing "canplay" but not autoplaying under some conditions. test_timeupdate relies on autoplay working. I think the problem is that HTMLMediaElement::CanActivateAutoplay() is returning false because the resource's download is "suspended", and so we never reach HAVE_ENOUGH_DATA readyState and so never start playing. I think this can happen when the decoder/resource is cloned. This patch also re-enables the test on Windows. Looks greenish, modulo known orange: https://tbpl.mozilla.org/?tree=Try&pusher=cpearce@mozilla.com

Attachment #740522 - Flags: review?(roc)

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 409

•

12 years ago

A cloned resource should reach HAVE_ENOUGH_DATA just like any other resource ... cloning should not affect the observable behavior.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 410

•

12 years ago

My debugging seems to show that HTMLMediaElement::UpdateReadyStateForData with a cloned decoder is going wrong. We get to if (stats.mTotalBytes < 0 ? stats.mDownloadRateReliable : stats.mTotalBytes == stats.mDownloadPosition || mDecoder->CanPlayThrough()) which stays false. stats.mTotalBytes is correct, but stats.mDownloadPosition is short of mTotalBytes. Also, mDecoder->CanPlayThrough() returns false.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 411

•

12 years ago

More information about what's going on: 1) An HTTP resource is opened and we read from it for a while. 2) For some reason (cache pressure probably) we suspend reading from the channel temporarily. 3) Then we resume reading. This causes MediaResource::mReopenOnError to be set, since the server might have timed out our connection while we were suspended and we want to silently reopen in that case. 4) We reach the end of the stream and get OnStopRequest from the HTTP stack. 5) This comment kicks in: // Note that aStatus might have succeeded --- this might be a normal close // --- even in situations where the server cut us off because we were // suspended. So we need to "reopen on error" in that case too. and we seek to the end of the stream and try to read some more, instead of sending NotifyDataEnded as you'd expect. 6) Meanwhile, we've cloned the resource for another media element. 7) Some of the underlying data has been evicted from the cache. We read some data for the new element, but then the cache is full, so we suspend reading. 8) We call MediaCacheStream::AreAllStreamsForResourceSuspended to see if we should just fire autoplay now to avoid the "never autoplay because there isn't room in the cache" problem. 9) That method sees that the original resource is still open and able to read from its HTTP channel, so it returns false. 10) We're deadlocked. One possible fix here is to make AreAllStreamsForResourceSuspended a bit smarter so streams that are reading ahead of the data we need are ignored ... after all, they're not going to provide the data we need. This does fix the bug. However, I also want to look into what happens to the seek triggered in step 5. It seems to me that seek should eventually result in another OnStopRequest, this time with mReopenOnError false, but it doesn't seem to.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 412

•

12 years ago

Attached patch Part 1: When receiving an HTTP 416 in response to a seek, tell the media cache that we reached the end of the resource — Details — Splinter Review

Assignee: cpearce → roc

Attachment #740522 - Attachment is obsolete: true

Attachment #740522 - Flags: review?(roc)

Attachment #741164 - Flags: review?(cpearce)

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 413

•

12 years ago

Attached patch Part 2: AreAllStreamsForResourceSuspended should ignore streams that can't read the resource data we need — Details — Splinter Review

Attachment #741165 - Flags: review?(cpearce)

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 414

•

12 years ago

Attached patch Part 3: Reenable test on Windows — Details — Splinter Review

Attachment #741166 - Flags: review?(cpearce)

Chris Pearce [:cpearce (Not reading bugmail)]

Updated

•

12 years ago

Attachment #741164 - Flags: review?(cpearce) → review+

Chris Pearce [:cpearce (Not reading bugmail)]

Comment 415

•

12 years ago

Comment on attachment 741165 [details] [diff] [review] Part 2: AreAllStreamsForResourceSuspended should ignore streams that can't read the resource data we need Review of attachment 741165 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/media/MediaCache.cpp @@ +1829,5 @@ > return mIsTransportSeekable; > } > > bool > MediaCacheStream::AreAllStreamsForResourceSuspended(MediaResource** aActiveStream) AFAICT aActiveStream isn't used by any of the caller of this function, so we should remove aActiveStream sometime.

Attachment #741165 - Flags: review?(cpearce) → review+

Chris Pearce [:cpearce (Not reading bugmail)]

Updated

•

12 years ago

Attachment #741166 - Flags: review?(cpearce) → review+

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 416

•

12 years ago

https://tbpl.mozilla.org/?tree=Try&rev=8d015f7a4dec

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 417

•

12 years ago

Tryserver showed an intermittent failure in test_timeupdate_small_files on Windows XP in 1 run out of 10, but it's not a test timeout. https://hg.mozilla.org/integration/mozilla-inbound/rev/8668a6b76ef2 https://hg.mozilla.org/integration/mozilla-inbound/rev/3d8b98f8a48c https://hg.mozilla.org/integration/mozilla-inbound/rev/756845df5849

Flags: needinfo?(paul)

Whiteboard: [test disabled on Windows][leave open]

Ryan VanderMeulen [:RyanVM]

Comment 418

•

12 years ago

https://hg.mozilla.org/mozilla-central/rev/8668a6b76ef2 https://hg.mozilla.org/mozilla-central/rev/3d8b98f8a48c https://hg.mozilla.org/mozilla-central/rev/756845df5849

Status: NEW → RESOLVED

Closed: 12 years ago

Flags: in-testsuite+

Resolution: --- → FIXED

Target Milestone: --- → mozilla23

Randell Jesup [:jesup] (needinfo me)

Comment 419

•

12 years ago

Comment on attachment 741164 [details] [diff] [review] Part 1: When receiving an HTTP 416 in response to a seek, tell the media cache that we reached the end of the resource cpearce, roc correct any of this if it's wrong, thanks! [Approval Request Comment] Bug caused by (feature/regressing bug #): don't know User impact if declined: intermittent problems with small files (which was why it triggered oranges). Blocks bug 834835 from landing in Aurora, which is needed for WebRTC. Testing completed (on m-c, etc.): On m-c and tests (re)enabled Risk to taking this patch (and alternatives if risky): very low risk String or IDL/UUID changes made by this patch: none Will need to land with the other two parts

Attachment #741164 - Flags: approval-mozilla-aurora?

Randell Jesup [:jesup] (needinfo me)

Comment 420

•

12 years ago

Comment on attachment 741165 [details] [diff] [review] Part 2: AreAllStreamsForResourceSuspended should ignore streams that can't read the resource data we need [Approval Request Comment] See patch 1 in this bug

Attachment #741165 - Flags: approval-mozilla-aurora?

Randell Jesup [:jesup] (needinfo me)

Comment 421

•

12 years ago

Comment on attachment 741166 [details] [diff] [review] Part 3: Reenable test on Windows [Approval Request Comment] See patch 1 in this bug

Attachment #741166 - Flags: approval-mozilla-aurora?

Alex Keybl [:akeybl]

Comment 422

•

12 years ago

Approving in support of a tracked bug.

status-firefox21: --- → disabled

status-firefox22: --- → affected

tracking-firefox22: --- → +

Alex Keybl [:akeybl]

Updated

•

12 years ago

Attachment #741164 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+

Alex Keybl [:akeybl]

Updated

•

12 years ago

Attachment #741165 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+

Alex Keybl [:akeybl]

Updated

•

12 years ago

Attachment #741166 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+

Randell Jesup [:jesup] (needinfo me)

Comment 423

•

12 years ago

https://hg.mozilla.org/releases/mozilla-aurora/rev/8cb6596a95a1 https://hg.mozilla.org/releases/mozilla-aurora/rev/193c3cfc4bcb https://hg.mozilla.org/releases/mozilla-aurora/rev/144a21215519

Patch v1: Fix 12 years ago Chris Pearce [:cpearce (Not reading bugmail)] 3.24 KB, patch		Details \| Diff \| Splinter Review
Part 1: When receiving an HTTP 416 in response to a seek, tell the media cache that we reached the end of the resource 12 years ago Robert O'Callahan (:roc) (email my personal email if necessary) 1.65 KB, patch	cpearce : review+ akeybl : approval-mozilla-aurora+	Details \| Diff \| Splinter Review
Part 2: AreAllStreamsForResourceSuspended should ignore streams that can't read the resource data we need 12 years ago Robert O'Callahan (:roc) (email my personal email if necessary) 3.97 KB, patch	cpearce : review+ akeybl : approval-mozilla-aurora+	Details \| Diff \| Splinter Review
Part 3: Reenable test on Windows 12 years ago Robert O'Callahan (:roc) (email my personal email if necessary) 1.36 KB, patch	cpearce : review+ akeybl : approval-mozilla-aurora+	Details \| Diff \| Splinter Review