Closed
Bug 651023
Opened 13 years ago
Closed 13 years ago
Intermittent Linux & Linux64 test_playback.html | application timed out after 330 seconds with no output
Categories
(Core :: Audio/Video, defect)
Tracking
()
RESOLVED
FIXED
mozilla6
People
(Reporter: philor, Assigned: cpearce)
References
Details
(Keywords: intermittent-failure)
Attachments
(1 file, 1 obsolete file)
2.39 KB,
patch
|
kinetik
:
review+
|
Details | Diff | Splinter Review |
Looks like the first one actually found its way to bug 620326, http://tinderbox.mozilla.org/showlog.cgi?log=Devtools/1302883693.1302884633.21846.gz Rev3 Fedora 12x64 devtools opt test mochitests-1/5 on 2011/04/15 09:08:13 s: talos-r3-fed64-049 TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_playback.html | application timed out after 330 seconds with no output PROCESS-CRASH | /tests/content/media/test/test_playback.html | application crashed (minidump found) Thread 0 (crashed) followed by http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1303155389.1303158376.6513.gz Rev3 Fedora 12x64 mozilla-central debug test mochitests-1/5 on 2011/04/18 12:36:29 s: talos-r3-fed64-013 63558 INFO TEST-PASS | /tests/content/media/test/test_playback.html | bug482461-theora.ogv time should run forwards: p=0.000264 c=0.26722 TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_playback.html | application timed out after 330 seconds with no output and http://tinderbox.mozilla.org/showlog.cgi?log=Cedar/1303178059.1303179058.6211.gz Rev3 Fedora 12x64 cedar opt test mochitests-1/5 on 2011/04/18 18:54:19 s: talos-r3-fed64-049 63561 INFO TEST-PASS | /tests/content/media/test/test_playback.html | bug500311.ogv time should run forwards: p=1.37525 c=1.667833 TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_playback.html | application timed out after 330 seconds with no output
Comment 1•13 years ago
|
||
There were content/media changes landed around the time of the first timeout. The main change would be bug 641718. Only the first of the failures has a stack track, but I can't see anything obvious there, unless the threads are livelocked in an UpdateReadyState/PlaybackPositionChanged dance. The AudioLoop is asleep in Wait(), another thread is dispatch an event to the main thread from AdvanceFrame/UpdateReadyState, and another is dispatching an event to the main thread from nsBuiltinDecoderStateMachine::Run. The main thread is waiting on the decoder monitor in PlaybackPositionChanged.
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 5•13 years ago
|
||
(In reply to comment #1) > Only the first of the failures has a stack track, but I can't see anything > obvious there, unless the threads are livelocked in an > UpdateReadyState/PlaybackPositionChanged dance. Caught this in a debugger and it looks like this is what's happening. AdvanceFrame's remainingTime calculation is negative (because now > frame->mTime) and clock_time from the audio stream is not updating. This results in AdvanceFrame returning without sleeping then being immediately called again. Each time, it calls UpdatePlaybackPosition which causes the main thread to call PlaybackPositionChanged.
Comment 6•13 years ago
|
||
Lost my debugging session before I solved this and haven't been able to reproduce it again yet. But I think the problem is: For 44.1kHz audio, there are floor(22.676) usecs per sample. This means clock_time has a granularity of no higher than 22us (one sample). I think this is a situation where the difference between the next frame time and the clock time is less than 22us, so we wait in a tight loop.
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 12•13 years ago
|
||
Comment 7, 9, 10, and 11 are on 32-bit Linux. Updating bug summary.
Summary: Intermittent Linux64 test_playback.html | application timed out after 330 seconds with no output → Intermittent Linux & Linux64 test_playback.html | application timed out after 330 seconds with no output
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 23•13 years ago
|
||
(In reply to comment #6) > For 44.1kHz audio, there are floor(22.676) usecs per sample. This means > clock_time has a granularity of no higher than 22us (one sample). I think this > is a situation where the difference between the next frame time and the clock > time is less than 22us, so we wait in a tight loop. I've caught this in the debugger as well, and this seems to be what is happening. The audio thread hasn't push more the mAudioStream->GetMinWriteSamples() worth of audio to the hardware, and it's also blocked waiting to acquire the decoder monitor, so the audio clock isn't advancing. I would have expected the ReentrantMonitorAutoEnter in the Run() loop to drop the monitor to allow the other threads to take the monitor... Perhaps in AdvanceFrame() we should ensure remainingTime is always at least 1ms? This means that video frames could be late by almost 1ms during playback. Or we could round clock_time down to the nearest ms in the "Skip frames up to the frame at playback postion" loop, so that we present frames slightly early, and so wait longer. Due to the delay in our rendering pipeline, I doubt this would be noticeable.
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 27•13 years ago
|
||
Try what I suggested in comment 23. Seems to work. This means frames may be slightly late(r), but I doubt it will be noticeable given the existing delays in our rendering pipeline. I ran test_playback for several hours on my linux box, and could no longer trigger the intermittent failure. Greenish on Try: http://tbpl.mozilla.org/?tree=Try&rev=54d99e77149e
Comment 28•13 years ago
|
||
Comment on attachment 531773 [details] [diff] [review] Patch: Round up wait times to 1ms to ensure we drop the decoder monitor when presenting video frames. It might be better to perform this round up in Wait() itself.
Assignee | ||
Comment 29•13 years ago
|
||
(In reply to comment #28) > It might be better to perform this round up in Wait() itself. Probably a good idea. Looks like it's possible we could hit this same problem in the Wait() call while draining at the end of AudioLoop().
Attachment #531773 -
Attachment is obsolete: true
Attachment #531773 -
Flags: review?(kinetik)
Attachment #531818 -
Flags: review?(kinetik)
Updated•13 years ago
|
Attachment #531818 -
Flags: review?(kinetik) → review+
Assignee | ||
Comment 30•13 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/d15d6f06eeb6
Assignee | ||
Comment 31•13 years ago
|
||
I declare victory!
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Assignee | ||
Updated•13 years ago
|
Target Milestone: --- → mozilla6
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Updated•12 years ago
|
Keywords: intermittent-failure
Updated•12 years ago
|
Whiteboard: [orange]
You need to log in
before you can comment on or make changes to this bug.
Description
•