Closed
Bug 1403307
Opened 7 years ago
Closed 6 years ago
Intermittent dom/media/test/test_mediarecorder_pause_resume_video.html | Last frame should be green
Categories
(Core :: Audio/Video: Recording, defect, P2)
Core
Audio/Video: Recording
Tracking
()
RESOLVED
FIXED
mozilla59
Tracking | Status | |
---|---|---|
firefox59 | --- | fixed |
People
(Reporter: intermittent-bug-filer, Assigned: bryce)
References
Details
(Keywords: intermittent-failure, Whiteboard: [stockwell disabled][Pernosco])
Attachments
(1 file)
Filed by: wkocher [at] mozilla.com https://treeherder.mozilla.org/logviewer.html#?job_id=133374373&repo=autoland https://archive.mozilla.org/pub/firefox/tinderbox-builds/autoland-win64-debug/autoland_win10_64-debug_test-mochitest-media-e10s-bm111-tests1-windows-build102.txt.gz
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 6•7 years ago
|
||
this is failing at a very high failure rate, 135 failures in the last week alone- all on debug, but a variety of platforms. here is a recent windows10 log: https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-central&job_id=134865118&lineNumber=10087 and related text from the log: 03:24:40 INFO - TEST-PASS | dom/media/test/test_mediarecorder_pause_resume_video.html | Media recorder should be paused 03:24:40 INFO - Drawing color 0,0,255,1 03:24:40 INFO - Buffered messages logged at 03:24:38 03:24:40 INFO - Drawing color 0,255,0,1 03:24:40 INFO - Buffered messages logged at 03:24:39 03:24:40 INFO - Got 'dataavailable' event 03:24:40 INFO - Got 'stop' event 03:24:40 INFO - TEST-PASS | dom/media/test/test_mediarecorder_pause_resume_video.html | Media recorder should be incative after stop 03:24:40 INFO - TEST-PASS | dom/media/test/test_mediarecorder_pause_resume_video.html | Expected 1 dataavailable event 03:24:40 INFO - TEST-PASS | dom/media/test/test_mediarecorder_pause_resume_video.html | Should have gotten a data blob 03:24:40 INFO - TEST-PASS | dom/media/test/test_mediarecorder_pause_resume_video.html | video element should be same size as canvas once metadata is loaded 03:24:40 INFO - Buffered messages finished 03:24:40 INFO - TEST-UNEXPECTED-FAIL | dom/media/test/test_mediarecorder_pause_resume_video.html | Last frame should be green 03:24:40 INFO - startTest/mediaRecorder.onstop/video.onended@dom/media/test/test_mediarecorder_pause_resume_video.html:104:7 03:24:40 INFO - EventHandlerNonNull*startTest/mediaRecorder.onstop@dom/media/test/test_mediarecorder_pause_resume_video.html:103:5 03:24:40 INFO - EventHandlerNonNull*startTest@dom/media/test/test_mediarecorder_pause_resume_video.html:81:3 03:24:40 INFO - @dom/media/test/test_mediarecorder_pause_resume_video.html:125:1 03:24:40 INFO - GECKO(1416) | MEMORY STAT | vsize 1700MB | vsizeMaxContiguous 132540028MB | residentFast 109MB | heapAllocated 21MB 03:24:40 INFO - TEST-OK | dom/media/test/test_mediarecorder_pause_resume_video.html | took 4255ms :jib, I see you are the triage owner for av:recording, can you help prioritize this bug and get this in the queue for the right person to look at it?
Flags: needinfo?(jib)
Whiteboard: [stockwell needswork]
Comment hidden (Intermittent Failures Robot) |
Pushed by jmaher@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/f8a337cc1e67 Disable dom/media/test/test_mediarecorder_pause_resume_video.html on debug for frequent failures. r=me, a=test-only
Comment 9•7 years ago
|
||
please remember to re-enable this test when working on it.
Keywords: leave-open
Whiteboard: [stockwell needswork] → [stockwell disabled]
Comment 10•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/f8a337cc1e67
Comment 11•7 years ago
|
||
Bryce, this is a new test you added in bug 1354457. Please take a look. We should turn off the test temporarily if there's not a quick fix, then we can lower priority if needed.
Assignee: nobody → bvandyk
Rank: 10
Flags: needinfo?(jib) → needinfo?(bvandyk)
Priority: P5 → P1
Assignee | ||
Comment 12•7 years ago
|
||
I suspect this is a regression due to bug 1296531 landing based on the dates involved. I can repro locally, but it's fairly intermittent. Rough test schedule: - Draw red - Record (red) - Pause - Draw blue - Draw green - Resume (green) - Stop The screenshots indicate the last frame of the video is red, which means the recorder hasn't resumed and captured any of the green frames. Don't know why yet, will continue to look.
Flags: needinfo?(bvandyk)
Comment 13•7 years ago
|
||
If you adjust the timing of drawing green vs resume() (i.e., increase the nr of rAFs in between, say 72 instead of 62), does that affect the result?
Assignee | ||
Comment 14•7 years ago
|
||
Hard to be sure due to the intermittence (100s of iterations without failures for values I know can fail), but here are some results we me changing the resume value (while leaving draw green at 60): 62 (default): can repro 63: can repro 64: can repro 65+: haven't been able to repro If I change the value of when we draw green to 30 or 50 I'm unable to repro at all across a range of resume values. Due to the intermittence I'd talk these with some big grains of salt, but hopefully it's more useful than no data.
Comment 15•7 years ago
|
||
Clearly there's a larger than anticipated delay between drawing a frame (it should be captured at the end of requestAnimationFrame()), until it gets into MSG (on the next MSG iteration in NotifyPull()) and further all the way through MediaRecorder (the encoder TaskQueue basically). Some logging that would be useful would be when a new frame arrives to the TaskQueue vs Resume(); when a new frame is handled by the TaskQueue vs Resume(); and when we forward mLastChunk to the encoder on end-of-stream.
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 17•7 years ago
|
||
Could you elaborate on what you think is going on with the pause and resume and the delays. I would have thought if we're seeing delays in delivery of frames then shifting the resume to take place later would not alleviate the issue. Questions to help clarify my understanding and allow me to look at/add appropriate logs: by the time requestAnimationFrame has run draw(), the color of the canvas will be stable, as will the state of the recorder? How does/should this interact with the MSG and drivers? Do we have any guarantees that drivers don't tick during JS execution (I believe not)? The MSG performs a pull on the data from the canvas (via NotifyPull)? This data is used by SetCurrentFrames (MSG thread) and sent to AppendVideoSegment (MediaRecorder thread), further down the MSG processing pipeline this data is again processed by NotifyQueuedChanges (MSG thread) and sent to AdvanceCurrentTime (MediaRecorder thread), which we use to null queued data when suspended? As above, is there any sync here between the main thread and these? Does the MSG need to hit mainthread to pull form the canvas? Chrome seem to do some syncing to make sure their recorder and canvas are synced (this pen is an extreme example: https://codepen.io/SingingTree/full/eGZdQX/). I need to investigate what happens if we're doing drawing with workers. Despite that the regression concerns me, I wonder also if this test is entirely sane given what I've learnt since first writing it.
Comment 18•7 years ago
|
||
When we draw() in the requestAnimationFrame() callback, there's not really an async step until the capture happens. It happens late during the RefreshDriver Tick on main thread (rAF happens earlier). However, when we capture a frame (let's call this A), we don't immediately append it to the captured track, [1]. It takes until the next MSG iteration's NotifyPull() to append the new frame, [2] (let's call this B). (note that in the long term this will change to append immediately -- we need to get rid of the notion of a duration from video chunks first, and preferably some locking from AppendToTrack. But doing this should help your codepen with sync.) When this AppendToTrack() happens, MediaRecorder receives the new frame immediately (since it uses a direct track listener) and dispatches the frame to VideoTrackEncoder::AppendVideoSegment() on the encoder thread, [3] (let's call this C) As far as races go, a frame takes this path: A - MainThread B - MSG thread C - Encoder thread While Resume() takes this path: MediaRecorder::Resume() - MainThread TrackEncoder::Resume() - Encoder thread B is the difference here, and B is dependent on audio callbacks for the current output device (when audio is present in the MSG, or on a timer if there's no audio -- we try to keep iterations to 10ms, [4]). Since you managed to get rid of the intermittent behavior by only changing timing of draw() or Resume(), I assume the race is somehow between B and Resume(), and not between B and Stop(). As the next step I'd look for anomalies in the timing between B and Resume() as they come in on the encoder thread (wherever you can repro). If you see a pattern there it should be simple to reproduce in a gtest and if needed debug that with RR. VideoTrackEncoder::AdvanceCurrentTime might be interesting to log too, since it also reads mSuspended, and it tells you how long the MSG iterations are. [1] http://searchfox.org/mozilla-central/rev/40b456626e2d0409b7034768b4d9526fc7235ea4/dom/media/CanvasCaptureMediaStream.cpp#208 [2] http://searchfox.org/mozilla-central/rev/40b456626e2d0409b7034768b4d9526fc7235ea4/dom/media/CanvasCaptureMediaStream.cpp#65 [3] http://searchfox.org/mozilla-central/rev/40b456626e2d0409b7034768b4d9526fc7235ea4/dom/media/encoder/TrackEncoder.cpp#453 [4] http://searchfox.org/mozilla-central/rev/40b456626e2d0409b7034768b4d9526fc7235ea4/dom/media/GraphDriver.h#38
Comment hidden (Intermittent Failures Robot) |
Updated•7 years ago
|
Priority: P1 → P2
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 30•7 years ago
|
||
All current failures are linux/asan. There was an increase in failure frequency around Dec 18.
Keywords: leave-open
Whiteboard: [stockwell disabled] → [stockwell needswork]
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 34•7 years ago
|
||
There have been 34 failures in the last week. This is failing on Linux x64 / asan, with a single exception on linux64-stylo-disabled / opt. Here is a recent log file: https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-central&job_id=153534459&lineNumber=10999 and a snippet with the failure: [task 2017-12-28T22:41:41.898Z] 22:41:41 INFO - TEST-PASS | dom/media/test/test_mediarecorder_pause_resume_video.html | Media recorder should be incative after stop 10995 [task 2017-12-28T22:41:41.899Z] 22:41:41 INFO - TEST-PASS | dom/media/test/test_mediarecorder_pause_resume_video.html | Expected 1 dataavailable event 10996 [task 2017-12-28T22:41:41.900Z] 22:41:41 INFO - TEST-PASS | dom/media/test/test_mediarecorder_pause_resume_video.html | Should have gotten a data blob 10997 [task 2017-12-28T22:41:41.900Z] 22:41:41 INFO - TEST-PASS | dom/media/test/test_mediarecorder_pause_resume_video.html | video element should be same size as canvas once metadata is loaded 10998 [task 2017-12-28T22:41:41.903Z] 22:41:41 INFO - Buffered messages finished 10999 [task 2017-12-28T22:41:41.904Z] 22:41:41 INFO - TEST-UNEXPECTED-FAIL | dom/media/test/test_mediarecorder_pause_resume_video.html | Last frame should be green 11000 [task 2017-12-28T22:41:41.904Z] 22:41:41 INFO - startTest/mediaRecorder.onstop/video.onended@dom/media/test/test_mediarecorder_pause_resume_video.html:104:7 11001 [task 2017-12-28T22:41:41.904Z] 22:41:41 INFO - EventHandlerNonNull*startTest/mediaRecorder.onstop@dom/media/test/test_mediarecorder_pause_resume_video.html:103:5 11002 [task 2017-12-28T22:41:41.904Z] 22:41:41 INFO - EventHandlerNonNull*startTest@dom/media/test/test_mediarecorder_pause_resume_video.html:81:3 11003 [task 2017-12-28T22:41:41.904Z] 22:41:41 INFO - @dom/media/test/test_mediarecorder_pause_resume_video.html:125:1 11004 [task 2017-12-28T22:41:41.904Z] 22:41:41 INFO - GECKO(1754) | MEMORY STAT | vsize 20973779MB | residentFast 1200MB 11005 [task 2017-12-28T22:41:41.905Z] 22:41:41 INFO - TEST-OK | dom/media/test/test_mediarecorder_pause_resume_video.html | took 4300ms
Comment hidden (Intermittent Failures Robot) |
Comment 36•6 years ago
|
||
This has failed 42 times in the last 7 days. It occurred only on Linux x64 platform affecting asan build type. Failing test suite: only on opt-mochitest-media-e10s. Link to a recent log: https://treeherder.mozilla.org/logviewer.html#?repo=autoland&job_id=153630587&lineNumber=11024 Part of that log: [task 2017-12-30T23:07:28.139Z] 23:07:28 INFO - Got 'dataavailable' event 11018 [task 2017-12-30T23:07:28.139Z] 23:07:28 INFO - Got 'stop' event 11019 [task 2017-12-30T23:07:28.140Z] 23:07:28 INFO - TEST-PASS | dom/media/test/test_mediarecorder_pause_resume_video.html | Media recorder should be incative after stop 11020 [task 2017-12-30T23:07:28.141Z] 23:07:28 INFO - TEST-PASS | dom/media/test/test_mediarecorder_pause_resume_video.html | Expected 1 dataavailable event 11021 [task 2017-12-30T23:07:28.142Z] 23:07:28 INFO - TEST-PASS | dom/media/test/test_mediarecorder_pause_resume_video.html | Should have gotten a data blob 11022 [task 2017-12-30T23:07:28.143Z] 23:07:28 INFO - TEST-PASS | dom/media/test/test_mediarecorder_pause_resume_video.html | video element should be same size as canvas once metadata is loaded 11023 [task 2017-12-30T23:07:28.143Z] 23:07:28 INFO - Buffered messages finished 11024 [task 2017-12-30T23:07:28.143Z] 23:07:28 INFO - TEST-UNEXPECTED-FAIL | dom/media/test/test_mediarecorder_pause_resume_video.html | Last frame should be green 11025 [task 2017-12-30T23:07:28.144Z] 23:07:28 INFO - startTest/mediaRecorder.onstop/video.onended@dom/media/test/test_mediarecorder_pause_resume_video.html:104:7 11026 [task 2017-12-30T23:07:28.145Z] 23:07:28 INFO - EventHandlerNonNull*startTest/mediaRecorder.onstop@dom/media/test/test_mediarecorder_pause_resume_video.html:103:5 11027 [task 2017-12-30T23:07:28.145Z] 23:07:28 INFO - EventHandlerNonNull*startTest@dom/media/test/test_mediarecorder_pause_resume_video.html:81:3 11028 [task 2017-12-30T23:07:28.145Z] 23:07:28 INFO - @dom/media/test/test_mediarecorder_pause_resume_video.html:125:1 11029 [task 2017-12-30T23:07:28.146Z] 23:07:28 INFO - GECKO(1639) | MEMORY STAT | vsize 20973777MB | residentFast 905MB 11030 [task 2017-12-30T23:07:28.146Z] 23:07:28 INFO - TEST-OK | dom/media/test/test_mediarecorder_pause_resume_video.html | took 4301ms 11031 [task 2017-12-30T23:07:28.147Z] 23:07:28 INFO - TEST-START | dom/media/test/test_mediarecorder_playback_can_repeat.html 11032 [task 2017-12-30T23:07:28.296Z] 23:07:28 INFO - GECKO(1639) | [DEBUG SHUTDOWN] Shutdown: decoder=0x61600044d080 state machine=0x61d000633880 11033 [task 2017-12-30T23:07:28.298Z] 23:07:28 INFO - GECKO(1639) | [DEBUG SHUTDOWN] Enter: state machine=0x6020003f6950 reader=0x60d00001cb50 11034 [task 2017-12-30T23:07:28.299Z] 23:07:28 INFO - GECKO(1639) | [DEBUG SHUTDOWN] Shutdown: reader=0x61e000048c80 shutdown demuxer=0x6020005a3150 11035 [task 2017-12-30T23:07:28.300Z] 23:07:28 INFO - GECKO(1639) | [DEBUG SHUTDOWN] Shutdown: pool=0x60400030ab90 count=1 11036 [task 2017-12-30T23:07:28.301Z] 23:07:28 INFO - GECKO(1639) | [DEBUG SHUTDOWN] operator(): pool=0x60400030ab90 shutdown=true count=0 11037 [task 2017-12-30T23:07:28.301Z] 23:07:28 INFO - GECKO(1639) | [DEBUG SHUTDOWN] TearDownDecoders: reader=0x61e000048c80 shut down audio task queue 11038 [task 2017-12-30T23:07:28.302Z] 23:07:28 INFO - GECKO(1639) | [DEBUG SHUTDOWN] TearDownDecoders: reader=0x61e000048c80 shut down video task queue 11039 [task 2017-12-30T23:07:28.303Z] 23:07:28 INFO - GECKO(1639) | [DEBUG SHUTDOWN] FinishShutdown: state machine=0x61d000633880 11040 [task 2017-12-30T23:07:28.304Z] 23:07:28 INFO - GECKO(1639) | [DEBUG SHUTDOWN] Unregister: decoder=0x61600044d080, count=1 11041 [task 2017-12-30T23:07:28.999Z] 23:07:28 INFO - GECKO(1639) | [DEBUG SHUTDOWN] Register: decoder=0x616000501680, count=2 11042 [task 2017-12-30T23:07:30.111Z] 23:07:30 INFO - GECKO(1639) | MEMORY STAT | vsize 20973770MB | residentFast 889MB 11043 [task 2017-12-30T23:07:30.128Z] 23:07:30 INFO - TEST-OK | dom/media/test/test_mediarecorder_playback_can_repeat.html | took 2428ms 11044 [task 2017-12-30T23:07:30.188Z] 23:07:30 INFO - TEST-START | dom/media/test/test_mediarecorder_record_4ch_audiocontext.html 11045 [task 2017-12-30T23:07:30.254Z] 23:07:30 INFO - GECKO(1639) | [DEBUG SHUTDOWN] Shutdown: decoder=0x616000209480 state machine=0x61d000c61680 11046 [task 2017-12-30T23:07:30.255Z] 23:07:30 INFO - GECKO(1639) | [DEBUG SHUTDOWN] Enter: state machine=0x602000613670 reader=0x60d00025b590 11047 [task 2017-12-30T23:07:30.255Z] 23:07:30 INFO - GECKO(1639) | [DEBUG SHUTDOWN] Shutdown: reader=0x61e000085880 shutdown demuxer=0x602000629c90 11048 [task 2017-12-30T23:07:30.256Z] 23:07:30 INFO - GECKO(1639) | [DEBUG SHUTDOWN] Shutdown: pool=0x6040001fc390 count=1 11049 [task 2017-12-30T23:07:30.256Z] 23:07:30 INFO - GECKO(1639) | [DEBUG SHUTDOWN] operator(): pool=0x6040001fc390 shutdown=true count=0 11050 [task 2017-12-30T23:07:30.256Z] 23:07:30 INFO - GECKO(1639) | [DEBUG SHUTDOWN] TearDownDecoders: reader=0x61e000085880 shut down audio task queue 11051 [task 2017-12-30T23:07:30.257Z] 23:07:30 INFO - GECKO(1639) | [DEBUG SHUTDOWN] TearDownDecoders: reader=0x61e000085880 shut down video task queue 11052 [task 2017-12-30T23:07:30.259Z] 23:07:30 INFO - GECKO(1639) | [DEBUG SHUTDOWN] FinishShutdown: state machine=0x61d000c61680 11053 [task 2017-12-30T23:07:30.260Z] 23:07:30 INFO - GECKO(1639) | [DEBUG SHUTDOWN] Unregister: decoder=0x616000209480, count=1 11054 [task 2017-12-30T23:07:31.478Z] 23:07:31 INFO - GECKO(1639) | MEMORY STAT | vsize 20973795MB | residentFast 901MB 11055 [task 2017-12-30T23:07:31.494Z] 23:07:31 INFO - TEST-OK | dom/media/test/test_mediarecorder_record_4ch_audiocontext.html | took 1302ms 11056 [task 2017-12-30T23:07:31.559Z] 23:07:31 INFO - TEST-START | dom/media/test/test_mediarecorder_record_addtracked_stream.html
Whiteboard: [stockwell unknown] → [stockwell needswork]
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (mozreview-request) |
Updated•6 years ago
|
Flags: needinfo?(jib)
Assignee | ||
Comment 41•6 years ago
|
||
mozreview-review |
Comment on attachment 8940916 [details] Bug 1403307 - Disable dom/media/test/test_mediarecorder_pause_resume_video.html on linux64 as well for frequent failures. https://reviewboard.mozilla.org/r/211180/#review217084
Attachment #8940916 -
Flags: review?(bvandyk) → review+
Comment 42•6 years ago
|
||
Pushed by jbruaroey@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/ee54a2ff9373 Disable dom/media/test/test_mediarecorder_pause_resume_video.html on linux64 as well for frequent failures. r=bryce
Comment 43•6 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/ee54a2ff9373
Status: NEW → RESOLVED
Closed: 6 years ago
status-firefox59:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla59
Updated•6 years ago
|
Whiteboard: [stockwell needswork] → [stockwell disabled]
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 49•6 years ago
|
||
I've debugged a case of this intermittent, and what happens is that the last (green) frame hits the underrun path in [1]. The only way we can append the same frame again is then to wait 1 second, but we reach EOS before that happens. I need to look at why exactly we hit the underrun path here, and check whether some logic is flawed. [1] https://searchfox.org/mozilla-central/rev/ce9ff94ffed34dc17ec0bfa406156d489eaa8ee1/dom/media/encoder/TrackEncoder.cpp#769
Whiteboard: [stockwell disabled] → [stockwell disabled][Pernosco]
Comment 50•6 years ago
|
||
This issue is still occurring: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-classifiedState=unclassified&fromchange=093b3a6f608998cc9d1fb66d68445bf2e30e84df&autoclassify=&selectedJob=178132384&filter-searchStr=Linux+x64+asan+Mochitests+with+e10s+test-linux64-asan%2Fopt-mochitest-media-e10s-1+M-e10s%28mda1%29 https://treeherder.mozilla.org/logviewer.html#?job_id=178132384&repo=mozilla-inbound
Comment hidden (Intermittent Failures Robot) |
You need to log in
before you can comment on or make changes to this bug.
Description
•