Closed
Bug 1403307
Opened 8 years ago
Closed 8 years ago
Intermittent dom/media/test/test_mediarecorder_pause_resume_video.html | Last frame should be green
Categories
(Core :: Audio/Video: Recording, defect, P2)
Core
Audio/Video: Recording
Tracking
()
RESOLVED
FIXED
mozilla59
Tracking | Status | |
---|---|---|
firefox59 | --- | fixed |
People
(Reporter: intermittent-bug-filer, Assigned: bryce)
References
Details
(Keywords: intermittent-failure, Whiteboard: [stockwell disabled][Pernosco])
Attachments
(1 file)
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 6•8 years ago
|
||
this is failing at a very high failure rate, 135 failures in the last week alone- all on debug, but a variety of platforms.
here is a recent windows10 log:
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-central&job_id=134865118&lineNumber=10087
and related text from the log:
03:24:40 INFO - TEST-PASS | dom/media/test/test_mediarecorder_pause_resume_video.html | Media recorder should be paused
03:24:40 INFO - Drawing color 0,0,255,1
03:24:40 INFO - Buffered messages logged at 03:24:38
03:24:40 INFO - Drawing color 0,255,0,1
03:24:40 INFO - Buffered messages logged at 03:24:39
03:24:40 INFO - Got 'dataavailable' event
03:24:40 INFO - Got 'stop' event
03:24:40 INFO - TEST-PASS | dom/media/test/test_mediarecorder_pause_resume_video.html | Media recorder should be incative after stop
03:24:40 INFO - TEST-PASS | dom/media/test/test_mediarecorder_pause_resume_video.html | Expected 1 dataavailable event
03:24:40 INFO - TEST-PASS | dom/media/test/test_mediarecorder_pause_resume_video.html | Should have gotten a data blob
03:24:40 INFO - TEST-PASS | dom/media/test/test_mediarecorder_pause_resume_video.html | video element should be same size as canvas once metadata is loaded
03:24:40 INFO - Buffered messages finished
03:24:40 INFO - TEST-UNEXPECTED-FAIL | dom/media/test/test_mediarecorder_pause_resume_video.html | Last frame should be green
03:24:40 INFO - startTest/mediaRecorder.onstop/video.onended@dom/media/test/test_mediarecorder_pause_resume_video.html:104:7
03:24:40 INFO - EventHandlerNonNull*startTest/mediaRecorder.onstop@dom/media/test/test_mediarecorder_pause_resume_video.html:103:5
03:24:40 INFO - EventHandlerNonNull*startTest@dom/media/test/test_mediarecorder_pause_resume_video.html:81:3
03:24:40 INFO - @dom/media/test/test_mediarecorder_pause_resume_video.html:125:1
03:24:40 INFO - GECKO(1416) | MEMORY STAT | vsize 1700MB | vsizeMaxContiguous 132540028MB | residentFast 109MB | heapAllocated 21MB
03:24:40 INFO - TEST-OK | dom/media/test/test_mediarecorder_pause_resume_video.html | took 4255ms
:jib, I see you are the triage owner for av:recording, can you help prioritize this bug and get this in the queue for the right person to look at it?
Flags: needinfo?(jib)
Whiteboard: [stockwell needswork]
Comment hidden (Intermittent Failures Robot) |
Pushed by jmaher@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/f8a337cc1e67
Disable dom/media/test/test_mediarecorder_pause_resume_video.html on debug for frequent failures. r=me, a=test-only
Comment 9•8 years ago
|
||
please remember to re-enable this test when working on it.
Keywords: leave-open
Whiteboard: [stockwell needswork] → [stockwell disabled]
![]() |
||
Comment 10•8 years ago
|
||
bugherder |
Comment 11•8 years ago
|
||
Bryce, this is a new test you added in bug 1354457. Please take a look. We should turn off the test temporarily if there's not a quick fix, then we can lower priority if needed.
Assignee: nobody → bvandyk
Rank: 10
Flags: needinfo?(jib) → needinfo?(bvandyk)
Priority: P5 → P1
Assignee | ||
Comment 12•8 years ago
|
||
I suspect this is a regression due to bug 1296531 landing based on the dates involved. I can repro locally, but it's fairly intermittent.
Rough test schedule:
- Draw red
- Record (red)
- Pause
- Draw blue
- Draw green
- Resume (green)
- Stop
The screenshots indicate the last frame of the video is red, which means the recorder hasn't resumed and captured any of the green frames. Don't know why yet, will continue to look.
Flags: needinfo?(bvandyk)
Comment 13•8 years ago
|
||
If you adjust the timing of drawing green vs resume() (i.e., increase the nr of rAFs in between, say 72 instead of 62), does that affect the result?
Assignee | ||
Comment 14•8 years ago
|
||
Hard to be sure due to the intermittence (100s of iterations without failures for values I know can fail), but here are some results we me changing the resume value (while leaving draw green at 60):
62 (default): can repro
63: can repro
64: can repro
65+: haven't been able to repro
If I change the value of when we draw green to 30 or 50 I'm unable to repro at all across a range of resume values.
Due to the intermittence I'd talk these with some big grains of salt, but hopefully it's more useful than no data.
Comment 15•8 years ago
|
||
Clearly there's a larger than anticipated delay between drawing a frame (it should be captured at the end of requestAnimationFrame()), until it gets into MSG (on the next MSG iteration in NotifyPull()) and further all the way through MediaRecorder (the encoder TaskQueue basically).
Some logging that would be useful would be when a new frame arrives to the TaskQueue vs Resume(); when a new frame is handled by the TaskQueue vs Resume(); and when we forward mLastChunk to the encoder on end-of-stream.
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 17•8 years ago
|
||
Could you elaborate on what you think is going on with the pause and resume and the delays. I would have thought if we're seeing delays in delivery of frames then shifting the resume to take place later would not alleviate the issue.
Questions to help clarify my understanding and allow me to look at/add appropriate logs: by the time requestAnimationFrame has run draw(), the color of the canvas will be stable, as will the state of the recorder? How does/should this interact with the MSG and drivers? Do we have any guarantees that drivers don't tick during JS execution (I believe not)?
The MSG performs a pull on the data from the canvas (via NotifyPull)? This data is used by SetCurrentFrames (MSG thread) and sent to AppendVideoSegment (MediaRecorder thread), further down the MSG processing pipeline this data is again processed by NotifyQueuedChanges (MSG thread) and sent to AdvanceCurrentTime (MediaRecorder thread), which we use to null queued data when suspended? As above, is there any sync here between the main thread and these? Does the MSG need to hit mainthread to pull form the canvas?
Chrome seem to do some syncing to make sure their recorder and canvas are synced (this pen is an extreme example: https://codepen.io/SingingTree/full/eGZdQX/). I need to investigate what happens if we're doing drawing with workers. Despite that the regression concerns me, I wonder also if this test is entirely sane given what I've learnt since first writing it.
Comment 18•8 years ago
|
||
When we draw() in the requestAnimationFrame() callback, there's not really an async step until the capture happens. It happens late during the RefreshDriver Tick on main thread (rAF happens earlier).
However, when we capture a frame (let's call this A), we don't immediately append it to the captured track, [1].
It takes until the next MSG iteration's NotifyPull() to append the new frame, [2] (let's call this B).
(note that in the long term this will change to append immediately -- we need to get rid of the notion of a duration from video chunks first, and preferably some locking from AppendToTrack. But doing this should help your codepen with sync.)
When this AppendToTrack() happens, MediaRecorder receives the new frame immediately (since it uses a direct track listener) and dispatches the frame to VideoTrackEncoder::AppendVideoSegment() on the encoder thread, [3] (let's call this C)
As far as races go, a frame takes this path:
A - MainThread
B - MSG thread
C - Encoder thread
While Resume() takes this path:
MediaRecorder::Resume() - MainThread
TrackEncoder::Resume() - Encoder thread
B is the difference here, and B is dependent on audio callbacks for the current output device (when audio is present in the MSG, or on a timer if there's no audio -- we try to keep iterations to 10ms, [4]). Since you managed to get rid of the intermittent behavior by only changing timing of draw() or Resume(), I assume the race is somehow between B and Resume(), and not between B and Stop().
As the next step I'd look for anomalies in the timing between B and Resume() as they come in on the encoder thread (wherever you can repro). If you see a pattern there it should be simple to reproduce in a gtest and if needed debug that with RR.
VideoTrackEncoder::AdvanceCurrentTime might be interesting to log too, since it also reads mSuspended, and it tells you how long the MSG iterations are.
[1] http://searchfox.org/mozilla-central/rev/40b456626e2d0409b7034768b4d9526fc7235ea4/dom/media/CanvasCaptureMediaStream.cpp#208
[2] http://searchfox.org/mozilla-central/rev/40b456626e2d0409b7034768b4d9526fc7235ea4/dom/media/CanvasCaptureMediaStream.cpp#65
[3] http://searchfox.org/mozilla-central/rev/40b456626e2d0409b7034768b4d9526fc7235ea4/dom/media/encoder/TrackEncoder.cpp#453
[4] http://searchfox.org/mozilla-central/rev/40b456626e2d0409b7034768b4d9526fc7235ea4/dom/media/GraphDriver.h#38
Comment hidden (Intermittent Failures Robot) |
Updated•8 years ago
|
Priority: P1 → P2
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
![]() |
||
Comment 30•8 years ago
|
||
All current failures are linux/asan. There was an increase in failure frequency around Dec 18.
Keywords: leave-open
Whiteboard: [stockwell disabled] → [stockwell needswork]
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 34•8 years ago
|
||
There have been 34 failures in the last week.
This is failing on Linux x64 / asan, with a single exception on linux64-stylo-disabled / opt.
Here is a recent log file:
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-central&job_id=153534459&lineNumber=10999
and a snippet with the failure:
[task 2017-12-28T22:41:41.898Z] 22:41:41 INFO - TEST-PASS | dom/media/test/test_mediarecorder_pause_resume_video.html | Media recorder should be incative after stop
10995
[task 2017-12-28T22:41:41.899Z] 22:41:41 INFO - TEST-PASS | dom/media/test/test_mediarecorder_pause_resume_video.html | Expected 1 dataavailable event
10996
[task 2017-12-28T22:41:41.900Z] 22:41:41 INFO - TEST-PASS | dom/media/test/test_mediarecorder_pause_resume_video.html | Should have gotten a data blob
10997
[task 2017-12-28T22:41:41.900Z] 22:41:41 INFO - TEST-PASS | dom/media/test/test_mediarecorder_pause_resume_video.html | video element should be same size as canvas once metadata is loaded
10998
[task 2017-12-28T22:41:41.903Z] 22:41:41 INFO - Buffered messages finished
10999
[task 2017-12-28T22:41:41.904Z] 22:41:41 INFO - TEST-UNEXPECTED-FAIL | dom/media/test/test_mediarecorder_pause_resume_video.html | Last frame should be green
11000
[task 2017-12-28T22:41:41.904Z] 22:41:41 INFO - startTest/mediaRecorder.onstop/video.onended@dom/media/test/test_mediarecorder_pause_resume_video.html:104:7
11001
[task 2017-12-28T22:41:41.904Z] 22:41:41 INFO - EventHandlerNonNull*startTest/mediaRecorder.onstop@dom/media/test/test_mediarecorder_pause_resume_video.html:103:5
11002
[task 2017-12-28T22:41:41.904Z] 22:41:41 INFO - EventHandlerNonNull*startTest@dom/media/test/test_mediarecorder_pause_resume_video.html:81:3
11003
[task 2017-12-28T22:41:41.904Z] 22:41:41 INFO - @dom/media/test/test_mediarecorder_pause_resume_video.html:125:1
11004
[task 2017-12-28T22:41:41.904Z] 22:41:41 INFO - GECKO(1754) | MEMORY STAT | vsize 20973779MB | residentFast 1200MB
11005
[task 2017-12-28T22:41:41.905Z] 22:41:41 INFO - TEST-OK | dom/media/test/test_mediarecorder_pause_resume_video.html | took 4300ms
Comment hidden (Intermittent Failures Robot) |
Comment 36•8 years ago
|
||
This has failed 42 times in the last 7 days. It occurred only on Linux x64 platform affecting asan build type.
Failing test suite: only on opt-mochitest-media-e10s.
Link to a recent log: https://treeherder.mozilla.org/logviewer.html#?repo=autoland&job_id=153630587&lineNumber=11024
Part of that log:
[task 2017-12-30T23:07:28.139Z] 23:07:28 INFO - Got 'dataavailable' event
11018
[task 2017-12-30T23:07:28.139Z] 23:07:28 INFO - Got 'stop' event
11019
[task 2017-12-30T23:07:28.140Z] 23:07:28 INFO - TEST-PASS | dom/media/test/test_mediarecorder_pause_resume_video.html | Media recorder should be incative after stop
11020
[task 2017-12-30T23:07:28.141Z] 23:07:28 INFO - TEST-PASS | dom/media/test/test_mediarecorder_pause_resume_video.html | Expected 1 dataavailable event
11021
[task 2017-12-30T23:07:28.142Z] 23:07:28 INFO - TEST-PASS | dom/media/test/test_mediarecorder_pause_resume_video.html | Should have gotten a data blob
11022
[task 2017-12-30T23:07:28.143Z] 23:07:28 INFO - TEST-PASS | dom/media/test/test_mediarecorder_pause_resume_video.html | video element should be same size as canvas once metadata is loaded
11023
[task 2017-12-30T23:07:28.143Z] 23:07:28 INFO - Buffered messages finished
11024
[task 2017-12-30T23:07:28.143Z] 23:07:28 INFO - TEST-UNEXPECTED-FAIL | dom/media/test/test_mediarecorder_pause_resume_video.html | Last frame should be green
11025
[task 2017-12-30T23:07:28.144Z] 23:07:28 INFO - startTest/mediaRecorder.onstop/video.onended@dom/media/test/test_mediarecorder_pause_resume_video.html:104:7
11026
[task 2017-12-30T23:07:28.145Z] 23:07:28 INFO - EventHandlerNonNull*startTest/mediaRecorder.onstop@dom/media/test/test_mediarecorder_pause_resume_video.html:103:5
11027
[task 2017-12-30T23:07:28.145Z] 23:07:28 INFO - EventHandlerNonNull*startTest@dom/media/test/test_mediarecorder_pause_resume_video.html:81:3
11028
[task 2017-12-30T23:07:28.145Z] 23:07:28 INFO - @dom/media/test/test_mediarecorder_pause_resume_video.html:125:1
11029
[task 2017-12-30T23:07:28.146Z] 23:07:28 INFO - GECKO(1639) | MEMORY STAT | vsize 20973777MB | residentFast 905MB
11030
[task 2017-12-30T23:07:28.146Z] 23:07:28 INFO - TEST-OK | dom/media/test/test_mediarecorder_pause_resume_video.html | took 4301ms
11031
[task 2017-12-30T23:07:28.147Z] 23:07:28 INFO - TEST-START | dom/media/test/test_mediarecorder_playback_can_repeat.html
11032
[task 2017-12-30T23:07:28.296Z] 23:07:28 INFO - GECKO(1639) | [DEBUG SHUTDOWN] Shutdown: decoder=0x61600044d080 state machine=0x61d000633880
11033
[task 2017-12-30T23:07:28.298Z] 23:07:28 INFO - GECKO(1639) | [DEBUG SHUTDOWN] Enter: state machine=0x6020003f6950 reader=0x60d00001cb50
11034
[task 2017-12-30T23:07:28.299Z] 23:07:28 INFO - GECKO(1639) | [DEBUG SHUTDOWN] Shutdown: reader=0x61e000048c80 shutdown demuxer=0x6020005a3150
11035
[task 2017-12-30T23:07:28.300Z] 23:07:28 INFO - GECKO(1639) | [DEBUG SHUTDOWN] Shutdown: pool=0x60400030ab90 count=1
11036
[task 2017-12-30T23:07:28.301Z] 23:07:28 INFO - GECKO(1639) | [DEBUG SHUTDOWN] operator(): pool=0x60400030ab90 shutdown=true count=0
11037
[task 2017-12-30T23:07:28.301Z] 23:07:28 INFO - GECKO(1639) | [DEBUG SHUTDOWN] TearDownDecoders: reader=0x61e000048c80 shut down audio task queue
11038
[task 2017-12-30T23:07:28.302Z] 23:07:28 INFO - GECKO(1639) | [DEBUG SHUTDOWN] TearDownDecoders: reader=0x61e000048c80 shut down video task queue
11039
[task 2017-12-30T23:07:28.303Z] 23:07:28 INFO - GECKO(1639) | [DEBUG SHUTDOWN] FinishShutdown: state machine=0x61d000633880
11040
[task 2017-12-30T23:07:28.304Z] 23:07:28 INFO - GECKO(1639) | [DEBUG SHUTDOWN] Unregister: decoder=0x61600044d080, count=1
11041
[task 2017-12-30T23:07:28.999Z] 23:07:28 INFO - GECKO(1639) | [DEBUG SHUTDOWN] Register: decoder=0x616000501680, count=2
11042
[task 2017-12-30T23:07:30.111Z] 23:07:30 INFO - GECKO(1639) | MEMORY STAT | vsize 20973770MB | residentFast 889MB
11043
[task 2017-12-30T23:07:30.128Z] 23:07:30 INFO - TEST-OK | dom/media/test/test_mediarecorder_playback_can_repeat.html | took 2428ms
11044
[task 2017-12-30T23:07:30.188Z] 23:07:30 INFO - TEST-START | dom/media/test/test_mediarecorder_record_4ch_audiocontext.html
11045
[task 2017-12-30T23:07:30.254Z] 23:07:30 INFO - GECKO(1639) | [DEBUG SHUTDOWN] Shutdown: decoder=0x616000209480 state machine=0x61d000c61680
11046
[task 2017-12-30T23:07:30.255Z] 23:07:30 INFO - GECKO(1639) | [DEBUG SHUTDOWN] Enter: state machine=0x602000613670 reader=0x60d00025b590
11047
[task 2017-12-30T23:07:30.255Z] 23:07:30 INFO - GECKO(1639) | [DEBUG SHUTDOWN] Shutdown: reader=0x61e000085880 shutdown demuxer=0x602000629c90
11048
[task 2017-12-30T23:07:30.256Z] 23:07:30 INFO - GECKO(1639) | [DEBUG SHUTDOWN] Shutdown: pool=0x6040001fc390 count=1
11049
[task 2017-12-30T23:07:30.256Z] 23:07:30 INFO - GECKO(1639) | [DEBUG SHUTDOWN] operator(): pool=0x6040001fc390 shutdown=true count=0
11050
[task 2017-12-30T23:07:30.256Z] 23:07:30 INFO - GECKO(1639) | [DEBUG SHUTDOWN] TearDownDecoders: reader=0x61e000085880 shut down audio task queue
11051
[task 2017-12-30T23:07:30.257Z] 23:07:30 INFO - GECKO(1639) | [DEBUG SHUTDOWN] TearDownDecoders: reader=0x61e000085880 shut down video task queue
11052
[task 2017-12-30T23:07:30.259Z] 23:07:30 INFO - GECKO(1639) | [DEBUG SHUTDOWN] FinishShutdown: state machine=0x61d000c61680
11053
[task 2017-12-30T23:07:30.260Z] 23:07:30 INFO - GECKO(1639) | [DEBUG SHUTDOWN] Unregister: decoder=0x616000209480, count=1
11054
[task 2017-12-30T23:07:31.478Z] 23:07:31 INFO - GECKO(1639) | MEMORY STAT | vsize 20973795MB | residentFast 901MB
11055
[task 2017-12-30T23:07:31.494Z] 23:07:31 INFO - TEST-OK | dom/media/test/test_mediarecorder_record_4ch_audiocontext.html | took 1302ms
11056
[task 2017-12-30T23:07:31.559Z] 23:07:31 INFO - TEST-START | dom/media/test/test_mediarecorder_record_addtracked_stream.html
Whiteboard: [stockwell unknown] → [stockwell needswork]
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (mozreview-request) |
Updated•8 years ago
|
Flags: needinfo?(jib)
Assignee | ||
Comment 41•8 years ago
|
||
mozreview-review |
Comment on attachment 8940916 [details]
Bug 1403307 - Disable dom/media/test/test_mediarecorder_pause_resume_video.html on linux64 as well for frequent failures.
https://reviewboard.mozilla.org/r/211180/#review217084
Attachment #8940916 -
Flags: review?(bvandyk) → review+
Comment 42•8 years ago
|
||
Pushed by jbruaroey@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ee54a2ff9373
Disable dom/media/test/test_mediarecorder_pause_resume_video.html on linux64 as well for frequent failures. r=bryce
Comment 43•8 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox59:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla59
Updated•8 years ago
|
Whiteboard: [stockwell needswork] → [stockwell disabled]
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 49•7 years ago
|
||
I've debugged a case of this intermittent, and what happens is that the last (green) frame hits the underrun path in [1].
The only way we can append the same frame again is then to wait 1 second, but we reach EOS before that happens.
I need to look at why exactly we hit the underrun path here, and check whether some logic is flawed.
[1] https://searchfox.org/mozilla-central/rev/ce9ff94ffed34dc17ec0bfa406156d489eaa8ee1/dom/media/encoder/TrackEncoder.cpp#769
Whiteboard: [stockwell disabled] → [stockwell disabled][Pernosco]
Comment 50•7 years ago
|
||
This issue is still occurring:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-classifiedState=unclassified&fromchange=093b3a6f608998cc9d1fb66d68445bf2e30e84df&autoclassify=&selectedJob=178132384&filter-searchStr=Linux+x64+asan+Mochitests+with+e10s+test-linux64-asan%2Fopt-mochitest-media-e10s-1+M-e10s%28mda1%29
https://treeherder.mozilla.org/logviewer.html#?job_id=178132384&repo=mozilla-inbound
Comment hidden (Intermittent Failures Robot) |
You need to log in
before you can comment on or make changes to this bug.
Description
•