Intermittent dom/media/tests/mochitest/test_peerConnection_stats.html | Error in test execution: Error: Waiting for synced RTCP timed out after at least 15000ms waitForSyncedRtcp@http://mochi.test:8888/tests/dom/media/tests/mochitest/test_peerConnection_
Categories
(Core :: WebRTC: Signaling, defect, P2)
Tracking
()
People
(Reporter: intermittent-bug-filer, Assigned: bwc)
Details
(Keywords: bulk-close-intermittents, intermittent-failure, Whiteboard: [stockwell unknown])
Attachments
(1 file)
Updated•8 years ago
|
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Updated•8 years ago
|
![]() |
||
Comment 4•8 years ago
|
||
Comment hidden (Intermittent Failures Robot) |
Comment 6•8 years ago
|
||
Updated•8 years ago
|
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 11•7 years ago
|
||
Comment 12•7 years ago
|
||
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 31•6 years ago
|
||
There are 21 total failures on linux, osx and windows in the last 7 days.
Recent failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=231760068&repo=autoland&lineNumber=11574
23:09:07 INFO - TEST-FAIL | dom/media/tests/mochitest/test_peerConnection_stats.html | The author of the test has indicated that flaky timeouts are expected. Reason: WebRTC inherently depends on timeouts
23:09:07 INFO - outbound_rtcp_audio_0 is missing roundTripTime: {"id":"outbound_rtcp_audio_0","timestamp":1551740946827,"type":"remote-inbound-rtp","kind":"audio","localId":"outbound_rtp_audio_0","mediaType":"audio","ssrc":848272873,"bytesReceived":176855,"jitter":0,"packetsLost":0,"packetsReceived":1098}
23:09:07 INFO - waitForSyncedRtcp: no synced RTCP on attempt30, retrying.
23:09:07 INFO -
23:09:07 INFO - TEST-FAIL | dom/media/tests/mochitest/test_peerConnection_stats.html | The author of the test has indicated that flaky timeouts are expected. Reason: WebRTC inherently depends on timeouts
23:09:07 INFO - Buffered messages finished
23:09:07 INFO - TEST-UNEXPECTED-FAIL | dom/media/tests/mochitest/test_peerConnection_stats.html | Error in test execution: Error: Waiting for synced RTCP timed out after at least 15000ms waitForSyncedRtcp@http://mochi.test:8888/tests/dom/media/tests/mochitest/pc.js:1674:11 ... asyncPC_LOCAL_TEST_LOCAL_STATS@http://mochi.test:8888/tests/dom/media/tests/mochitest/test_peerConnection_stats.html:782:23 ... execute/</<@http://mochi.test:8888/tests/dom/media/tests/mochitest/head.js:852:31 ... promise callbackexecute/<@http://mochi.test:8888/tests/dom/media/tests/mochitest/head.js:850:19 ... execute@http://mochi.test:8888/tests/dom/media/tests/mochitest/head.js:845:26 ... PeerConnectionTest.prototype.run@http://mochi.test:8888/tests/dom/media/tests/mochitest/pc.js:486:21 ... @http://mochi.test:8888/tests/dom/media/tests/mochitest/test_peerConnection_stats.html:806:8 ... runNetworkTest/<@http://mochi.test:8888/tests/dom/media/tests/mochitest/pc.js:2187:13 ... asyncrunTestWhenReady/<@http://mochi.test:8888/tests/dom/media/tests/mochitest/head.js:473:41 ... promise callbackrunTestWhenReady@http://mochi.test:8888/tests/dom/media/tests/mochitest/head.js:473:25 ... runNetworkTest@http://mochi.test:8888/tests/dom/media/tests/mochitest/pc.js:2169:9 ... async*@http://mochi.test:8888/tests/dom/media/tests/mochitest/test_peerConnection_stats.html:795:1 ...
23:09:07 INFO - SimpleTest.ok@SimpleTest/SimpleTest.js:275:18
23:09:07 INFO - execute/<@dom/media/tests/mochitest/head.js:856:14
23:09:07 INFO - promise callbackexecute@dom/media/tests/mochitest/head.js:855:8
23:09:07 INFO - PeerConnectionTest.prototype.run@dom/media/tests/mochitest/pc.js:486:21
23:09:07 INFO - @dom/media/tests/mochitest/test_peerConnection_stats.html:806:8
23:09:07 INFO - runNetworkTest/<@dom/media/tests/mochitest/pc.js:2187:13
23:09:07 INFO - asyncrunTestWhenReady/<@dom/media/tests/mochitest/head.js:473:41
23:09:07 INFO - promise callbackrunTestWhenReady@dom/media/tests/mochitest/head.js:473:25
23:09:07 INFO - runNetworkTest@dom/media/tests/mochitest/pc.js:2169:9
23:09:07 INFO - async@dom/media/tests/mochitest/test_peerConnection_stats.html:795:1
23:09:07 INFO - TEST-FAIL | dom/media/tests/mochitest/test_peerConnection_stats.html | The author of the test has indicated that flaky timeouts are expected. Reason: WebRTC inherently depends on timeouts
23:09:07 INFO - Closing peer connections
23:09:07 INFO - Waiting for track {ad2e26a6-4860-4abd-a5c6-51086e82a257} (audio) to end.
23:09:07 INFO - TEST-FAIL | dom/media/tests/mochitest/test_peerConnection_stats.html | The author of the test has indicated that flaky timeouts are expected. Reason: WebRTC inherently depends on timeouts
23:09:07 INFO - Waiting for track {25128642-094e-4643-b047-85a54f16bfb2} (video) to end.
23:09:07 INFO - TEST-FAIL | dom/media/tests/mochitest/test_peerConnection_stats.html | The author of the test has indicated that flaky timeouts are expected. Reason: WebRTC inherently depends on timeouts
23:09:07 INFO - TEST-PASS | dom/media/tests/mochitest/test_peerConnection_stats.html | iceConnectionState should not be undefined
23:09:07 INFO - PeerConnectionWrapper (pcLocal): oniceconnectionstatechange fired, new state is: closed
23:09:07 INFO - TEST-PASS | dom/media/tests/mochitest/test_peerConnection_stats.html | PeerConnectionWrapper (pcLocal): legal ICE state transition from connected to closed
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 36•6 years ago
|
||
When playing with these stats locally, the rtt seems to come and go. When it is there, it is in the single digits. This may simply be a case of the RTT being effectively 0 ms, since the tests are running locally over loopback.
@ng, would it make sense to stop eliding this statistic when it is calculated to be 0? Do we need to have a separate test that introduces network delay and verifies that it eventually takes a positive value?
Assignee | ||
Comment 37•6 years ago
|
||
I should also point out that we are legitimately seeing cases where it takes surprisingly long to see RTCP packets. Why would this be?
Comment 38•6 years ago
|
||
(In reply to Byron Campen [:bwc] from comment #36)
When playing with these stats locally, the rtt seems to come and go. When it is there, it is in the single digits. This may simply be a case of the RTT being effectively 0 ms, since the tests are running locally over loopback.
@ng, would it make sense to stop eliding this statistic when it is calculated to be 0? Do we need to have a separate test that introduces network delay and verifies that it eventually takes a positive value?
I think we should open a bug on this behavior.
The elision of received 0s is mandated by the spec: https://w3c.github.io/webrtc-stats/#dom-rtcremoteinboundrtpstreamstats if the DSLR is 0. I a brief look through the jumble of places that libwebrtc calculates and stores RTT doesn't make it immediately apparent as to if a reported rtt of 0 implies that DLSR was exactly zero (less than 1/2^16 of a second). All that said, I have never been fond of the elision, and I think we should report what was reported in the packet. If it is found that the DLSR is often zero in local connections I will open a bug on the spec.
The current tests do wait for RTT to appear at least once (and all appearances should be > 0). https://searchfox.org/mozilla-central/source/dom/media/tests/mochitest/pc.js#1638 I don't think it would be harmful to add a bit of latency to the existing test.
Comment 39•6 years ago
|
||
(In reply to Byron Campen [:bwc] from comment #37)
I should also point out that we are legitimately seeing cases where it takes surprisingly long to see RTCP packets. Why would this be?
What length of time are you seeing? There is logic in libwebrtc that controls the random length of time we wait before sending the first RTCP packets. We have been maintaining a changeset that significantly shortened that time. It is entirely possible that something changed in the latest webrtc.org update that effected this.
Assignee | ||
Comment 40•6 years ago
|
||
In the cases where we see this intermittent failure, it takes about 14 seconds to get audio RTCP stats.
Comment 41•6 years ago
|
||
(In reply to Byron Campen [:bwc] from comment #40)
In the cases where we see this intermittent failure, it takes about 14 seconds to get audio RTCP stats.
That could very well be that we are never getting a non-zero round trip time.
Assignee | ||
Comment 42•6 years ago
|
||
That 14-second delay is not just rtt; no RTCP stats at all come in. When they do come in, there's no rtt.
Comment 43•6 years ago
•
|
||
The RTT could be held up on waiting for an SR. A receiver can't calculate a DSLR (delay since last SR) to put in a RR without first receiving a SR. The AudioConduit and VideoConduit both log when they send and receive RTP and RTCP, can we turn the logging of WebrtcAudioSessionConduit
and WebrtcVideoSessionConduit
up to DEBUG during the tests in question? We may want to elevate the RTCP DEBUG messages all to INFO, they shouldn't be too noisy.
Comment 44•6 years ago
•
|
||
receiver_only_
appears to always be false here [0] for audio. It may be a clue. The configuration object that ultimately determines its value is created here [1]. I am not sure how to read the comment at [0], I seem to remember that there is a trick by which the RTT can be estimated early.
[0] https://searchfox.org/mozilla-central/source/media/webrtc/trunk/webrtc/modules/rtp_rtcp/source/rtcp_receiver.cc#494
[1] https://searchfox.org/mozilla-central/source/media/webrtc/trunk/webrtc/voice_engine/channel.cc#949
Assignee | ||
Comment 45•6 years ago
|
||
Do we want to extend this timeout to 20 or 25 seconds for now, or do we want to try to get to the bottom of why this is taking so much time?
Assignee | ||
Comment 46•6 years ago
|
||
Assignee | ||
Comment 47•6 years ago
|
||
Comment 48•6 years ago
|
||
(In reply to Byron Campen [:bwc] from comment #45)
Do we want to extend this timeout to 20 or 25 seconds for now, or do we want to try to get to the bottom of why this is taking so much time?
I think this is worth digging into, extending the timeout wouldn't hurt though.
Comment 49•6 years ago
•
|
||
It looks like in bug 1376873 the audio and video conduits were changed to pull the round trip time off of the Call interface. This RTT is a "call wide" metric. I am sure this was done to eliminate a patch that allowed us to get the RTT from within the RTP guts of libwebrtc. This change was also tried in the previous libwebrtc update and the patch had to be restored. I have no idea as to if it worked this time.
Assignee | ||
Comment 50•6 years ago
|
||
test-verify jobs for this test look ok across the board, and I'm doing a bunch of retriggers on win32 debug. It is possible that this doesn't happen unless more of the test suite runs, but it looks promising.
Assignee | ||
Updated•6 years ago
|
Comment 51•6 years ago
|
||
Comment 52•6 years ago
|
||
bugherder |
Comment hidden (Intermittent Failures Robot) |
Comment 54•6 years ago
|
||
bugherder uplift |
Updated•5 years ago
|
Description
•