Open Bug 1407653 Opened 2 years ago Updated 10 days ago

Intermittent TEST-UNEXPECTED-TIMEOUT | dom/media/tests/mochitest/test_peerConnection_videoCodecs.html | application timed out after 330/370 seconds with no output

Categories

(Core :: WebRTC, defect, P5)

defect

Tracking

()

People

(Reporter: intermittent-bug-filer, Assigned: jesup, NeedInfo)

References

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell unknown])

Attachments

(1 file)

this is failing primarily on linux64-debug, but as you can see many other linux platforms- heavily weighted on debug.

here is a recent log:
https://treeherder.mozilla.org/logviewer.html#?repo=autoland&job_id=138271066&lineNumber=11418

and a screenshot:
https://public-artifacts.taskcluster.net/JlQvxTb-Sze-DwwcmV5Vow/0/public/test_info//mozilla-test-fail-screenshot_4iggZO.png

^ interesting tab crashed!


and here is related info from the log (including the first few frames of the crash):
[task 2017-10-19T17:56:28.390Z] 17:56:28     INFO - TEST-PASS | dom/media/tests/mochitest/test_peerConnection_videoCodecs.html | PeerConnectionWrapper (pcRemote) stats inbound video rtp track id {13cd6b44-e78a-4b27-9ba5-ecc37f030f36} - found expected stats 
[task 2017-10-19T17:56:28.392Z] 17:56:28     INFO - TEST-PASS | dom/media/tests/mochitest/test_peerConnection_videoCodecs.html | PeerConnectionWrapper (pcRemote) stats inbound video rtp track id {13cd6b44-e78a-4b27-9ba5-ecc37f030f36} - did not find extra stats with wrong direction 
[task 2017-10-19T17:56:28.392Z] 17:56:28     INFO - TEST-PASS | dom/media/tests/mochitest/test_peerConnection_videoCodecs.html | PeerConnectionWrapper (pcRemote) stats inbound video rtp track id {13cd6b44-e78a-4b27-9ba5-ecc37f030f36} - did not find extra stats with wrong media type 
[task 2017-10-19T17:56:28.393Z] 17:56:28     INFO - Run step 54: PC_LOCAL_VERIFY_SDP_AFTER_END_OF_TRICKLE
[task 2017-10-19T17:56:28.395Z] 17:56:28     INFO - EOC-SDP: {"type":"offer","sdp":"v=0\r\no=mozilla...THIS_IS_SDPARTA-58.0a1 5880292688892352699 0 IN IP4 0.0.0.0\r\ns=-\r\nt=0 0\r\na=sendrecv\r\na=fingerprint:sha-256 09:36:16:BA:F8:FC:4B:A9:CF:CD:23:BA:96:87:0B:39:93:87:00:4D:9A:85:52:51:0C:45:83:81:32:65:74:9C\r\na=group:BUNDLE sdparta_0\r\na=ice-options:trickle\r\na=msid-semantic:WMS *\r\nm=video 44964 UDP/TLS/RTP/SAVPF 97\r\nc=IN IP4 172.17.0.4\r\na=candidate:0 1 UDP 2122252543 172.17.0.4 44964 typ host\r\na=candidate:1 1 TCP 2105524479 172.17.0.4 9 typ host tcptype active\r\na=candidate:0 2 UDP 2122252542 172.17.0.4 47780 typ host\r\na=candidate:1 2 TCP 2105524478 172.17.0.4 9 typ host tcptype active\r\na=sendrecv\r\na=end-of-candidates\r\na=extmap:1 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time\r\na=extmap:2 urn:ietf:params:rtp-hdrext:toffset\r\na=extmap:3 urn:ietf:params:rtp-hdrext:sdes:mid\r\na=fmtp:126 profile-level-id=42e01f;level-asymmetry-allowed=1;packetization-mode=1\r\na=fmtp:97 profile-level-id=42e01f;level-asymmetry-allowed=1\r\na=fmtp:120 max-fs=12288;max-fr=60\r\na=fmtp:121 max-fs=12288;max-fr=60\r\na=ice-pwd:6ae429a0fd8f268fb0763975b880ca06\r\na=ice-ufrag:26cfbc98\r\na=mid:sdparta_0\r\na=msid:{63f9eb01-6490-4743-a304-2c8348866d31} {13cd6b44-e78a-4b27-9ba5-ecc37f030f36}\r\na=rtcp-fb:120 nack\r\na=rtcp-fb:120 nack pli\r\na=rtcp-fb:120 ccm fir\r\na=rtcp-fb:120 goog-remb\r\na=rtcp-fb:121 nack\r\na=rtcp-fb:121 nack pli\r\na=rtcp-fb:121 ccm fir\r\na=rtcp-fb:121 goog-remb\r\na=rtcp-fb:126 nack\r\na=rtcp-fb:126 nack pli\r\na=rtcp-fb:126 ccm fir\r\na=rtcp-fb:126 goog-remb\r\na=rtcp-fb:97 nack\r\na=rtcp-fb:97 nack pli\r\na=rtcp-fb:97 ccm fir\r\na=rtcp-fb:97 goog-remb\r\na=rtcp-mux\r\na=rtpmap:120 VP8/90000\r\na=rtpmap:121 VP9/90000\r\na=rtpmap:126 H264/90000\r\na=rtpmap:97 H264/90000\r\na=setup:actpass\r\na=ssrc:38654134 cname:{f67a03ca-b911-4161-b66c-79270e8a467e}\r\n"}
[task 2017-10-19T17:56:28.396Z] 17:56:28     INFO - TEST-PASS | dom/media/tests/mochitest/test_peerConnection_videoCodecs.html | pcLocal: SDP contains end-of-candidates 
[task 2017-10-19T17:56:28.397Z] 17:56:28     INFO - CLINE-NO-DEFAULT-ADDR-SDP: "v=0\r\no=mozilla...THIS_IS_SDPARTA-58.0a1 5880292688892352699 0 IN IP4 0.0.0.0\r\ns=-\r\nt=0 0\r\na=sendrecv\r\na=fingerprint:sha-256 09:36:16:BA:F8:FC:4B:A9:CF:CD:23:BA:96:87:0B:39:93:87:00:4D:9A:85:52:51:0C:45:83:81:32:65:74:9C\r\na=group:BUNDLE sdparta_0\r\na=ice-options:trickle\r\na=msid-semantic:WMS *\r\nm=video 44964 UDP/TLS/RTP/SAVPF 97\r\nc=IN IP4 172.17.0.4\r\na=candidate:0 1 UDP 2122252543 172.17.0.4 44964 typ host\r\na=candidate:1 1 TCP 2105524479 172.17.0.4 9 typ host tcptype active\r\na=candidate:0 2 UDP 2122252542 172.17.0.4 47780 typ host\r\na=candidate:1 2 TCP 2105524478 172.17.0.4 9 typ host tcptype active\r\na=sendrecv\r\na=end-of-candidates\r\na=extmap:1 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time\r\na=extmap:2 urn:ietf:params:rtp-hdrext:toffset\r\na=extmap:3 urn:ietf:params:rtp-hdrext:sdes:mid\r\na=fmtp:126 profile-level-id=42e01f;level-asymmetry-allowed=1;packetization-mode=1\r\na=fmtp:97 profile-level-id=42e01f;level-asymmetry-allowed=1\r\na=fmtp:120 max-fs=12288;max-fr=60\r\na=fmtp:121 max-fs=12288;max-fr=60\r\na=ice-pwd:6ae429a0fd8f268fb0763975b880ca06\r\na=ice-ufrag:26cfbc98\r\na=mid:sdparta_0\r\na=msid:{63f9eb01-6490-4743-a304-2c8348866d31} {13cd6b44-e78a-4b27-9ba5-ecc37f030f36}\r\na=rtcp-fb:120 nack\r\na=rtcp-fb:120 nack pli\r\na=rtcp-fb:120 ccm fir\r\na=rtcp-fb:120 goog-remb\r\na=rtcp-fb:121 nack\r\na=rtcp-fb:121 nack pli\r\na=rtcp-fb:121 ccm fir\r\na=rtcp-fb:121 goog-remb\r\na=rtcp-fb:126 nack\r\na=rtcp-fb:126 nack pli\r\na=rtcp-fb:126 ccm fir\r\na=rtcp-fb:126 goog-remb\r\na=rtcp-fb:97 nack\r\na=rtcp-fb:97 nack pli\r\na=rtcp-fb:97 ccm fir\r\na=rtcp-fb:97 goog-remb\r\na=rtcp-mux\r\na=rtpmap:120 VP8/90000\r\na=rtpmap:121 VP9/90000\r\na=rtpmap:126 H264/90000\r\na=rtpmap:97 H264/90000\r\na=setup:actpass\r\na=ssrc:38654134 cname:{f67a03ca-b911-4161-b66c-79270e8a467e}\r\n"
[task 2017-10-19T17:56:28.398Z] 17:56:28     INFO - TEST-PASS | dom/media/tests/mochitest/test_peerConnection_videoCodecs.html | pcLocal: SDP contains non-zero IP c line 
[task 2017-10-19T17:56:28.399Z] 17:56:28     INFO - TEST-PASS | dom/media/tests/mochitest/test_peerConnection_videoCodecs.html | pcLocal: SDP contains rtcp-mux 
[task 2017-10-19T17:56:28.400Z] 17:56:28     INFO - TEST-PASS | dom/media/tests/mochitest/test_peerConnection_videoCodecs.html | pcLocal: SDP contains a=ssrc 
[task 2017-10-19T17:56:28.400Z] 17:56:28     INFO - Run step 55: PC_REMOTE_VERIFY_SDP_AFTER_END_OF_TRICKLE
[task 2017-10-19T17:56:28.402Z] 17:56:28     INFO - EOC-SDP: {"type":"answer","sdp":"v=0\r\no=mozilla...THIS_IS_SDPARTA-58.0a1 150700722668149139 0 IN IP4 0.0.0.0\r\ns=-\r\nt=0 0\r\na=sendrecv\r\na=fingerprint:sha-256 5A:8D:E3:C2:A6:5A:40:55:A1:E8:68:A7:C9:19:DA:93:65:92:9E:44:C3:20:DC:F3:CA:6E:10:92:35:CA:E8:80\r\na=group:BUNDLE sdparta_0\r\na=ice-options:trickle\r\na=msid-semantic:WMS *\r\nm=video 34529 UDP/TLS/RTP/SAVPF 97\r\nc=IN IP4 172.17.0.4\r\na=candidate:0 1 UDP 2122252543 172.17.0.4 34529 typ host\r\na=candidate:1 1 TCP 2105524479 172.17.0.4 9 typ host tcptype active\r\na=recvonly\r\na=end-of-candidates\r\na=extmap:1 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time\r\na=extmap:2 urn:ietf:params:rtp-hdrext:toffset\r\na=extmap:3 urn:ietf:params:rtp-hdrext:sdes:mid\r\na=fmtp:97 profile-level-id=42e01f;level-asymmetry-allowed=1\r\na=ice-pwd:cf326303cec86faaf2692da4cea261fb\r\na=ice-ufrag:37e4dc7b\r\na=mid:sdparta_0\r\na=rtcp-fb:97 nack\r\na=rtcp-fb:97 nack pli\r\na=rtcp-fb:97 ccm fir\r\na=rtcp-fb:97 goog-remb\r\na=rtcp-mux\r\na=rtpmap:97 H264/90000\r\na=setup:active\r\na=ssrc:3628860352 cname:{196a0184-f71b-4cf4-9082-becbe818533e}\r\n"}
[task 2017-10-19T17:56:28.403Z] 17:56:28     INFO - TEST-PASS | dom/media/tests/mochitest/test_peerConnection_videoCodecs.html | pcRemote: SDP contains end-of-candidates 
[task 2017-10-19T17:56:28.404Z] 17:56:28     INFO - CLINE-NO-DEFAULT-ADDR-SDP: "v=0\r\no=mozilla...THIS_IS_SDPARTA-58.0a1 150700722668149139 0 IN IP4 0.0.0.0\r\ns=-\r\nt=0 0\r\na=sendrecv\r\na=fingerprint:sha-256 5A:8D:E3:C2:A6:5A:40:55:A1:E8:68:A7:C9:19:DA:93:65:92:9E:44:C3:20:DC:F3:CA:6E:10:92:35:CA:E8:80\r\na=group:BUNDLE sdparta_0\r\na=ice-options:trickle\r\na=msid-semantic:WMS *\r\nm=video 34529 UDP/TLS/RTP/SAVPF 97\r\nc=IN IP4 172.17.0.4\r\na=candidate:0 1 UDP 2122252543 172.17.0.4 34529 typ host\r\na=candidate:1 1 TCP 2105524479 172.17.0.4 9 typ host tcptype active\r\na=recvonly\r\na=end-of-candidates\r\na=extmap:1 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time\r\na=extmap:2 urn:ietf:params:rtp-hdrext:toffset\r\na=extmap:3 urn:ietf:params:rtp-hdrext:sdes:mid\r\na=fmtp:97 profile-level-id=42e01f;level-asymmetry-allowed=1\r\na=ice-pwd:cf326303cec86faaf2692da4cea261fb\r\na=ice-ufrag:37e4dc7b\r\na=mid:sdparta_0\r\na=rtcp-fb:97 nack\r\na=rtcp-fb:97 nack pli\r\na=rtcp-fb:97 ccm fir\r\na=rtcp-fb:97 goog-remb\r\na=rtcp-mux\r\na=rtpmap:97 H264/90000\r\na=setup:active\r\na=ssrc:3628860352 cname:{196a0184-f71b-4cf4-9082-becbe818533e}\r\n"
[task 2017-10-19T17:56:28.405Z] 17:56:28     INFO - TEST-PASS | dom/media/tests/mochitest/test_peerConnection_videoCodecs.html | pcRemote: SDP contains non-zero IP c line 
[task 2017-10-19T17:56:28.406Z] 17:56:28     INFO - TEST-PASS | dom/media/tests/mochitest/test_peerConnection_videoCodecs.html | pcRemote: SDP contains rtcp-mux 
[task 2017-10-19T17:56:28.407Z] 17:56:28     INFO - TEST-PASS | dom/media/tests/mochitest/test_peerConnection_videoCodecs.html | pcRemote: SDP contains a=ssrc 
[task 2017-10-19T17:56:28.407Z] 17:56:28     INFO - Run step 56: CHECK_VIDEO_FLOW
[task 2017-10-19T17:56:28.408Z] 17:56:28     INFO - Checking that we have a frame, got [0,255,0,255]. Ref=[0,0,0,255]. Threshold=16. Pass=true
[task 2017-10-19T17:56:28.409Z] 17:56:28     INFO - Buffered messages finished
[task 2017-10-19T17:56:28.410Z] 17:56:28    ERROR - TEST-UNEXPECTED-TIMEOUT | dom/media/tests/mochitest/test_peerConnection_videoCodecs.html | application timed out after 330 seconds with no output
[task 2017-10-19T17:56:28.411Z] 17:56:28    ERROR - Force-terminating active process(es).
[task 2017-10-19T17:56:28.412Z] 17:56:28     INFO - Determining child pids from psutil
[task 2017-10-19T17:56:28.412Z] 17:56:28     INFO - Found child pids: [1204]
[task 2017-10-19T17:56:28.413Z] 17:56:28     INFO - Killing process: 1204
[task 2017-10-19T17:56:28.414Z] 17:56:28     INFO - TEST-INFO | started process screentopng
[task 2017-10-19T17:56:28.478Z] 17:56:28     INFO - TEST-INFO | screentopng: exit 0
[task 2017-10-19T17:56:28.698Z] 17:56:28     INFO - psutil found pid 1204 dead
[task 2017-10-19T17:56:28.698Z] 17:56:28     INFO - Killing process: 1101
[task 2017-10-19T17:56:28.699Z] 17:56:28     INFO - Not taking screenshot here: see the one that was previously logged
[task 2017-10-19T17:56:28.858Z] 17:56:28     INFO - TEST-INFO | Main app process: exit 6
[task 2017-10-19T17:56:28.858Z] 17:56:28     INFO - Buffered messages finished
[task 2017-10-19T17:56:28.858Z] 17:56:28    ERROR - TEST-UNEXPECTED-FAIL | dom/media/tests/mochitest/test_peerConnection_videoCodecs.html | application terminated with exit code 6
[task 2017-10-19T17:56:28.860Z] 17:56:28     INFO - runtests.py | Application ran for: 0:16:05.488937
[task 2017-10-19T17:56:28.861Z] 17:56:28     INFO - zombiecheck | Reading PID log: /tmp/tmp4JZeYbpidlog
[task 2017-10-19T17:56:28.862Z] 17:56:28     INFO - ==> process 1101 launched child process 1123
[task 2017-10-19T17:56:28.864Z] 17:56:28     INFO - ==> process 1101 launched child process 1152
[task 2017-10-19T17:56:28.864Z] 17:56:28     INFO - ==> process 1101 launched child process 1204
[task 2017-10-19T17:56:28.865Z] 17:56:28     INFO - ==> process 1101 launched child process 1729
[task 2017-10-19T17:56:28.866Z] 17:56:28     INFO - ==> process 1101 launched child process 1745
[task 2017-10-19T17:56:28.866Z] 17:56:28     INFO - ==> process 1101 launched child process 1749
[task 2017-10-19T17:56:28.867Z] 17:56:28     INFO - ==> process 1101 launched child process 1767
[task 2017-10-19T17:56:28.867Z] 17:56:28     INFO - zombiecheck | Checking for orphan process with PID: 1123
[task 2017-10-19T17:56:28.868Z] 17:56:28     INFO - zombiecheck | Checking for orphan process with PID: 1152
[task 2017-10-19T17:56:28.869Z] 17:56:28     INFO - zombiecheck | Checking for orphan process with PID: 1204
[task 2017-10-19T17:56:28.870Z] 17:56:28     INFO - zombiecheck | Checking for orphan process with PID: 1729
[task 2017-10-19T17:56:28.871Z] 17:56:28     INFO - zombiecheck | Checking for orphan process with PID: 1745
[task 2017-10-19T17:56:28.872Z] 17:56:28     INFO - zombiecheck | Checking for orphan process with PID: 1749
[task 2017-10-19T17:56:28.873Z] 17:56:28     INFO - zombiecheck | Checking for orphan process with PID: 1767
[task 2017-10-19T17:56:28.874Z] 17:56:28     INFO - mozcrash Copy/paste: /usr/local/bin/linux64-minidump_stackwalk /tmp/tmpgA6udI.mozrunner/minidumps/54037fe2-5c32-f354-6112-9b0913652a10.dmp /builds/worker/workspace/build/symbols
[task 2017-10-19T17:56:38.332Z] 17:56:38     INFO - mozcrash Saved minidump as /builds/worker/workspace/build/blobber_upload_dir/54037fe2-5c32-f354-6112-9b0913652a10.dmp
[task 2017-10-19T17:56:38.333Z] 17:56:38     INFO - mozcrash Saved app info as /builds/worker/workspace/build/blobber_upload_dir/54037fe2-5c32-f354-6112-9b0913652a10.extra
[task 2017-10-19T17:56:38.396Z] 17:56:38     INFO - PROCESS-CRASH | dom/media/tests/mochitest/test_peerConnection_videoCodecs.html | application crashed [@ nsDocumentViewer::AddRef]
[task 2017-10-19T17:56:38.397Z] 17:56:38     INFO - Crash dump filename: /tmp/tmpgA6udI.mozrunner/minidumps/54037fe2-5c32-f354-6112-9b0913652a10.dmp
[task 2017-10-19T17:56:38.397Z] 17:56:38     INFO - Operating system: Linux
[task 2017-10-19T17:56:38.398Z] 17:56:38     INFO -                   0.0.0 Linux 3.13.0-112-generic #159-Ubuntu SMP Fri Mar 3 15:26:07 UTC 2017 x86_64
[task 2017-10-19T17:56:38.399Z] 17:56:38     INFO - CPU: amd64
[task 2017-10-19T17:56:38.400Z] 17:56:38     INFO -      family 6 model 62 stepping 4
[task 2017-10-19T17:56:38.401Z] 17:56:38     INFO -      2 CPUs
[task 2017-10-19T17:56:38.402Z] 17:56:38     INFO - 
[task 2017-10-19T17:56:38.402Z] 17:56:38     INFO - GPU: UNKNOWN
[task 2017-10-19T17:56:38.403Z] 17:56:38     INFO - 
[task 2017-10-19T17:56:38.404Z] 17:56:38     INFO - Crash reason:  SIGABRT
[task 2017-10-19T17:56:38.404Z] 17:56:38     INFO - Crash address: 0x3e8000003e1
[task 2017-10-19T17:56:38.405Z] 17:56:38     INFO - Process uptime: not available
[task 2017-10-19T17:56:38.406Z] 17:56:38     INFO - 
[task 2017-10-19T17:56:38.406Z] 17:56:38     INFO - Thread 0 (crashed)
[task 2017-10-19T17:56:38.407Z] 17:56:38     INFO -  0  libxul.so!nsDocumentViewer::AddRef [nsISupportsImpl.h:543112d83365 : 311 + 0x4]
[task 2017-10-19T17:56:38.408Z] 17:56:38     INFO -     rax = 0x0000000000000001   rdx = 0x00007f45c8c23564
[task 2017-10-19T17:56:38.409Z] 17:56:38     INFO -     rcx = 0x0000000000000110   rbx = 0x0000000000000002
[task 2017-10-19T17:56:38.409Z] 17:56:38     INFO -     rsi = 0x0000000000000001   rdi = 0x00007f45ab938040
[task 2017-10-19T17:56:38.410Z] 17:56:38     INFO -     rbp = 0x00007ffd90fb3c70   rsp = 0x00007ffd90fb3c60
[task 2017-10-19T17:56:38.411Z] 17:56:38     INFO -      r8 = 0x00000000c61eac14    r9 = 0x0000000044815f7a
[task 2017-10-19T17:56:38.412Z] 17:56:38     INFO -     r10 = 0x0000000000000000   r11 = 0x00007f45d5c88f50
[task 2017-10-19T17:56:38.412Z] 17:56:38     INFO -     r12 = 0x00007f45ab938040   r13 = 0x00007f45ad04e9a0
[task 2017-10-19T17:56:38.413Z] 17:56:38     INFO -     r14 = 0x0000000000000000   r15 = 0x00007f45c6e2e0e0
[task 2017-10-19T17:56:38.414Z] 17:56:38     INFO -     rip = 0x00007f45c638266d
[task 2017-10-19T17:56:38.414Z] 17:56:38     INFO -     Found by: given as instruction pointer in context
[task 2017-10-19T17:56:38.415Z] 17:56:38     INFO -  1  libxul.so!nsDocShell::GetContentViewer [nsISupportsUtils.h:543112d83365 : 42 + 0x6]
[task 2017-10-19T17:56:38.416Z] 17:56:38     INFO -     rbx = 0x0000000000000000   rbp = 0x00007ffd90fb3c80
[task 2017-10-19T17:56:38.416Z] 17:56:38     INFO -     rsp = 0x00007ffd90fb3c80   r12 = 0x00007ffd90fb3cf8
[task 2017-10-19T17:56:38.417Z] 17:56:38     INFO -     r13 = 0x00007f45ad04e9a0   r14 = 0x0000000000000000
[task 2017-10-19T17:56:38.418Z] 17:56:38     INFO -     r15 = 0x00007f45c6e2e0e0   rip = 0x00007f45c6e2e0dc
[task 2017-10-19T17:56:38.419Z] 17:56:38     INFO -     Found by: call frame info
[task 2017-10-19T17:56:38.419Z] 17:56:38     INFO -  2  libxul.so!MarkDocShell [nsCCUncollectableMarker.cpp:543112d83365 : 276 + 0xe]
[task 2017-10-19T17:56:38.420Z] 17:56:38     INFO -     rbx = 0x0000000000000000   rbp = 0x00007ffd90fb3d30
[task 2017-10-19T17:56:38.421Z] 17:56:38     INFO -     rsp = 0x00007ffd90fb3c90   r12 = 0x00007ffd90fb3cf8
[task 2017-10-19T17:56:38.421Z] 17:56:38     INFO -     r13 = 0x00007f45ad04e9a0   r14 = 0x0000000000000000
[task 2017-10-19T17:56:38.422Z] 17:56:38     INFO -     r15 = 0x00007f45c6e2e0e0   rip = 0x00007f45c51d4aa2
[task 2017-10-19T17:56:38.423Z] 17:56:38     INFO -     Found by: call frame info
[task 2017-10-19T17:56:38.424Z] 17:56:38     INFO -  3  libxul.so!MarkWindowList [nsCCUncollectableMarker.cpp:543112d83365 : 312 + 0x11]
[task 2017-10-19T17:56:38.424Z] 17:56:38     INFO -     rbx = 0x00007ffd90fb3d98   rbp = 0x00007ffd90fb3dd0
[task 2017-10-19T17:56:38.425Z] 17:56:38     INFO -     rsp = 0x00007ffd90fb3d40   r12 = 0x00007ffd90fb3d80
[task 2017-10-19T17:56:38.426Z] 17:56:38     INFO -     r13 = 0x00007f45ad04e9a0   r14 = 0x00007ffd90fb3e08
[task 2017-10-19T17:56:38.426Z] 17:56:38     INFO -     r15 = 0x00007f459b62fbc0   rip = 0x00007f45c51d53d2
[task 2017-10-19T17:56:38.427Z] 17:56:38     INFO -     Found by: call frame info



:drno, is this something you could help out with?
Flags: needinfo?(drno)
Whiteboard: [stockwell needswork]
Without having digged too deep I'm pretty sure the problem here is that we actually don't have the OpenH264 extension available in automation. Therefore this test can't really succeed. Not sure I understand how this could ever have passed.
Depends on: 1395853
Flags: needinfo?(drno)
There's something wonky going on here. We're crashing on 0x3e8000003e1 when trying to increment a refcount (not an aligned address, so something overwrote the memory the pointer is stored in, possibly a UAF). Are we poisoning on this platform?
Looks like your test uncovered something nasty!
Flags: needinfo?(apehrson)
We do have a fake openh264 codec in automation; it returns an average of all the pixels in the image.  Very high compression.  :-)

>https://public-artifacts.taskcluster.net/bMoAVzOTRiCqLLIcaqIQfQ/0/public/logs/live_backing.log
> 
>https://public-artifacts.taskcluster.net/RbX1bX9UQaWKfPyG0HNlyA/0/public/logs/live_backing.log

I see timeouts there, the stacks therefor probably aren't interesting.

And the original report is also a timeout; so the abort it forced is not really a crash on a bad address.
Is treeherder filing intermittents as P5's??  that might stop triagers from looking at them.
(In reply to Randell Jesup [:jesup] from comment #8)
> I see timeouts there, the stacks therefor probably aren't interesting.
> 
> And the original report is also a timeout; so the abort it forced is not
> really a crash on a bad address.

That assumes the automation properly detects GMP plugin crashes. I'm not so optimistic that is always the case.
Here is the timeout Randel is referring to:

[task 2017-10-23T21:08:51.148Z] 21:08:51     INFO - Run step 56: CHECK_VIDEO_FLOW
[task 2017-10-23T21:08:51.149Z] 21:08:51     INFO - Checking that we have a frame, got [0,255,0,255]. Ref=[0,0,0,255]. Threshold=16. Pass=true
[task 2017-10-23T21:08:51.150Z] 21:08:51     INFO - Checking playing, [0,255,0,255] vs [0,255,0,255]. Threshold=16 Pass=false
[task 2017-10-23T21:08:51.151Z] 21:08:51     INFO - Buffered messages finished
[task 2017-10-23T21:08:51.152Z] 21:08:51    ERROR - TEST-UNEXPECTED-TIMEOUT | dom/media/tests/mochitest/test_peerConnection_videoCodecs.html | application timed out after 330 seconds with no output
[task 2017-10-23T21:08:51.153Z] 21:08:51    ERROR - Force-terminating active process(es).
[task 2017-10-23T21:08:51.154Z] 21:08:51     INFO - Determining child pids from psutil
[task 2017-10-23T21:08:51.155Z] 21:08:51     INFO - Found child pids: [1279]
[task 2017-10-23T21:08:51.156Z] 21:08:51     INFO - Killing process: 1279

That is the pattern I noticed before already.

And here is the stack trace from the crash which happened several minutes before:

[task 2017-10-23T20:56:46.578Z] 20:56:46     INFO - GECKO(1182) | [Socket Thread]: I/mtransport Flow[a81a2ef89831cf67:0,rtp(none)]; Layer[dtls]: ****** SSL handshake completed ******
[task 2017-10-23T20:56:46.579Z] 20:56:46     INFO - GECKO(1182) | [Socket Thread]: I/mtransport Flow[a81a2ef89831cf67:0,rtp(none)]; Layer[dtls]: Selected ALPN string: webrtc
[task 2017-10-23T20:56:46.581Z] 20:56:46     INFO - GECKO(1182) | [Socket Thread]: I/signaling [Socket Thread|MediaPipeline] MediaPipeline.cpp:779: Flow is ready
[task 2017-10-23T20:56:46.583Z] 20:56:46     INFO - GECKO(1182) | [Socket Thread]: I/signaling [Socket Thread|MediaPipeline] MediaPipeline.cpp:817: Transport ready for pipeline 0xdff43090 flow a81a2ef89831cf67| Receive video[{66990ae8-903c-4d0d-8416-ef1b4787a162}]: RTP/RTCP mux
[task 2017-10-23T20:56:46.585Z] 20:56:46     INFO - GECKO(1182) | [Socket Thread]: I/signaling [Socket Thread|MediaPipeline] MediaPipeline.cpp:891: Listening for RTP/RTCP mux packets received on 0xdf83e320
[task 2017-10-23T20:56:46.652Z] 20:56:46     INFO - GECKO(1182) | ### XPCOM_MEM_BLOAT_LOG defined -- logging bloat/leaks to /tmp/tmpdZ74Cf.mozrunner/runtests_leaks_geckomediaplugin_pid1905.log
[task 2017-10-23T20:56:46.656Z] 20:56:46     INFO - GECKO(1182) | [GMP 1905, Main Thread] WARNING: NS_ENSURE_TRUE(InitStaticMembers()) failed: file /builds/worker/workspace/build/src/modules/libpref/Preferences.cpp, line 4903
[task 2017-10-23T20:56:47.831Z] 20:56:47     INFO - GECKO(1182) | #
[task 2017-10-23T20:56:47.831Z] 20:56:47     INFO - GECKO(1182) | # Fatal error in /builds/worker/workspace/build/src/media/webrtc/trunk/webrtc/modules/rtp_rtcp/source/rtp_format_h264.cc, line 250
[task 2017-10-23T20:56:47.831Z] 20:56:47     INFO - GECKO(1182) | # last system error: 0
[task 2017-10-23T20:56:47.832Z] 20:56:47     INFO - GECKO(1182) | # Check failed: payload_size_left >= fragment->length (1164 vs. 1694)
[task 2017-10-23T20:56:47.833Z] 20:56:47     INFO - GECKO(1182) | # Payload size left 1164, fragment length 1694, packetization mode SingleNalUnit
[task 2017-10-23T20:56:47.835Z] 20:56:47     INFO - GECKO(1182) | #
[task 2017-10-23T20:56:47.837Z] 20:56:47     INFO - GECKO(1182) | ==== C stack trace ===============================
aha... looking at the test - I should note that the fake H264 codec is VERY fake.  This is showing Single NAL Unit packetization (RFC 6184 6.2) aka Mode 0.  The fake codec was originally used in mode 1 (our preference, and the preference of almost every service), and has never been tested in mode 0.  Probably the "BIG_FRAME" code is messing up by generating "NAL"s that are larger than can be packetized in a single packet, which is how Single NAL Unit works (configure for max NAL size of less than what fits in a packet).

See dom/media/gmp-plugin-openh264/gmp-fake-openh264.cpp
Should we stop testing mode 0 with the fake codec?

What would it take to get the real codec in automation? (if you have thought about this)
Flags: needinfo?(apehrson) → needinfo?(rjesup)
I looked at one occasion, [1]. This is not a timeout per se (test starts at 15:53:09 and last js log was buffered at 15:53:21). It looks like the content process crash goes undetected and the test runner times out.

Stack looks reasonable (same assert as Byron saw in comment 6):
> Thread 25 (crashed)
>  0  firefox!mozalloc_abort [mozalloc_abort.cpp:ea486c5d3380 : 33 + 0x5]
>  1  firefox!abort [mozalloc_abort.cpp:ea486c5d3380 : 80 + 0x5]
>  2  libxul.so!rtc::FatalMessage::~FatalMessage [checks.cc:ea486c5d3380 : 109 + 0x5]
>  3  libxul.so!webrtc::RtpPacketizerH264::PacketizeSingleNalu [rtp_format_h264.cc:ea486c5d3380 : 253 + 0x1f]
>  4  libxul.so!webrtc::RtpPacketizerH264::GeneratePackets [rtp_format_h264.cc:ea486c5d3380 : 173 + 0xb]
>  5  libxul.so!webrtc::RtpPacketizerH264::SetPayloadData [rtp_format_h264.cc:ea486c5d3380 : 166 + 0x8]
>  6  libxul.so!webrtc::RTPSenderVideo::SendVideo [rtp_sender_video.cc:ea486c5d3380 : 367 + 0x3]
>  7  libxul.so!webrtc::RTPSender::SendOutgoingData [rtp_sender.cc:ea486c5d3380 : 464 + 0x3d]
>  8  libxul.so!webrtc::ModuleRtpRtcpImpl::SendOutgoingData [rtp_rtcp_impl.cc:ea486c5d3380 : 419 + 0x1a]
>  9  libxul.so!webrtc::PayloadRouter::OnEncodedImage [payload_router.cc:ea486c5d3380 : 140 + 0x1b]
> 10  libxul.so!webrtc::internal::VideoSendStreamImpl::OnEncodedImage [video_send_stream.cc:ea486c5d3380 : 1033 + 0x16]
> 11  libxul.so!webrtc::ViEEncoder::OnEncodedImage [vie_encoder.cc:ea486c5d3380 : 610 + 0x13]
> 12  libxul.so!webrtc::VCMEncodedFrameCallback::OnEncodedImage [generic_encoder.cc:ea486c5d3380 : 163 + 0xf]
> 13  libxul.so!mozilla::WebrtcGmpVideoEncoder::Encoded [WebrtcGmpVideoCodec.cpp:ea486c5d3380 : 646 + 0x6]
> 14  libxul.so!mozilla::gmp::GMPVideoEncoderParent::RecvEncoded [GMPVideoEncoderParent.cpp:ea486c5d3380 : 263 + 0x11]
> 15  libxul.so!mozilla::gmp::PGMPVideoEncoderParent::OnMessageReceived [PGMPVideoEncoderParent.cpp: : 294 + 0x6]
> 16  libxul.so!mozilla::ipc::MessageChannel::DispatchAsyncMessage [MessageChannel.cpp:ea486c5d3380 : 2119 + 0x6]


[1] https://public-artifacts.taskcluster.net/YXpurNAaTuG1tu2nnix-Yg/0/public/logs/live_backing.log
:rjesup, new intermittent bugs are filed as P5, we adjust the priority when we hit 30 failures- therefore all triage should only be seeing high frequency intermittent bugs.
:drno, do you have any updates on this?
Flags: needinfo?(drno)
There are 39 failures in the last 7 days on Linux occurring on debug, opt and pgo build types. 
Failing test suit: mochitest-media-e10s.

Link to a recent log: https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-inbound&job_id=143330887&lineNumber=13362

Part of a log:
task 2017-11-09T15:16:22.735Z] 15:16:22     INFO - TEST-PASS | dom/media/tests/mochitest/test_peerConnection_videoCodecs.html | RTP flowing for video track {bb129eda-d511-4898-9410-30997cb2a85f} 
13361
[task 2017-11-09T15:16:22.736Z] 15:16:22     INFO - Buffered messages finished
13362
[task 2017-11-09T15:16:22.737Z] 15:16:22    ERROR - TEST-UNEXPECTED-TIMEOUT | dom/media/tests/mochitest/test_peerConnection_videoCodecs.html | application timed out after 370 seconds with no output
13363
[task 2017-11-09T15:16:22.738Z] 15:16:22    ERROR - Force-terminating active process(es).
13364
[task 2017-11-09T15:16:22.739Z] 15:16:22     INFO - Determining child pids from psutil
13365
[task 2017-11-09T15:16:22.740Z] 15:16:22     INFO - Found child pids: [1188]
13366
[task 2017-11-09T15:16:22.741Z] 15:16:22     INFO - Killing process: 1188
13367
[task 2017-11-09T15:16:22.742Z] 15:16:22     INFO - TEST-INFO | started process screentopng
13368
[task 2017-11-09T15:16:22.743Z] 15:16:22     INFO - TEST-INFO | screentopng: exit 0
13369
[task 2017-11-09T15:16:22.808Z] 15:16:22     INFO - psutil found pid 1188 dead
13370
[task 2017-11-09T15:16:22.809Z] 15:16:22     INFO - Killing process: 1084
13371
[task 2017-11-09T15:16:22.809Z] 15:16:22     INFO - Not taking screenshot here: see the one that was previously logged
13372
[task 2017-11-09T15:16:22.930Z] 15:16:22     INFO - TEST-INFO | Main app process: exit 6
13373
[task 2017-11-09T15:16:22.931Z] 15:16:22     INFO - Buffered messages finished 

:pehrsons 
I saw that you are working on both this bug and 1395853, which is blocking this one.
Do you have any new updates?
Flags: needinfo?(apehrson)
So the crash happens when Andreas new test actually removes all other codecs besides, except H264 in mode 0. As Randel pointed out that the H264 fake plugin only supports mode 1, the question is now:

Do we change the mochitest to not test H264 in mode 0, or do we teach the fake plugin to support mode 0?

I assume the first is the easier path forward here. Randell, Andreas what are your thoughts on this?
Flags: needinfo?(drno)
(In reply to Nils Ohlmeier [:drno] from comment #20)
> So the crash happens when Andreas new test actually removes all other codecs
> besides, except H264 in mode 0. As Randel pointed out that the H264 fake
> plugin only supports mode 1, the question is now:
> 
> Do we change the mochitest to not test H264 in mode 0, or do we teach the
> fake plugin to support mode 0?

The former is what we used to have. Bug 1399137 tries to achieve the latter.

The problem seems to be that either bug 1399137 doesn't cover all the bases, or our stack doesn't support some edge cases touched by the fake codec.

I think you know better than me which of these is the case though. From comment 14 one of the cases hit was [1], a fragment of length 0.


[1] https://searchfox.org/mozilla-central/rev/fe75164c55321e011d9d13b6d05c1e00d0a640be/media/webrtc/trunk/webrtc/modules/rtp_rtcp/source/rtp_format_h264.cc#253
Flags: needinfo?(apehrson) → needinfo?(drno)
There are 30 failures related to this bug in the last 7 days on Linux mostly on debug mochitests.

:drno any updates on this?
Duplicate of this bug: 1424055
Most of the time this is due to the problem described in comment 14, but I'm occasionally seeing this error:

[task 2018-02-20T20:09:19.973Z] 20:09:19     INFO - GECKO(1046) | # Fatal error in /builds/worker/workspace/build/src/media/webrtc/trunk/webrtc/modules/rtp_rtcp/source/rtp_format_h264.cc, line 250
[task 2018-02-20T20:09:19.973Z] 20:09:19     INFO - GECKO(1046) | # last system error: 0
[task 2018-02-20T20:09:19.973Z] 20:09:19     INFO - GECKO(1046) | # Check failed: payload_size_left >= fragment->length (1164 vs. 1682)
[task 2018-02-20T20:09:19.974Z] 20:09:19     INFO - GECKO(1046) | # Payload size left 1164, fragment length 1682, packetization mode SingleNalUnit

https://searchfox.org/mozilla-central/source/media/webrtc/trunk/webrtc/modules/rtp_rtcp/source/rtp_format_h264.cc#250
Assignee: nobody → docfaraday
I wonder if we're somehow getting some frames that the fake encoder "encoded" for packetization mode 1 (with a larger max size) after we switch over to packetization mode 0? Let's see what happens if we test mode 0, then mode 1 instead of the other way around...
Switching up the order did not help. The fake encoder probably just doesn't honor frame size limits correctly. Looking into it.
Ok, when I set GMP_LOGGING=3 and run this mochitest, _nothing_ is being successfully decoded:

GECKO(30395) | Info: Initialized encoder
GECKO(30395) | Debug: Encode size=320x240
GECKO(30395) | Debug: Encoding complete. type= 0 NAL_type=7 length=28 timestamp=279716528 width/height=320x240
GECKO(30395) | Debug: Calling callback
GECKO(30395) | Debug: Callback called
GECKO(30395) | Debug: Encoding complete. type= 0 NAL_type=8 length=28 timestamp=279716528 width/height=320x240
GECKO(30395) | Debug: Calling callback
GECKO(30395) | Debug: Callback called
GECKO(30395) | Debug: Encoding complete. type= 0 NAL_type=5 length=9968 timestamp=279716528 width/height=320x240
GECKO(30395) | Debug: Calling callback
GECKO(30395) | Debug: Callback called
GECKO(30395) | Info: InitDecode
GECKO(30395) | Debug: DecodeDecoding frame size=62 timestamp=24229277788
GECKO(30395) | Debug: magic=1179795789 h264_compat=7 width=320 height=1092296944 timestamp=24229277788 y/u/v=16:141:0
GECKO(30395) | Error: Couldn't decode frame. Size=62
GECKO(30395) | Debug: Encode size=320x240
GECKO(30395) | Debug: Encoding complete. type= 0 NAL_type=7 length=28 timestamp=279747528 width/height=320x240
GECKO(30395) | Debug: Calling callback
GECKO(30395) | Debug: Callback called
GECKO(30395) | Debug: Encoding complete. type= 0 NAL_type=8 length=28 timestamp=279747528 width/height=320x240
GECKO(30395) | Debug: Calling callback
GECKO(30395) | Debug: Callback called
GECKO(30395) | Debug: Encoding complete. type= 0 NAL_type=5 length=9968 timestamp=279747528 width/height=320x240
GECKO(30395) | Debug: Calling callback
GECKO(30395) | Debug: Callback called
GECKO(30395) | Debug: Encode size=320x240
GECKO(30395) | Debug: Encoding complete. type= 1 NAL_type=1 length=28 timestamp=279778528 width/height=320x240
GECKO(30395) | Debug: Calling callback
GECKO(30395) | Debug: Callback called
GECKO(30395) | Debug: DecodeDecoding frame size=62 timestamp=24229308788
GECKO(30395) | Debug: magic=1179795789 h264_compat=7 width=320 height=1092296944 timestamp=24229308788 y/u/v=16:141:0
GECKO(30395) | Error: Couldn't decode frame. Size=62


... and so forth.
Here's some of the GMP output for mode 0. It is even worse.

GECKO(31005) | Debug: Encoding complete. type= 0 NAL_type=7 length=28 timestamp=1035302528 width/height=320x240
GECKO(31005) | Debug: Calling callback
GECKO(31005) | Debug: Callback called
GECKO(31005) | Debug: Encoding complete. type= 0 NAL_type=8 length=28 timestamp=1035302528 width/height=320x240
GECKO(31005) | Debug: Calling callback
GECKO(31005) | Debug: Callback called
GECKO(31005) | Debug: Encoding complete. type= 0 NAL_type=5 length=1168 timestamp=1035302528 width/height=320x240
GECKO(31005) | Debug: Calling callback
GECKO(31005) | Debug: Callback called
GECKO(31005) | Debug: DecodeDecoding frame size=90 timestamp=7429041744
GECKO(31005) | Debug: magic=1179795789 h264_compat=7 width=320 height=1092296944 timestamp=7429041744 y/u/v=16:141:0
GECKO(31005) | Error: Couldn't decode frame. Size=90
GECKO(31005) | Debug: DecodeDecoding frame size=1140 timestamp=7429041744
GECKO(31005) | Debug: magic=2013790464 h264_compat=192 width=1095605619 height=20989522 timestamp=7429041744 y/u/v=0:0:240
GECKO(31005) | Error: Couldn't decode frame. Size=1140


I think we have two distinct problems here:

1. The fake encoder's SPS is getting mucked with by SpsVuiRewriter::ParseAndRewriteSps here: https://searchfox.org/mozilla-central/source/media/webrtc/trunk/webrtc/modules/rtp_rtcp/source/rtp_format_h264.cc#131

The fake decoder can't cope with this.

2. When FakeVideoEncoder::SendFrame is called with nalu type 5, we encode _two_ distinct frames. The first is essentially an EncodedFrame with a nalu type of 5, otherwise identical to the frames we encode for SPS/PPS. Next, we "encode" a frame that is |frame_size_| large, and filled with uninitialized data! (https://searchfox.org/mozilla-central/source/dom/media/gmp-plugin-openh264/gmp-fake-openh264.cpp#215) The packetizer just passes this through without incident most of the time, but _occasionally_ the first byte just happens to map to SPS, and this relatively large fragment goes through SpsVuiRewriter::ParseAndRewriteSps, bumping it over the size limit, and hitting the assertion here: https://searchfox.org/mozilla-central/source/media/webrtc/trunk/webrtc/modules/rtp_rtcp/source/rtp_format_h264.cc#250
Assignee: docfaraday → rjesup
See Also: → 1375540
Summary: Intermittent TEST-UNEXPECTED-TIMEOUT | dom/media/tests/mochitest/test_peerConnection_videoCodecs.html | application timed out after 330 seconds with no output → Intermittent TEST-UNEXPECTED-TIMEOUT | dom/media/tests/mochitest/test_peerConnection_videoCodecs.html | application timed out after 330/370 seconds with no output
I've been playing with the fake encoder, and there are additional problems. The jitter buffer pays attention to SPS and PPS ids, which the fake encoder does not encode properly.
Depends on: 1448863
Interesting. The rate on this seems to have gone way down. I wonder why...
I've attached a patch I was working on, and it eliminates this intermittent, but it seems that something else has reduced the frequency while I was working on it. Leaving it be for now.
You need to log in before you can comment on or make changes to this bug.