Closed Bug 957800 Opened 6 years ago Closed 6 years ago

Intermittent test_peerConnection_offerRequiresReceiveVideoAudio.html,test_peerConnection_basicAudioVideo.html,test_peerConnection_basicAudioVideoCombined.html,test_peerConnection_offerRequiresReceiveAudio.html | Have localcandidate stat(s)

Categories

(Core :: WebRTC, defect)

x86_64
Linux
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla29
Tracking Status
firefox27 --- unaffected
firefox28 --- unaffected
firefox29 --- fixed
firefox-esr24 --- unaffected

People

(Reporter: RyanVM, Assigned: bwc)

References

Details

(Keywords: intermittent-failure)

Attachments

(1 file, 2 obsolete files)

https://tbpl.mozilla.org/php/getParsedLog.php?id=32718449&tree=B2g-Inbound

Ubuntu VM 12.04 x64 b2g-inbound opt test mochitest-3 on 2014-01-08 13:30:03 PST for push 0449f682dd31
slave: tst-linux64-ec2-127

13:36:03     INFO -  26814 INFO TEST-INFO | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | PeerConnectionWrapper (pcRemote): 'onsignalingstatechange' event fired
13:36:03     INFO -  26815 INFO TEST-INFO | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | PeerConnectionWrapper (pcRemote): 'onsignalingstatechange' event registered for async check
13:36:03     INFO -  26816 INFO TEST-INFO | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | PeerConnectionWrapper (pcRemote): Successfully set the local description
13:36:03     INFO -  26817 INFO TEST-PASS | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | signalingState after remote setLocalDescription is 'stable'
13:36:03     INFO -  26818 INFO TEST-INFO | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | Run step: PC_LOCAL_WAIT_FOR_ICE_CONNECTED
13:36:03     INFO -  26819 INFO TEST-INFO | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | iceConnectionState: connected
13:36:03     INFO -  26820 INFO TEST-PASS | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | pc_local: ICE is in connected state
13:36:03     INFO -  26821 INFO TEST-INFO | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | Run step: PC_REMOTE_WAIT_FOR_ICE_CONNECTED
13:36:03     INFO -  26822 INFO TEST-INFO | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | iceConnectionState: connected
13:36:03     INFO -  26823 INFO TEST-PASS | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | pc_remote: ICE is in connected state
13:36:03     INFO -  26824 INFO TEST-INFO | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | Run step: PC_LOCAL_CHECK_MEDIA_STREAMS
13:36:03     INFO -  26825 INFO TEST-PASS | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | PeerConnectionWrapper (pcLocal) has 0 local streams
13:36:03     INFO -  26826 INFO TEST-PASS | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | PeerConnectionWrapper (pcLocal) has 1 remote streams
13:36:03     INFO -  26827 INFO TEST-INFO | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | Run step: PC_REMOTE_CHECK_MEDIA_STREAMS
13:36:03     INFO -  26828 INFO TEST-PASS | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | PeerConnectionWrapper (pcRemote) has 0 local streams
13:36:03     INFO -  26829 INFO TEST-PASS | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | PeerConnectionWrapper (pcRemote) has 1 remote streams
13:36:03     INFO -  26830 INFO TEST-INFO | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | Run step: PC_LOCAL_CHECK_MEDIA_FLOW_PRESENT
13:36:03     INFO -  26831 INFO TEST-INFO | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | Run step: PC_REMOTE_CHECK_MEDIA_FLOW_PRESENT
13:36:03     INFO -  (ice/ERR) ICE(PC:c1f83b59c365864b): peer (PC:c1f83b59c365864b:default) pairing local trickle ICE candidate srflx(IP4:10.134.57.178:43612|stun.services.mozilla.com:3478)
13:36:03     INFO -  864093952[7fcf30378f20]: [CCAPP Task|cpr] cpr_linux_stdio.c:146: cprGetMessage: msgrcv success for queue CCAPPQ
13:36:03     INFO -  864093952[7fcf30378f20]: [CCAPP Task|def] ccapi.c:1164: SIPCC-CC_API: 1/42, cc_int_feature2: UI -> GSM: FOUNDICECANDIDATE
13:36:03     INFO -  864093952[7fcf30378f20]: [CCAPP Task|cpr] cpr_linux_stdio.c:146: cprGetMessage: msgrcv success for queue CCAPPQ
13:36:03     INFO -  864093952[7fcf30378f20]: [CCAPP Task|def] ccapi.c:1164: SIPCC-CC_API: 1/42, cc_int_feature2: UI -> GSM: FOUNDICECANDIDATE
13:36:03     INFO -  849409792[7fcf30378d00]: [GSM Task|cpr] cpr_linux_stdio.c:146: cprGetMessage: msgrcv success for queue GSMQ
13:36:03     INFO -  849409792[7fcf30378d00]: [GSM Task|def] dcsm.c:532: SIPCC-DCSM: dcsm_process_event: DCSM 26  :(DCSM_READY:FOUNDCANDIDATE )
13:36:03     INFO -  849409792[7fcf30378d00]: [GSM Task|fsm_sm] sm.c:46: SIPCC-FSM: sm_process_event: DEF 42  : 0x7fcf46bbd164x: sm entry: (STABLE:FOUNDCANDIDATE)
13:36:03     INFO -  849409792[7fcf30378d00]: [GSM Task|fsm_sm] fsmdef.c:4208: SIPCC-FSM: fsmdef_ev_foundcandidate: Entered.
13:36:03     INFO -  849409792[7fcf30378d00]: [GSM Task|tnp] ui.c:1797: SIPCC-UI_API: 1/42, ui_ice_candidate_found: state=35 call_instance=0
13:36:03     INFO -  849409792[7fcf30378d00]: [GSM Task|def] sm.c:65: SIPCC-GSM: 1/42, sm_process_event: DEF   :(STABLE:FOUNDCANDIDATE )
13:36:03     INFO -  849409792[7fcf30378d00]: [GSM Task|cpr] cpr_linux_stdio.c:146: cprGetMessage: msgrcv success for queue GSMQ
13:36:03     INFO -  849409792[7fcf30378d00]: [GSM Task|def] dcsm.c:532: SIPCC-DCSM: dcsm_process_event: DCSM 26  :(DCSM_READY:FOUNDCANDIDATE )
13:36:03     INFO -  849409792[7fcf30378d00]: [GSM Task|fsm_sm] sm.c:46: SIPCC-FSM: sm_process_event: DEF 42  : 0x7fcf46bbd164x: sm entry: (STABLE:FOUNDCANDIDATE)
13:36:03     INFO -  849409792[7fcf30378d00]: [GSM Task|fsm_sm] fsmdef.c:4208: SIPCC-FSM: fsmdef_ev_foundcandidate: Entered.
13:36:03     INFO -  849409792[7fcf30378d00]: [GSM Task|tnp] ui.c:1797: SIPCC-UI_API: 1/42, ui_ice_candidate_found: state=35 call_instance=0
13:36:03     INFO -  849409792[7fcf30378d00]: [GSM Task|def] sm.c:65: SIPCC-GSM: 1/42, sm_process_event: DEF   :(STABLE:FOUNDCANDIDATE )
13:36:03     INFO -  864093952[7fcf30378f20]: [CCAPP Task|cpr] cpr_linux_stdio.c:146: cprGetMessage: msgrcv success for queue CCAPPQ
13:36:03     INFO -  864093952[7fcf30378f20]: [CCAPP Task|CC_SIPCCCallInfo] CC_SIPCCCallInfo.cpp:576: State 35 not handled in generateCapabilities()
13:36:03     INFO -  864093952[7fcf30378f20]: [CCAPP Task|CC_SIPCCService] CC_SIPCCService.cpp:724: onCallEvent(CCAPI_CALL_EV_STATE, 1002A, [UNKNOWN|]
13:36:03     INFO -  864093952[7fcf30378f20]: [CCAPP Task|def] ccapi_call.c:57: SIPCC-SIP_CC_PROV: CCAPI_Call_releaseCallInfo: ref=0x7fcf25d76800: count=2
13:36:03     INFO -  864093952[7fcf30378f20]: [CCAPP Task|cpr] cpr_linux_stdio.c:146: cprGetMessage: msgrcv success for queue CCAPPQ
13:36:03     INFO -  864093952[7fcf30378f20]: [CCAPP Task|CC_SIPCCCallInfo] CC_SIPCCCallInfo.cpp:576: State 35 not handled in generateCapabilities()
13:36:03     INFO -  864093952[7fcf30378f20]: [CCAPP Task|CC_SIPCCService] CC_SIPCCService.cpp:724: onCallEvent(CCAPI_CALL_EV_STATE, 1002A, [UNKNOWN|]
13:36:03     INFO -  864093952[7fcf30378f20]: [CCAPP Task|def] ccapi_call.c:57: SIPCC-SIP_CC_PROV: CCAPI_Call_releaseCallInfo: ref=0x7fcf25d76c00: count=2
13:36:03     INFO -  1282434880[7fcf4b23b370]: [main|def] ccapi_call.c:57: SIPCC-SIP_CC_PROV: CCAPI_Call_releaseCallInfo: ref=0x7fcf25d76800: count=1
13:36:03     INFO -  1282434880[7fcf4b23b370]: [main|def] ccapi_call.c:57: SIPCC-SIP_CC_PROV: CCAPI_Call_releaseCallInfo: ref=0x7fcf25d76c00: count=1
13:36:03     INFO -  1282434880[7fcf4b23b370]: [main|PeerConnectionImpl] PeerConnectionImpl.cpp:258: PeerConnectionObserverDispatch processing mCallState = 35 (), mFsmState = 16 (STABLE)
13:36:03     INFO -  1282434880[7fcf4b23b370]: [main|PeerConnectionImpl] PeerConnectionImpl.cpp:258: PeerConnectionObserverDispatch processing mCallState = 35 (), mFsmState = 16 (STABLE)
13:36:03     INFO -  26832 INFO TEST-INFO | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | Run step: PC_LOCAL_CHECK_STATS
13:36:03     INFO -  26833 INFO TEST-INFO | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | PeerConnectionWrapper (pcLocal): Got stats: {"0501":{"id":"0501","timestamp":1389216963732.7678,"type":"remotecandidate","candidateType":"host","ipAddress":"10.134.57.178","portNumber":43612}}
13:36:03     INFO -  26834 INFO TEST-PASS | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | Spec and MapClass variant of RTCStatsReport enumeration agree
13:36:03     INFO -  26835 INFO TEST-PASS | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | Have at least 0 inboundrtp stat(s) *
13:36:03     INFO -  26836 INFO TEST-PASS | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | Have 0 outboundrtp stat(s)
13:36:03     INFO -  26837 ERROR TEST-UNEXPECTED-FAIL | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | Have localcandidate stat(s)
13:36:03     INFO -  26838 INFO TEST-PASS | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | Have remotecandidate stat(s)
13:36:03     INFO -  26839 INFO TEST-INFO | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | Run step: PC_REMOTE_CHECK_STATS
13:36:03     INFO -  26840 INFO TEST-INFO | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | PeerConnectionWrapper (pcRemote): Got stats: {"067e":{"id":"067e","timestamp":1389216963735.9194,"type":"localcandidate","candidateType":"host","ipAddress":"10.134.57.178","portNumber":43612},"0680":{"id":"0680","timestamp":1389216963735.9194,"type":"remotecandidate","candidateType":"host","ipAddress":"10.134.57.178","portNumber":33337},"07b7":{"id":"07b7","timestamp":1389216963735.9194,"type":"remotecandidate","candidateType":"serverreflexive","ipAddress":"63.245.214.82","portNumber":60979}}
13:36:03     INFO -  26841 INFO TEST-PASS | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | Spec and MapClass variant of RTCStatsReport enumeration agree
13:36:03     INFO -  26842 INFO TEST-PASS | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | Have at least 0 inboundrtp stat(s) *
13:36:03     INFO -  26843 INFO TEST-PASS | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | Have 0 outboundrtp stat(s)
13:36:03     INFO -  26844 INFO TEST-PASS | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | Have localcandidate stat(s)
13:36:03     INFO -  26845 INFO TEST-PASS | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | Have remotecandidate stat(s)
13:36:03     INFO -  26846 INFO TEST-INFO | /tests/dom/media/tests/mochitest/test_peerConnection_offerRequiresReceiveVideoAudio.html | Closing peer connections. Connection state=false
Summary: Intermittent test_peerConnection_offerRequiresReceiveVideoAudio.html | Have localcandidate stat(s) → Intermittent test_peerConnection_offerRequiresReceiveVideoAudio.html,test_peerConnection_basicAudioVideo.html | Have localcandidate stat(s)
https://tbpl.mozilla.org/php/getParsedLog.php?id=32783068&tree=Mozilla-Inbound
Summary: Intermittent test_peerConnection_offerRequiresReceiveVideoAudio.html,test_peerConnection_basicAudioVideo.html | Have localcandidate stat(s) → Intermittent test_peerConnection_offerRequiresReceiveVideoAudio.html,test_peerConnection_basicAudioVideo.html,test_peerConnection_basicAudioVideoCombined.html | Have localcandidate stat(s)
https://tbpl.mozilla.org/php/getParsedLog.php?id=32823127&tree=Mozilla-Inbound

Randall, can you please help us find an owner for this? It started a couple days ago and hits a wide variety of tests.
Flags: needinfo?(rjesup)
Oh, this is probably from bug 950855. Fits perfectly in the regression range.
Flags: needinfo?(rjesup) → needinfo?(jib)
Blocks: 950855
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #0)
> TEST-INFO | | Run step: PC_LOCAL_CHECK_STATS
> TEST-INFO | | PeerConnectionWrapper (pcLocal): Got stats:
> {"0501":{"id":"0501","timestamp":1389216963732.7678,"type":"remotecandidate",
> "candidateType":"host","ipAddress":"10.134.57.178","portNumber":43612}}

There are no local candidates here yet in pcLocal, which surprises the stats test (pcRemote seems fine). Anyone know why?

I naively thought candidates were always available at this point in the test, given my casual observation that this seemed to be true. But it looks like we have a race.

If someone knows whether this is a bug in the test harness or in this test's assumptions, please chime in.

Do I need to listen to onicecandidate events? If so, that might require some changes to the test harness...

> TEST-PASS | | Spec and MapClass variant of RTCStatsReport enumeration agree
> TEST-PASS | | Have at least 0 inboundrtp stat(s) *
> TEST-PASS | | Have 0 outboundrtp stat(s)
> TEST-UNEXPECTED-FAIL |  | Have localcandidate stat(s)
> TEST-PASS | | Have remotecandidate stat(s)
> TEST-INFO | | Run step: PC_REMOTE_CHECK_STATS
> TEST-INFO | | PeerConnectionWrapper (pcRemote): Got stats:
> {"067e":{"id":"067e","timestamp":1389216963735.9194,"type":"localcandidate",
> "candidateType":"host","ipAddress":"10.134.57.178","portNumber":43612},
> "0680":{"id":"0680","timestamp":1389216963735.9194,"type":"remotecandidate",
> "candidateType":"host","ipAddress":"10.134.57.178","portNumber":33337},
> "07b7":{"id":"07b7","timestamp":1389216963735.9194,"type":"remotecandidate",
> "candidateType":"serverreflexive","ipAddress":"63.245.214.82","portNumber":
> 60979}}
> TEST-PASS | | Spec and MapClass variant of RTCStatsReport enumeration agree
> TEST-PASS | | Have at least 0 inboundrtp stat(s) *
> TEST-PASS | | Have 0 outboundrtp stat(s)
> TEST-PASS | | Have localcandidate stat(s)
> TEST-PASS | | Have remotecandidate stat(s)
> TEST-INFO | | Closing peer connections. Connection state=false

Unless I hear otherwise, I'll nuke testing of localcandidate and remotecandidate stats for now.
Assignee: nobody → jib
Flags: needinfo?(jib)
(In reply to Jan-Ivar Bruaroey [:jib] from comment #15)

> I naively thought candidates were always available at this point in the
> test, given my casual observation that this seemed to be true. But it looks
> like we have a race.

Looking through the test steps leading up to this, it looks like we've only guaranteed that we've reached the ICE "connected" state, which means we have at least one good candidate pair. This looks like a legitimate stats bug -- in fact, the exact kind of bug that this test looks like it was designed to catch.

I've done a preliminary read over the stats collection code in the PCImpl, and don't immediately see anything wrong (although I do note that FillStatsReport_s appears to populate the local and remote candidates in lock-step, which makes the situation described above quite puzzling).

> Unless I hear otherwise, I'll nuke testing of localcandidate and remotecandidate stats for now.

Please don't. There's a real bug here. Changing the tests would just stop us from seeing the problem -- the underlying flaw would remain.
s/only guaranteed/guaranteed/
Tag.
Assignee: jib → docfaraday
(In reply to Adam Roach [:abr] from comment #16)
> (In reply to Jan-Ivar Bruaroey [:jib] from comment #15)
> 
> > I naively thought candidates were always available at this point in the
> > test, given my casual observation that this seemed to be true. But it looks
> > like we have a race.
> 
> Looking through the test steps leading up to this, it looks like we've only
> guaranteed that we've reached the ICE "connected" state, which means we have
> at least one good candidate pair. This looks like a legitimate stats bug --
> in fact, the exact kind of bug that this test looks like it was designed to
> catch.
> 
> I've done a preliminary read over the stats collection code in the PCImpl,
> and don't immediately see anything wrong (although I do note that
> FillStatsReport_s appears to populate the local and remote candidates in
> lock-step, which makes the situation described above quite puzzling).

One thing that could cause this for sure is a collision in the candidate ids. These ids are short enough that this is not all that unlikely.
Looking at the code that generates these, they are 11 bits long, so a 1 out of 2048 chance per pair. Definitely plausible that we'd see frequent intermittent failures.
One possible fix.
I am trying to reproduce with a higher r_log level, but not having any luck. This id is based on a crc of the candidate label, so it isn't really random and may only reproduce in certain circumstances. I'm going to take a crack at building the string that I think would have gone into the crc if I'm right about the cause of the bug.
Yep; the codeword generation function for the string

"IP4:10.12.51.135:57235(host(IP4:10.12.51.135:57235))" (this is the local host candidate that went missing)

yields 07e9, which is the same id for one of the remote candidates on the same agent.
Attachment #8358673 - Flags: review?(ekr)
(In reply to Byron Campen [:bwc] from comment #26)
> Yep; the codeword generation function for the string
> 
> "IP4:10.12.51.135:57235(host(IP4:10.12.51.135:57235))" (this is the local
> host candidate that went missing)
> 
> yields 07e9, which is the same id for one of the remote candidates on the
> same agent.

(This is the log in comment 18)
This is high on the Orange list (highest WebRTC bug for the last week); ekr, can you review or delegate to someone else if you can't?
Flags: needinfo?(ekr)
Looks to me like byron is already working it... It's a new bug.
Comment on attachment 8358673 [details] [diff] [review]
Make codeword longer to avoid collisions.

Review of attachment 8358673 [details] [diff] [review]:
-----------------------------------------------------------------

I don't love this.
Why don't you just remove the 2048 which will have the impact
of making this 16 bits. That will greatly reduce collisions.

If that's not enough, then let's do a better encoding (e.g.,
b64) rather than making a longer though still inefficient
encoding.
Use base64 with the original length of 4, should give acceptable collision resistance.
Attachment #8358673 - Attachment is obsolete: true
Attachment #8358673 - Flags: review?(ekr)
Fix comment that qimportbz messed up.
Attachment #8360047 - Attachment is obsolete: true
Attachment #8360048 - Flags: review?(ekr)
https://tbpl.mozilla.org/php/getParsedLog.php?id=33024512&tree=Fx-Team
Summary: Intermittent test_peerConnection_offerRequiresReceiveVideoAudio.html,test_peerConnection_basicAudioVideo.html,test_peerConnection_basicAudioVideoCombined.html | Have localcandidate stat(s) → Intermittent test_peerConnection_offerRequiresReceiveVideoAudio.html,test_peerConnection_basicAudioVideo.html,test_peerConnection_basicAudioVideoCombined.html,test_peerConnection_offerRequiresReceiveAudio.html | Have localcandidate stat(s)
Blocks: 959587
Attachment #8360048 - Flags: review?(ekr) → review?(adam)
Comment on attachment 8360048 [details] [diff] [review]
Make codeword longer to avoid collisions.

Review of attachment 8360048 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good.
Attachment #8360048 - Flags: review?(adam) → review+
Needinfoing to remind myself to check back on some rebuilds on my try push.
Flags: needinfo?(ekr) → needinfo?(docfaraday)
Flags: needinfo?(docfaraday)
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/b781647898f4
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
Duplicate of this bug: 959587
Duplicate of this bug: 958029
You need to log in before you can comment on or make changes to this bug.