Closed Bug 1352538 Opened 7 years ago Closed 11 months ago

Intermittent WebRtcIceGatherTest.TestGatherDNSStunServerHostnameTcp | Value of: StreamHasMatchingCandidate(0, "tcptype so")

Categories

(Core :: WebRTC: Networking, defect, P4)

defect

Tracking

()

RESOLVED INCOMPLETE
backlog tech-debt

People

(Reporter: aryx, Unassigned)

References

Details

(Keywords: intermittent-failure)

Attachments

(1 file)

https://treeherder.mozilla.org/logviewer.html#?job_id=87906189&repo=autoland

[task 2017-03-31T16:34:54.632349Z] 16:34:54     INFO -  TEST-START | WebRtcIceGatherTest.TestGatherDNSStunServerHostnameTcp
[task 2017-03-31T16:34:54.633487Z] 16:34:54     INFO -  Candidate for stream 0-P1:stream0 initialized: candidate:0 1 UDP 2122252543 172.17.0.4 33113 typ host
[task 2017-03-31T16:34:54.633933Z] 16:34:54     INFO -  Candidate for stream 0-P1:stream0 initialized: candidate:1 1 TCP 2105508095 172.17.0.4 61654 typ host tcptype passive
[task 2017-03-31T16:34:54.634323Z] 16:34:54     INFO -  Candidate for stream 0-P1:stream0 initialized: candidate:1 1 TCP 2105524479 172.17.0.4 9 typ host tcptype active
[task 2017-03-31T16:34:54.634562Z] 16:34:54     INFO -  P1 Gathering complete
[task 2017-03-31T16:34:54.634799Z] 16:34:54     INFO -  P1 CANDIDATES:
[task 2017-03-31T16:34:54.635152Z] 16:34:54     INFO -  Stream P1
[task 2017-03-31T16:34:54.635450Z] 16:34:54     INFO -  candidate:0 1 UDP 2122252543 172.17.0.4 33113 typ host
[task 2017-03-31T16:34:54.635711Z] 16:34:54     INFO -  candidate:1 1 TCP 2105508095 172.17.0.4 61654 typ host tcptype passive
[task 2017-03-31T16:34:54.635942Z] 16:34:54     INFO -  candidate:1 1 TCP 2105524479 172.17.0.4 9 typ host tcptype active
[task 2017-03-31T16:34:54.636176Z] 16:34:54     INFO -  P1 Returning candidate: candidate:0 1 UDP 2122252543 172.17.0.4 33113 typ host
[task 2017-03-31T16:34:54.636418Z] 16:34:54     INFO -  P1 Returning candidate: candidate:1 1 TCP 2105508095 172.17.0.4 61654 typ host tcptype passive
[task 2017-03-31T16:34:54.636647Z] 16:34:54     INFO -  P1 Returning candidate: candidate:1 1 TCP 2105524479 172.17.0.4 9 typ host tcptype active
[task 2017-03-31T16:34:54.636869Z] 16:34:54     INFO -  P1 Returning candidate: candidate:0 1 UDP 2122252543 172.17.0.4 33113 typ host
[task 2017-03-31T16:34:54.637102Z] 16:34:54     INFO -  P1 Returning candidate: candidate:1 1 TCP 2105508095 172.17.0.4 61654 typ host tcptype passive
[task 2017-03-31T16:34:54.637358Z] 16:34:54     INFO -  P1 Returning candidate: candidate:1 1 TCP 2105524479 172.17.0.4 9 typ host tcptype active
[task 2017-03-31T16:34:54.637584Z] 16:34:54     INFO -  P1 Returning candidate: candidate:0 1 UDP 2122252543 172.17.0.4 33113 typ host
[task 2017-03-31T16:34:54.637830Z] 16:34:54     INFO -  P1 Returning candidate: candidate:1 1 TCP 2105508095 172.17.0.4 61654 typ host tcptype passive
[task 2017-03-31T16:34:54.638059Z] 16:34:54     INFO -  P1 Returning candidate: candidate:1 1 TCP 2105524479 172.17.0.4 9 typ host tcptype active
[task 2017-03-31T16:34:54.638311Z] 16:34:54  WARNING -  TEST-UNEXPECTED-FAIL | WebRtcIceGatherTest.TestGatherDNSStunServerHostnameTcp | Value of: StreamHasMatchingCandidate(0, "tcptype so")
[task 2017-03-31T16:34:54.638591Z] 16:34:54     INFO -    Actual: false
[task 2017-03-31T16:34:54.638852Z] 16:34:54     INFO -  Expected: true @ /home/worker/workspace/build/src/media/mtransport/test/ice_unittest.cpp:2295
[task 2017-03-31T16:34:54.639064Z] 16:34:54     INFO -  insert 'ice' (registry) succeeded: ice
[task 2017-03-31T16:34:54.639295Z] 16:34:54     INFO -  insert 'ice.pref' (registry) succeeded: ice.pref
[task 2017-03-31T16:34:54.639546Z] 16:34:54     INFO -  insert 'ice.pref.type' (registry) succeeded: ice.pref.type
[task 2017-03-31T16:34:54.639777Z] 16:34:54     INFO -  insert 'ice.pref.type.srv_rflx' (UCHAR) succeeded: 0x64
[task 2017-03-31T16:34:54.640004Z] 16:34:54     INFO -  insert 'ice.pref.type.peer_rflx' (UCHAR) succeeded: 0x6e
[task 2017-03-31T16:34:54.640230Z] 16:34:54     INFO -  insert 'ice.pref.type.host' (UCHAR) succeeded: 0x7e
[task 2017-03-31T16:34:54.640454Z] 16:34:54     INFO -  insert 'ice.pref.type.relayed' (UCHAR) succeeded: 0x05
[task 2017-03-31T16:34:54.640684Z] 16:34:54     INFO -  insert 'ice.pref.type.srv_rflx_tcp' (UCHAR) succeeded: 0x63
[task 2017-03-31T16:34:54.640910Z] 16:34:54     INFO -  insert 'ice.pref.type.peer_rflx_tcp' (UCHAR) succeeded: 0x6d
[task 2017-03-31T16:34:54.641135Z] 16:34:54     INFO -  insert 'ice.pref.type.host_tcp' (UCHAR) succeeded: 0x7d
[task 2017-03-31T16:34:54.641363Z] 16:34:54     INFO -  insert 'ice.pref.type.relayed_tcp' (UCHAR) succeeded: 0x00
[task 2017-03-31T16:34:54.641585Z] 16:34:54     INFO -  insert 'stun' (registry) succeeded: stun
[task 2017-03-31T16:34:54.641837Z] 16:34:54     INFO -  insert 'stun.client' (registry) succeeded: stun.client
[task 2017-03-31T16:34:54.642068Z] 16:34:54     INFO -  insert 'stun.client.maximum_transmits' (UINT4) succeeded: 7
[task 2017-03-31T16:34:54.642295Z] 16:34:54     INFO -  insert 'ice.trickle_grace_period' (UINT4) succeeded: 5000
[task 2017-03-31T16:34:54.642517Z] 16:34:54     INFO -  insert 'ice.tcp' (registry) succeeded: ice.tcp
[task 2017-03-31T16:34:54.642740Z] 16:34:54     INFO -  insert 'ice.tcp.so_sock_count' (INT4) succeeded: 3
[task 2017-03-31T16:34:54.642971Z] 16:34:54     INFO -  insert 'ice.tcp.listen_backlog' (INT4) succeeded: 10
[task 2017-03-31T16:34:54.643193Z] 16:34:54     INFO -  insert 'ice.tcp.disable' (char) succeeded: \000
[task 2017-03-31T16:34:54.643408Z] 16:34:54     INFO -  P1 Shutdown
[task 2017-03-31T16:34:55.641167Z] 16:34:55  WARNING -  TEST-UNEXPECTED-FAIL | WebRtcIceGatherTest.TestGatherDNSStunServerHostnameTcp | test completed (time: 1004ms)
What do you make of this? Seems _really_ weird to be missing the tcp so candidate. I don't think this will hurt us much at all in the real world, but it would be good to understand what is happening here.
backlog: --- → tech-debt
Rank: 35
Flags: needinfo?(drno)
Priority: -- → P3
I'm not sure what could be going on here. This is definitely not affecting users, since the so_count is set to 0 in the user prefs to disable SO all together.

I guess the random port number generator her http://searchfox.org/mozilla-central/source/media/mtransport/third_party/nICEr/src/ice/ice_component.c#359 could fail to find an unused port within three attempts?!
Would be nice if we could get the nICEr error log in case of test failures.
Flags: needinfo?(drno)
(In reply to Nils Ohlmeier [:drno] from comment #3)
> I guess the random port number generator her
> http://searchfox.org/mozilla-central/source/media/mtransport/third_party/
> nICEr/src/ice/ice_component.c#359 could fail to find an unused port within
> three attempts?!
> Would be nice if we could get the nICEr error log in case of test failures.

Sure, if a thousand ports are used, it might have a non-zero chance of failing 3 times.  Also, if we implement restricted ranges for ports, as we've discussed, it may become much more likely.

Of course, that might not be the problem here, but easy fix.
Attachment #8854146 - Flags: review?(docfaraday)
I think the only way to understand what is going on here is to have the unit tests actually log at debug level and not just at info. I'll look into adding a function to RLogConnector to increase the default log level, so that we can increase the log level for this unit test until we catch a failure with debug log level.
It is strange. The interface gathering appears to succeed. And we get some candidates. And it's not a gathering failure. It acts as if so_count is zero, which should be the case for Firefox from the prefs, but not for the unit test (as the log shows it uses the default value 3 for so_count).
My guess is this is related to Michael's patch which changes how we find the interface addresses.
I think the default for the unit test is coming from here:
https://dxr.mozilla.org/mozilla-central/rev/b5d8b27a753725c1de41ffae2e338798f3b5cacd/media/mtransport/test/ice_unittest.cpp#1486

I walked through code and it doesn't appear that any of the globals setup in NrIceCtx::InitializeGlobals are used in the stun address gathering.  The only reference I can find to NR_ICE_REG_ICE_TCP_SO_SOCK_COUNT (which uses media.peerconnection.ice.tcp_so_sock_count), is here in ice_component.c:
https://dxr.mozilla.org/mozilla-central/rev/b5d8b27a753725c1de41ffae2e338798f3b5cacd/media/mtransport/third_party/nICEr/src/ice/ice_component.c#423

I don't believe that ice_component is used during during stun address discovery.
(In reply to Michael Froman [:mjf] from comment #9)
> I think the default for the unit test is coming from here:
> https://dxr.mozilla.org/mozilla-central/rev/
> b5d8b27a753725c1de41ffae2e338798f3b5cacd/media/mtransport/test/ice_unittest.
> cpp#1486

Hadn't notice that.
 
> I walked through code and it doesn't appear that any of the globals setup in
> NrIceCtx::InitializeGlobals are used in the stun address gathering.  The
> only reference I can find to NR_ICE_REG_ICE_TCP_SO_SOCK_COUNT (which uses
> media.peerconnection.ice.tcp_so_sock_count), is here in ice_component.c:
> https://dxr.mozilla.org/mozilla-central/rev/
> b5d8b27a753725c1de41ffae2e338798f3b5cacd/media/mtransport/third_party/nICEr/
> src/ice/ice_component.c#423
> 
> I don't believe that ice_component is used during during stun address
> discovery.

I just verified with a debugger that in fact it gets called and the so_sock_count properly gets set to 3.

Lets see if debug logging will tell us more.
Mass change P3->P4 to align with new Mozilla triage process.
Priority: P3 → P4
Severity: normal → S3
Status: NEW → RESOLVED
Closed: 11 months ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: