Intermittent WebRtcIceGatherTest.TestGatherDNSStunServerHostnameTcp | Value of: StreamHasMatchingCandidate(0, "tcptype so")

NEW
Unassigned

Status

()

Core
WebRTC: Networking
P4
normal
Rank:
35
9 months ago
a month ago

People

(Reporter: aryx, Unassigned)

Tracking

({intermittent-failure})

unspecified
intermittent-failure
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

MozReview Requests

()

Submitter Diff Changes Open Issues Last Updated
Loading...
Error loading review requests:

Attachments

(1 attachment)

https://treeherder.mozilla.org/logviewer.html#?job_id=87906189&repo=autoland

[task 2017-03-31T16:34:54.632349Z] 16:34:54     INFO -  TEST-START | WebRtcIceGatherTest.TestGatherDNSStunServerHostnameTcp
[task 2017-03-31T16:34:54.633487Z] 16:34:54     INFO -  Candidate for stream 0-P1:stream0 initialized: candidate:0 1 UDP 2122252543 172.17.0.4 33113 typ host
[task 2017-03-31T16:34:54.633933Z] 16:34:54     INFO -  Candidate for stream 0-P1:stream0 initialized: candidate:1 1 TCP 2105508095 172.17.0.4 61654 typ host tcptype passive
[task 2017-03-31T16:34:54.634323Z] 16:34:54     INFO -  Candidate for stream 0-P1:stream0 initialized: candidate:1 1 TCP 2105524479 172.17.0.4 9 typ host tcptype active
[task 2017-03-31T16:34:54.634562Z] 16:34:54     INFO -  P1 Gathering complete
[task 2017-03-31T16:34:54.634799Z] 16:34:54     INFO -  P1 CANDIDATES:
[task 2017-03-31T16:34:54.635152Z] 16:34:54     INFO -  Stream P1
[task 2017-03-31T16:34:54.635450Z] 16:34:54     INFO -  candidate:0 1 UDP 2122252543 172.17.0.4 33113 typ host
[task 2017-03-31T16:34:54.635711Z] 16:34:54     INFO -  candidate:1 1 TCP 2105508095 172.17.0.4 61654 typ host tcptype passive
[task 2017-03-31T16:34:54.635942Z] 16:34:54     INFO -  candidate:1 1 TCP 2105524479 172.17.0.4 9 typ host tcptype active
[task 2017-03-31T16:34:54.636176Z] 16:34:54     INFO -  P1 Returning candidate: candidate:0 1 UDP 2122252543 172.17.0.4 33113 typ host
[task 2017-03-31T16:34:54.636418Z] 16:34:54     INFO -  P1 Returning candidate: candidate:1 1 TCP 2105508095 172.17.0.4 61654 typ host tcptype passive
[task 2017-03-31T16:34:54.636647Z] 16:34:54     INFO -  P1 Returning candidate: candidate:1 1 TCP 2105524479 172.17.0.4 9 typ host tcptype active
[task 2017-03-31T16:34:54.636869Z] 16:34:54     INFO -  P1 Returning candidate: candidate:0 1 UDP 2122252543 172.17.0.4 33113 typ host
[task 2017-03-31T16:34:54.637102Z] 16:34:54     INFO -  P1 Returning candidate: candidate:1 1 TCP 2105508095 172.17.0.4 61654 typ host tcptype passive
[task 2017-03-31T16:34:54.637358Z] 16:34:54     INFO -  P1 Returning candidate: candidate:1 1 TCP 2105524479 172.17.0.4 9 typ host tcptype active
[task 2017-03-31T16:34:54.637584Z] 16:34:54     INFO -  P1 Returning candidate: candidate:0 1 UDP 2122252543 172.17.0.4 33113 typ host
[task 2017-03-31T16:34:54.637830Z] 16:34:54     INFO -  P1 Returning candidate: candidate:1 1 TCP 2105508095 172.17.0.4 61654 typ host tcptype passive
[task 2017-03-31T16:34:54.638059Z] 16:34:54     INFO -  P1 Returning candidate: candidate:1 1 TCP 2105524479 172.17.0.4 9 typ host tcptype active
[task 2017-03-31T16:34:54.638311Z] 16:34:54  WARNING -  TEST-UNEXPECTED-FAIL | WebRtcIceGatherTest.TestGatherDNSStunServerHostnameTcp | Value of: StreamHasMatchingCandidate(0, "tcptype so")
[task 2017-03-31T16:34:54.638591Z] 16:34:54     INFO -    Actual: false
[task 2017-03-31T16:34:54.638852Z] 16:34:54     INFO -  Expected: true @ /home/worker/workspace/build/src/media/mtransport/test/ice_unittest.cpp:2295
[task 2017-03-31T16:34:54.639064Z] 16:34:54     INFO -  insert 'ice' (registry) succeeded: ice
[task 2017-03-31T16:34:54.639295Z] 16:34:54     INFO -  insert 'ice.pref' (registry) succeeded: ice.pref
[task 2017-03-31T16:34:54.639546Z] 16:34:54     INFO -  insert 'ice.pref.type' (registry) succeeded: ice.pref.type
[task 2017-03-31T16:34:54.639777Z] 16:34:54     INFO -  insert 'ice.pref.type.srv_rflx' (UCHAR) succeeded: 0x64
[task 2017-03-31T16:34:54.640004Z] 16:34:54     INFO -  insert 'ice.pref.type.peer_rflx' (UCHAR) succeeded: 0x6e
[task 2017-03-31T16:34:54.640230Z] 16:34:54     INFO -  insert 'ice.pref.type.host' (UCHAR) succeeded: 0x7e
[task 2017-03-31T16:34:54.640454Z] 16:34:54     INFO -  insert 'ice.pref.type.relayed' (UCHAR) succeeded: 0x05
[task 2017-03-31T16:34:54.640684Z] 16:34:54     INFO -  insert 'ice.pref.type.srv_rflx_tcp' (UCHAR) succeeded: 0x63
[task 2017-03-31T16:34:54.640910Z] 16:34:54     INFO -  insert 'ice.pref.type.peer_rflx_tcp' (UCHAR) succeeded: 0x6d
[task 2017-03-31T16:34:54.641135Z] 16:34:54     INFO -  insert 'ice.pref.type.host_tcp' (UCHAR) succeeded: 0x7d
[task 2017-03-31T16:34:54.641363Z] 16:34:54     INFO -  insert 'ice.pref.type.relayed_tcp' (UCHAR) succeeded: 0x00
[task 2017-03-31T16:34:54.641585Z] 16:34:54     INFO -  insert 'stun' (registry) succeeded: stun
[task 2017-03-31T16:34:54.641837Z] 16:34:54     INFO -  insert 'stun.client' (registry) succeeded: stun.client
[task 2017-03-31T16:34:54.642068Z] 16:34:54     INFO -  insert 'stun.client.maximum_transmits' (UINT4) succeeded: 7
[task 2017-03-31T16:34:54.642295Z] 16:34:54     INFO -  insert 'ice.trickle_grace_period' (UINT4) succeeded: 5000
[task 2017-03-31T16:34:54.642517Z] 16:34:54     INFO -  insert 'ice.tcp' (registry) succeeded: ice.tcp
[task 2017-03-31T16:34:54.642740Z] 16:34:54     INFO -  insert 'ice.tcp.so_sock_count' (INT4) succeeded: 3
[task 2017-03-31T16:34:54.642971Z] 16:34:54     INFO -  insert 'ice.tcp.listen_backlog' (INT4) succeeded: 10
[task 2017-03-31T16:34:54.643193Z] 16:34:54     INFO -  insert 'ice.tcp.disable' (char) succeeded: \000
[task 2017-03-31T16:34:54.643408Z] 16:34:54     INFO -  P1 Shutdown
[task 2017-03-31T16:34:55.641167Z] 16:34:55  WARNING -  TEST-UNEXPECTED-FAIL | WebRtcIceGatherTest.TestGatherDNSStunServerHostnameTcp | test completed (time: 1004ms)

Comment 1

8 months ago
What do you make of this? Seems _really_ weird to be missing the tcp so candidate. I don't think this will hurt us much at all in the real world, but it would be good to understand what is happening here.
backlog: --- → tech-debt
Rank: 35
Flags: needinfo?(drno)
Priority: -- → P3
Comment hidden (mozreview-request)

Comment 3

8 months ago
I'm not sure what could be going on here. This is definitely not affecting users, since the so_count is set to 0 in the user prefs to disable SO all together.

I guess the random port number generator her http://searchfox.org/mozilla-central/source/media/mtransport/third_party/nICEr/src/ice/ice_component.c#359 could fail to find an unused port within three attempts?!
Would be nice if we could get the nICEr error log in case of test failures.
Flags: needinfo?(drno)
(In reply to Nils Ohlmeier [:drno] from comment #3)
> I guess the random port number generator her
> http://searchfox.org/mozilla-central/source/media/mtransport/third_party/
> nICEr/src/ice/ice_component.c#359 could fail to find an unused port within
> three attempts?!
> Would be nice if we could get the nICEr error log in case of test failures.

Sure, if a thousand ports are used, it might have a non-zero chance of failing 3 times.  Also, if we implement restricted ranges for ports, as we've discussed, it may become much more likely.

Of course, that might not be the problem here, but easy fix.
Comment hidden (mozreview-request)

Updated

8 months ago
Attachment #8854146 - Flags: review?(docfaraday)

Comment 6

8 months ago
I think the only way to understand what is going on here is to have the unit tests actually log at debug level and not just at info. I'll look into adding a function to RLogConnector to increase the default log level, so that we can increase the log level for this unit test until we catch a failure with debug log level.
Comment hidden (mozreview-request)

Comment 8

8 months ago
It is strange. The interface gathering appears to succeed. And we get some candidates. And it's not a gathering failure. It acts as if so_count is zero, which should be the case for Firefox from the prefs, but not for the unit test (as the log shows it uses the default value 3 for so_count).
My guess is this is related to Michael's patch which changes how we find the interface addresses.

Comment 9

8 months ago
I think the default for the unit test is coming from here:
https://dxr.mozilla.org/mozilla-central/rev/b5d8b27a753725c1de41ffae2e338798f3b5cacd/media/mtransport/test/ice_unittest.cpp#1486

I walked through code and it doesn't appear that any of the globals setup in NrIceCtx::InitializeGlobals are used in the stun address gathering.  The only reference I can find to NR_ICE_REG_ICE_TCP_SO_SOCK_COUNT (which uses media.peerconnection.ice.tcp_so_sock_count), is here in ice_component.c:
https://dxr.mozilla.org/mozilla-central/rev/b5d8b27a753725c1de41ffae2e338798f3b5cacd/media/mtransport/third_party/nICEr/src/ice/ice_component.c#423

I don't believe that ice_component is used during during stun address discovery.
(In reply to Michael Froman [:mjf] from comment #9)
> I think the default for the unit test is coming from here:
> https://dxr.mozilla.org/mozilla-central/rev/
> b5d8b27a753725c1de41ffae2e338798f3b5cacd/media/mtransport/test/ice_unittest.
> cpp#1486

Hadn't notice that.
 
> I walked through code and it doesn't appear that any of the globals setup in
> NrIceCtx::InitializeGlobals are used in the stun address gathering.  The
> only reference I can find to NR_ICE_REG_ICE_TCP_SO_SOCK_COUNT (which uses
> media.peerconnection.ice.tcp_so_sock_count), is here in ice_component.c:
> https://dxr.mozilla.org/mozilla-central/rev/
> b5d8b27a753725c1de41ffae2e338798f3b5cacd/media/mtransport/third_party/nICEr/
> src/ice/ice_component.c#423
> 
> I don't believe that ice_component is used during during stun address
> discovery.

I just verified with a debugger that in fact it gets called and the so_sock_count properly gets set to 3.

Lets see if debug logging will tell us more.

Comment 11

6 months ago
1 failures in 814 pushes (0.001 failures/push) were associated with this bug in the last 7 days.   

Repository breakdown:
* try: 1

Platform breakdown:
* linux32: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1352538&startday=2017-06-12&endday=2017-06-18&tree=all

Comment 12

5 months ago
1 failures in 718 pushes (0.001 failures/push) were associated with this bug in the last 7 days.   

Repository breakdown:
* mozilla-esr52: 1

Platform breakdown:
* linux64: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1352538&startday=2017-06-26&endday=2017-07-02&tree=all

Comment 13

5 months ago
1 failures in 656 pushes (0.002 failures/push) were associated with this bug in the last 7 days.   

Repository breakdown:
* autoland: 1

Platform breakdown:
* osx-cross: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1352538&startday=2017-07-03&endday=2017-07-09&tree=all

Comment 14

5 months ago
1 failures in 720 pushes (0.001 failures/push) were associated with this bug in the last 7 days.   

Repository breakdown:
* autoland: 1

Platform breakdown:
* linux32: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1352538&startday=2017-07-10&endday=2017-07-16&tree=all

Comment 15

4 months ago
2 failures in 1008 pushes (0.002 failures/push) were associated with this bug in the last 7 days.   

Repository breakdown:
* try: 1
* mozilla-inbound: 1

Platform breakdown:
* linux64: 1
* linux32: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1352538&startday=2017-07-24&endday=2017-07-30&tree=all

Comment 16

4 months ago
1 failures in 888 pushes (0.001 failures/push) were associated with this bug in the last 7 days.   

Repository breakdown:
* mozilla-inbound: 1

Platform breakdown:
* linux32: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1352538&startday=2017-07-31&endday=2017-08-06&tree=all

Comment 17

4 months ago
1 failures in 901 pushes (0.001 failures/push) were associated with this bug in the last 7 days.   

Repository breakdown:
* autoland: 1

Platform breakdown:
* osx-10-10: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1352538&startday=2017-08-07&endday=2017-08-13&tree=all

Comment 18

4 months ago
1 failures in 949 pushes (0.001 failures/push) were associated with this bug in the last 7 days.   

Repository breakdown:
* autoland: 1

Platform breakdown:
* osx-10-10: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1352538&startday=2017-08-14&endday=2017-08-20&tree=all

Comment 19

3 months ago
2 failures in 924 pushes (0.002 failures/push) were associated with this bug in the last 7 days.   

Repository breakdown:
* mozilla-inbound: 2

Platform breakdown:
* linux64: 1
* linux32: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1352538&startday=2017-09-04&endday=2017-09-10&tree=all
Mass change P3->P4 to align with new Mozilla triage process.
Priority: P3 → P4

Comment 21

2 months ago
1 failures in 947 pushes (0.001 failures/push) were associated with this bug in the last 7 days.    

Repository breakdown:
* mozilla-inbound: 1

Platform breakdown:
* linux32: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1352538&startday=2017-10-09&endday=2017-10-15&tree=all
Duplicate of this bug: 1255287

Comment 23

a month ago
1 failures in 857 pushes (0.001 failures/push) were associated with this bug in the last 7 days.    

Repository breakdown:
* mozilla-central: 1

Platform breakdown:
* linux64: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1352538&startday=2017-10-30&endday=2017-11-05&tree=all
You need to log in before you can comment on or make changes to this bug.