Closed Bug 1276805 Opened 4 years ago Closed 3 years ago

Intermittent test_peerConnection_basicAudioNATSrflx.html | PeerConnectionWrapper (pcLocal): legal ICE state transition from connected to failed

Categories

(Core :: WebRTC, defect, P2)

x86
Linux
defect

Tracking


RESOLVED FIXED
mozilla50
Tracking Status
firefox49 --- fixed
firefox50 --- fixed
Blocking Flags:
backlog tech-debt

People

(Reporter: philor, Assigned: bwc)

Details

(Keywords: intermittent-failure)

Attachments

(1 file)

Looks like consent refreshes are timing out. Looking closer.
Hmm. Seeing signs that consent freshness checks aren't working at all for the NAT tests.
This is _probably_ test-only, because if consent freshness were completely broken on all connections using srflx or relay candidates, we would have noticed.
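[For context, the failure mode reported in the test is the RFC 7675 consent-freshness rule: the browser keeps sending STUN binding requests on the nominated pair, and if no successful response arrives for 30 seconds, consent expires and the component moves to failed. Below is a minimal illustrative sketch of that expiry check, not nICEr's implementation; the class and method names are assumptions.]

```cpp
// Illustrative only: minimal sketch of the RFC 7675 consent-freshness rule
// behind the "connected to failed" transition seen in this test.
#include <chrono>
#include <iostream>

using Clock = std::chrono::steady_clock;

class ConsentChecker {
 public:
  // RFC 7675: consent expires 30 seconds after the last successful
  // STUN binding response on the candidate pair.
  static constexpr auto kConsentTimeout = std::chrono::seconds(30);

  void OnConsentResponse() { last_response_ = Clock::now(); }

  // Polled periodically; once this returns true the pair (and with it the
  // ICE component) must transition to failed.
  bool ConsentExpired() const {
    return Clock::now() - last_response_ > kConsentTimeout;
  }

 private:
  Clock::time_point last_response_ = Clock::now();
};

int main() {
  ConsentChecker checker;
  std::cout << "expired: " << std::boolalpha << checker.ConsentExpired()
            << std::endl;  // false right after a (simulated) response
}
```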
backlog: --- → tech-debt
Rank: 25
Priority: -- → P2
Byron: this is starting to spike (perhaps due to performance, VM class changes, or load). Can we extend or disable consent freshness (via a pref?) in automation until it works?
Flags: needinfo?(docfaraday)
@jesup: there is no option to turn off consent freshness.
Flags: needinfo?(docfaraday)
Found this in the raw log of the first failed test run from comment #1:

10:41:12     INFO -  (generic/INFO) TestNrSocket IP4:10.134.157.154:37554/UDP received from IP4:10.134.157.154:40193/UDP via IP4:10.134.157.154:40561/UDP
10:41:12     INFO -  (ice/NOTICE) ICE(PC:1464630018674714 (id=2147483761 url=http://mochi.test:8888/tests/dom/media/tests/mochitest/test_peerConnection_basicAudioNATSrf): Message does not correspond to any registered stun ctx

Looks like the NAT simulator lets something through that does not match any registered STUN ctx?! Is it possible that port assignments change in the NAT simulator over time? Or what else could explain this?
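[For readers unfamiliar with that notice: STUN responses are matched back to outstanding client transactions by their 96-bit transaction ID, and a packet that matches no currently registered transaction is dropped with a message like the one quoted. The sketch below only illustrates that demux step; the types and lookup are assumptions for illustration, not nICEr's actual data structures.]

```cpp
// Illustrative only: rough sketch of demuxing incoming STUN messages against
// registered client transactions.
#include <array>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <utility>

using TransactionId = std::array<uint8_t, 12>;  // 96-bit STUN transaction id

struct StunClientCtx {
  std::string label;  // e.g. "consent refresh on the nominated pair"
};

class StunDemux {
 public:
  void Register(const TransactionId& id, StunClientCtx ctx) {
    ctxs_[id] = std::move(ctx);
  }

  // If the response's transaction id was never registered (or the ctx was
  // already cancelled or timed out), the lookup fails and the stack logs a
  // notice instead of refreshing consent.
  std::optional<StunClientCtx> Match(const TransactionId& id) const {
    auto it = ctxs_.find(id);
    if (it == ctxs_.end()) {
      return std::nullopt;  // "Message does not correspond to any registered stun ctx"
    }
    return it->second;
  }

 private:
  std::map<TransactionId, StunClientCtx> ctxs_;
};

int main() {
  StunDemux demux;
  TransactionId known{};  // all-zero id, registered below
  demux.Register(known, {"consent refresh on the nominated pair"});

  TransactionId unknown{};
  unknown[0] = 0xff;  // a response we never sent a request for
  return demux.Match(unknown) ? 1 : 0;  // 0: unmatched, as in the log above
}
```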
Flags: needinfo?(docfaraday)
Not sure. I suspect it is just an interaction between the NAT simulator and nICEr, but it could be an intermittent problem with real NATs too. I should be able to investigate today.
Flags: needinfo?(docfaraday)
It seems that this only happens on Linux e10s, and mostly on debug builds. Might just be a performance problem?
OS: Unspecified → Linux
Hardware: Unspecified → x86
Assignee: nobody → docfaraday
I'd be willing to bet that we are simply bogging down these VMs with logging to the point that consent freshness times out. When the NAT simulator is active, we get a bunch more logging at INFO. Maybe if we move that to DEBUG we'll be fine.
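[The attached patch is the actual change; the sketch below only illustrates the idea under assumed names: per-packet TestNrSocket messages emitted at INFO are captured by the default logging on these runs, while demoting them to DEBUG keeps them out of the log and off the critical path, so consent-refresh timers are no longer starved by logging overhead.]

```cpp
// Illustrative only: shows the effect of demoting per-packet NAT-simulator
// logging below the default threshold. Level values and the filter are
// assumptions, not the r_log implementation.
#include <cstdio>

enum Level { LOG_ERR = 3, LOG_NOTICE = 5, LOG_INFO = 6, LOG_DEBUG = 7 };

// Hypothetical threshold: messages above it are dropped. On the affected
// VMs the default captures INFO, so per-packet messages at INFO are costly.
static const Level kThreshold = LOG_INFO;

static void log_at(Level level, const char* msg) {
  if (level <= kThreshold) {
    std::printf("%s\n", msg);
  }
}

int main() {
  // Before: every forwarded packet logged at INFO -> emitted on every run.
  log_at(LOG_INFO, "TestNrSocket ... received from ... via ...");
  // After: same message at DEBUG -> suppressed unless debug logging is on.
  log_at(LOG_DEBUG, "TestNrSocket ... received from ... via ...");
}
```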
Attachment #8762809 - Flags: review?(drno) → review+
https://hg.mozilla.org/mozilla-central/rev/5a88d105f01b
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla50