Open Bug 1572143 Opened 5 years ago Updated 2 years ago

Cannot see remote users in appear.in calls and see multiple 'Your network seems to be blocking the connection' warnings

Categories

(Core :: WebRTC: Networking, defect, P2)

Tracking

()

Tracking Status
firefox69 --- unaffected
firefox70 --- disabled
firefox71 --- fix-optional

People

(Reporter: bryce, Unassigned)

References

(Regression)

Details

(Keywords: regression)

Attachments

(5 files, 1 obsolete file)

When connecting to appear.in meetings I cannot see remote connections. Instead I see warning spam from appear.in: "Your network seems to be blocking the connection" (see attached screenshot).

  • For versions suffering from the issue, I also see the Windows Defender Firewall prompt -- I do not get this prompt on unaffected versions. However, the issue persists if I allow access to Firefox via the prompt. The issue also happens if I disable the Firewall entirely.
  • The issue usually resolves itself after ~10 seconds, and I can then see the remote user.
  • Even after I can see/hear the remote user, their audio quality is often reduced, with pops/clipping.

Mozregression run (trimmed) finds:

2019-08-07T09:10:25: INFO : Narrowed inbound regression window from [155d964c, 8ec55eb4] (3 builds) to [6f82e91f, 8ec55eb4] (2 builds) (~1 steps left)
2019-08-07T09:10:25: DEBUG : Starting merge handling...
2019-08-07T09:10:25: DEBUG : Using url: https://hg.mozilla.org/integration/autoland/json-pushes?changeset=8ec55eb4aed6d7585abef387e952da42d9f5f422&full=1
2019-08-07T09:10:26: DEBUG : Found commit message:
Bug 1555792: Disable the socket process if e10s is disabled. r=kershaw

Differential Revision: https://phabricator.services.mozilla.com/D37506

2019-08-07T09:10:26: DEBUG : Did not find a branch, checking all integration branches
2019-08-07T09:10:26: INFO : The bisection is done.
2019-08-07T09:10:26: INFO : Stopped

Will attach my about:support shortly.

:bwc, any idea on how bug 1555792 could regress this? Anything I can do here to help provide more info for debugging?

Flags: needinfo?(docfaraday)

Can you attach an about:webrtc? I wonder what kind of candidate pair we end up using.

Flags: needinfo?(docfaraday)
Attached file aboutWebrtc.html (obsolete) —

about:webrtc from a call I just placed reproducing the error.

The problem seems to be that DNS resolution is broken. This is likely related to bug 1569196. Can you check on the latest nightly?

Attached file aboutWebrtc.html

Still repros in latest nightly -- manually checked the sourceStamp in platform.ini to confirm the changes from bug 1569196 are in the build I just tested. Attaching new about:webrtc from latest attempt.

Attachment #9083744 - Attachment is obsolete: true

Regression => Fix in the current release cycle => P1

Priority: -- → P1

Yeah, the Windows firewall is definitely getting in the way here. It may be configured to allow the Firefox parent process to establish network connections, but not the socket process. appear.in also seems to be getting caught in a very tight renegotiation loop; I see 75 renegotiations in comment 6. How long did this test run?

Flags: needinfo?(fippo)
Flags: needinfo?(bvandyk)

@fippo I should also point out that if you get in a tight renegotiation loop, causing lots of STUN packets very rapidly, you could trip the circuit breaker that is intended to mitigate the STUN hammer attack.

Adding a video of me reproing to show that the issue appears if I preemptively disable the Firewall and use a fresh Firefox + profile from mozregression.

Clarifying details of recording:

  • The other connection in the appear.in room is also me from another computer on an unaffected browser.
  • If the Firewall is on then the same thing happens, but there is an additional prompt to allow Firefox through the Firewall. After okaying that, the result is the same.
  • I will attach the about:webrtc log saved from exactly after the recording following this.
Flags: needinfo?(bvandyk)

As the socket process is only turned on in Nightly (and held there), I don't think this needs to be fixed in this cycle. But it is exactly the kind of problem we want to get reports about and fix before we can consider letting the socket process ride the trains.

BTW I was able to repro this in a three way call. Interestingly it did not happen when the first person joined, but only when the second person joined.

Priority: P1 → P2
Flags: needinfo?(fippo) → needinfo?(philipp.hancke)

bwc: I assume you ni'd me because you wanted me to check if our code does anything silly. The answer is a clear "maybe" ;-)

It hasn't changed since 2018 and basically does the "when you go to disconnected, wait a bit (2s), check if you are still disconnected, then try an ice restart". It doesn't really protect against the case where an ice restart is already attempted, so if you switch disconnected->connected->disconnected very often that might blow up.
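
For illustration only, the "wait 2s, re-check, then restart" logic and the missing guard could look roughly like the sketch below. This is not appear.in's code: `IceRestartGuard`, `IceConnectionLike`, `Scheduler`, and the injected `restart` callback are hypothetical names, and the 2000 ms default is simply the 2s figure quoted above. The scheduler is injectable so the behaviour can be exercised without real timers.

```typescript
// Minimal shape of the RTCPeerConnection state this sketch reads (assumption:
// a real RTCPeerConnection exposes iceConnectionState as a string).
interface IceConnectionLike {
  iceConnectionState: string;
}

// Injectable timer so the logic can be driven without real time passing.
type Scheduler = (fn: () => void, delayMs: number) => void;

// Hypothetical helper: schedules at most one re-check at a time and refuses
// to start a second ICE restart while one is still outstanding.
class IceRestartGuard {
  private pc: IceConnectionLike;
  private restart: () => void;
  private schedule: Scheduler;
  private graceMs: number;
  private checkPending = false;
  private restartInFlight = false;

  constructor(
    pc: IceConnectionLike,
    restart: () => void, // e.g. triggers an ICE restart plus renegotiation
    schedule: Scheduler = (fn, ms) => { setTimeout(fn, ms); },
    graceMs = 2000,
  ) {
    this.pc = pc;
    this.restart = restart;
    this.schedule = schedule;
    this.graceMs = graceMs;
  }

  // Call this from the iceconnectionstatechange handler.
  onStateChange(): void {
    const state = this.pc.iceConnectionState;
    if (state === "connected" || state === "completed") {
      // Connectivity is back, so a later disconnect may restart again.
      this.restartInFlight = false;
      return;
    }
    if (state !== "disconnected" || this.checkPending || this.restartInFlight) {
      return; // nothing to do, or a check/restart is already outstanding
    }
    this.checkPending = true;
    this.schedule(() => {
      this.checkPending = false;
      const stillDown = this.pc.iceConnectionState === "disconnected";
      if (stillDown && !this.restartInFlight) {
        this.restartInFlight = true;
        this.restart();
      }
    }, this.graceMs);
  }
}
```

With this shape, rapid disconnected->connected->disconnected flapping results in at most one restart attempt per grace period rather than one per transition.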

If this still reproduces these days (note that the UI popup is gone now) I can get a dump and we can check how often iceconnectionstatechange fires.
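
As a rough sketch of how the frequency of `iceconnectionstatechange` could be measured from the page, something like the following would do. `countIceStateChanges`, `StateChangeSource`, and `TransitionLog` are hypothetical names for this example; the interface is only meant to mirror the `iceConnectionState` property and `addEventListener` method of a peer connection.

```typescript
// Assumed minimal shape of a peer connection for this sketch.
interface StateChangeSource {
  iceConnectionState: string;
  addEventListener(type: "iceconnectionstatechange", listener: () => void): void;
}

// Running tally of how often the event fired, overall and per resulting state.
interface TransitionLog {
  total: number;
  perState: Record<string, number>;
}

// Hypothetical helper: attaches a listener and returns a live tally object.
function countIceStateChanges(pc: StateChangeSource): TransitionLog {
  const log: TransitionLog = { total: 0, perState: {} };
  pc.addEventListener("iceconnectionstatechange", () => {
    const state = pc.iceConnectionState;
    log.total += 1;
    log.perState[state] = (log.perState[state] || 0) + 1;
  });
  return log;
}
```

Inspecting the returned object after a call (for example `log.perState["disconnected"]`) would show whether the connection is flapping between states.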

Flags: needinfo?(philipp.hancke)
Has Regression Range: --- → yes
Severity: normal → S3