Closed Bug 1318180 Opened 3 years ago Closed 3 years ago

Cannot createOffer after network change

Categories

(Core :: WebRTC: Networking, defect, P2)

50 Branch
Unspecified
macOS
defect

Tracking

()

RESOLVED FIXED
mozilla53
Tracking Status
firefox53 --- fixed
Blocking Flags:

People

(Reporter: arungane, Assigned: drno)

References

()

Details

(Keywords: cisco-spark, Whiteboard: [webrtc])

Attachments

(1 file)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:50.0) Gecko/20100101 Firefox/50.0
Build ID: 20161104212021

Steps to reproduce:

Made a call from party A to Party B using Firefox 50 on mac .
after the call is established i turned off my wifi for 3-5 sec and turned it on again.

Then tried to do a ice-restart as the IP address has changed 


Actual results:

Alice get an Error while creating an offer to restart ice.
 Failed to create session description: InvalidStateError: RTCPeerConnection is gone (did you enter Offline mode?)


Expected results:

Alice should be able to create an offer by setting iceRestart as true. And send the new SDP to Bob to start the DTLS negotiation
Can use this site to test : https://webrtc.github.io/samples/src/content/peerconnection/restart-ice/
Keywords: cisco-spark
OS: Unspecified → Mac OS X
Product: Firefox → Core
Whiteboard: [webrtc]
Component: Untriaged → WebRTC: Networking
Sorry but I think that is not a use case which we support (assuming the WiFi is/was your only IP connection), because turning off the WiFi puts Firefox into the offline mode and that results in all PeerConnections getting destroyed.

If you 
- start your call for example on WiFi
- then plugin Ethernet
- wait until the Ethernet has an IP address
- then turn off the WiFi
- then I would expect the ICE restart to switch your running call to the Ethernet

Byron, Michael what do you think about this one?
Flags: needinfo?(mfroman)
Flags: needinfo?(docfaraday)
Hmm. So if we just go to disconnected when this happens, content will have to wait until we're out of offline mode before it will be able to pass any signaling. But that seems fine I guess. Maybe there's some other reason we decided to tear all PCs down when we go offline?
Flags: needinfo?(docfaraday)
thanks nils and byron. I think we should have a timer to before killing peerconnection similar to chrome ,10 -30 sec .

Does the peerconnection gets destroyed only when the network is totally off ?
what would happen if there is only one wifi connected and its flaky ( turns off and on sometimes) during the call
(In reply to arun ganeshan from comment #4)
> thanks nils and byron. I think we should have a timer to before killing
> peerconnection similar to chrome ,10 -30 sec .
> 
> Does the peerconnection gets destroyed only when the network is totally off ?
> what would happen if there is only one wifi connected and its flaky ( turns
> off and on sometimes) during the call

The killing of connections was meant to handle the user selecting "Work Offline", and perhaps making connection failures noticable quickly if a router when down or you disconnected (permanently).

It wasn't meant to catch temporary network interruptions - though if the network is down long enough you might want to consider killing it.

Perhaps we should start a timer, or simply remove the "no network" close of PeerConnections entirely (and let things time out normally)
(In reply to Randell Jesup [:jesup] from comment #5)
> (In reply to arun ganeshan from comment #4)
> > thanks nils and byron. I think we should have a timer to before killing
> > peerconnection similar to chrome ,10 -30 sec .
> > 
> > Does the peerconnection gets destroyed only when the network is totally off ?
> > what would happen if there is only one wifi connected and its flaky ( turns
> > off and on sometimes) during the call
> 
> The killing of connections was meant to handle the user selecting "Work
> Offline", and perhaps making connection failures noticable quickly if a
> router when down or you disconnected (permanently).
> 
> It wasn't meant to catch temporary network interruptions - though if the
> network is down long enough you might want to consider killing it.
> 
> Perhaps we should start a timer, or simply remove the "no network" close of
> PeerConnections entirely (and let things time out normally)

Thanks Randell , In chrome when network changes / turns off they changes the ice state to "Disconnected" wait for 30 sec and then turn the ice state to failed 

https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/iceConnectionState

The last options sounds good.
Sorry to jump in late.  I agree with removing the "no network" close of PeerConnections if possible.  That would give us the option for a restart in the disconnected state.
Flags: needinfo?(mfroman)
I like the idea to let the "offline" event trigger the ICE connection state "disconnected", which we have available now since bug 929977 landed. And the ICE consent freshness should automatically take care of switching to failed after 30 seconds.
Assignee: nobody → drno
Status: UNCONFIRMED → NEW
backlog: --- → webrtc/webaudio+
Rank: 25
Ever confirmed: true
Priority: -- → P2
So with the patch in attachment #8812080 [details] the behavior is kind of interesting:
- if you choose "Work offline" from the File menu that triggers "disconnected", pretty quickly followed the ICE layer getting errors when trying to send consent freshness requests and therefore then switching to "failed", which results in network sockets getting closed. I have not tested if an ICE restart would still work once you leave the "Work offline" state.
- if I turn off the WiFi on my laptop again "disconnected" gets triggered, and when I turn the WiFi back on (I get the same IP address so sockets keep working) RTP continous to flow and the ICE connection state goes back to "connected".

I like the outcome, but thought it's worth documenting it here since it's not the obvious result.
Comment on attachment 8812080 [details]
Bug 1318180: turn network offline events into ice disconnected event.

https://reviewboard.mozilla.org/r/93984/#review94652

Lgtm. Is there a way to test this? Perhaps fire "network:offline-about-to-go-offline" ?

::: dom/media/PeerConnection.js:176
(Diff revision 1)
> +    let disconnectIce = function(pcref) {
> +      let pc = pcref.get();
> +      if (pc) {
> +        pc.changeIceConnectionState('disconnected');
> +      }
> +    };
> +
> +    let disconnectPcs = function(list, winID) {
> +      if (list.hasOwnProperty(winID)) {
> +        list[winID].forEach(disconnectIce);
> +      }
> +    };

I'm fine with inlining these if you want.
Attachment #8812080 - Flags: review?(jib) → review+
Comment on attachment 8812080 [details]
Bug 1318180: turn network offline events into ice disconnected event.

https://reviewboard.mozilla.org/r/93984/#review94652

> I'm fine with inlining these if you want.

Meant to be a comment. Didn't mean to open an issue. Up to you.
Comment on attachment 8812080 [details]
Bug 1318180: turn network offline events into ice disconnected event.

https://reviewboard.mozilla.org/r/93984/#review95966

::: dom/media/PeerConnection.js:179
(Diff revision 1)
>      };
>  
> +    let disconnectIce = function(pcref) {
> +      let pc = pcref.get();
> +      if (pc) {
> +        pc.changeIceConnectionState('disconnected');

Hmm, so this updates the connection state in JS-land, but doesn't touch the state in c++ land. What happens if the network comes back up before consent freshness detects that something is wrong? Will we ever transition back to connected?
@jib: can you please review the PeerConnection.js part again?
@bwc: another review please :-)
Flags: needinfo?(jib)
Flags: needinfo?(docfaraday)
There are more changes regarding network disconnet in the pipe in bug 1321649. But I decided to take them into a new bug, because they appear to open another can of worms right now.
See Also: → 1321649
Comment on attachment 8812080 [details]
Bug 1318180: turn network offline events into ice disconnected event.

https://reviewboard.mozilla.org/r/93984/#review97618

::: dom/media/PeerConnection.js:176
(Diff revision 4)
>          list[winID].forEach(cleanupPcRef);
>          delete list[winID];
>        }
>      };
>  
> +    let networkChangeNotification = function(list, winID, online, networkdown) {

Let's make this a member method so we don't have to pass so many args (it's confusing me which is the new state and which is the present state here).

Also, can we rename it to use an active verb?

'Notify' makes me think of notifying JS, but we mean 'notify down' here to other (c++) code, which I think creates a mental divide we don't need. How about `updateNetworkState` again?

::: dom/media/PeerConnection.js:177
(Diff revision 4)
> +      logMsg("networkChangeNotification: " + online, "PeerConnection.js", 177,
> +              Ci.nsIScriptError.warningFlag, winID);

Remove logging. Operational logging is better done in c++ which has several developer-operated logging opportunities available.

PeerConnection.js generally does no operational logging. Also, manual line numbers are not ok as they quickly go stale.

logMsg is a helper for logWarning et al, and goes to web console when something is wrong with the web dev's script. There's nothing they can do here.

::: dom/media/PeerConnection.js:183
(Diff revision 4)
> +            if (!online) {
> +              pc.changeIceConnectionState('disconnected');
> +            }
> +            pc._pc.networkStateChanged(online);

Any reason not to call c++ first here?

We tend to want to put as little code as possible after notifying content JS, since dispatchEvent is synchronous, and anything can happen when we call content JS (this is even a security issue in some browsers, though I think we're generally ok on this, but it still makes me nervous).

Therefore it seems better to ensure our internal state is updated first here.

::: dom/media/PeerConnection.js:208
(Diff revision 4)
> -        // this._list shold be empty here
> +        for (let winId in this._list) {
> +          networkChangeNotification(this._list, winId, false, this._networkdown);
> +        }
>          this._networkdown = true;
>        } else if (data == "online") {
> +        for (let winId in this._list) {
> +          networkChangeNotification(this._list, winId, true, this._networkdown);
> +        }

This is fine with me (I'm sure I do this elsewhere), though I know others disapprove of boolean args because their meaning is hard to deduce at the call site (going as far as landing ugly comments in-between args). I suppose you could avoid this by doing something like:

    updateOnlineState: function(online = true) { /* your code */ },
    updateOfflineState: function() { return updateOnlineState(false); },

Up to you.

::: dom/webidl/PeerConnectionImpl.webidl:76
(Diff revision 4)
>     * into the SDP.
>     */
>    [Throws]
>    void addIceCandidate(DOMString candidate, DOMString mid, unsigned short level);
>  
> +  void networkStateChanged(boolean online);

Would an active verb be better? updateNetworkState?

Note that you need a DOM review for this to land. Silly, I know, but I don't think they've put in an exception for this webidl file yet. Should be a rubber stamp.

::: media/mtransport/nricectx.cpp:978
(Diff revision 4)
>  
> +void NrIceCtx::NetworkStateChanged(bool online) {
> +  MOZ_MTLOG(ML_INFO, "NrIceCtx(" << name_ << "): network state changed to " <<
> +            (online ? "online" : "offline"));
> +  if (online) {
> +    /* TODO this is also were could start gathering again to check for new

s/were/where we/

Also TODOs should have a bug # ideally.

::: media/mtransport/test/ice_unittest.cpp:3402
(Diff revision 4)
> +  p2_->ChangeNetworkStateToOnline();
> +  ASSERT_TRUE(p2_->ice_connected());
> +}
> +
> +TEST_F(WebRtcIceConnectTest, TestNetworkOnlineTriggersConsent) {
> +  // lets emulate audio + video w/o rtcp-mux

Let's
Flags: needinfo?(jib)
Comment on attachment 8812080 [details]
Bug 1318180: turn network offline events into ice disconnected event.

https://reviewboard.mozilla.org/r/93984/#review97618

> This is fine with me (I'm sure I do this elsewhere), though I know others disapprove of boolean args because their meaning is hard to deduce at the call site (going as far as landing ugly comments in-between args). I suppose you could avoid this by doing something like:
> 
>     updateOnlineState: function(online = true) { /* your code */ },
>     updateOfflineState: function() { return updateOnlineState(false); },
> 
> Up to you.

I totally agree that booleans are ugly to understand. But I don't really see any good alternative here. An Enum with names would be good. But I don't think that I can pass that through the WebIDL, or?
And only taking care of the JS side of this call makes not much sense to me. If it should be fixed it would need to be improved in JS, through PeerConnectionImpl, PeerConnectionMedia all the way down to nricectx.cpp.
Comment on attachment 8812080 [details]
Bug 1318180: turn network offline events into ice disconnected event.

https://reviewboard.mozilla.org/r/93984/#review97920

::: dom/webidl/PeerConnectionImpl.webidl:76
(Diff revision 5)
> +  void updateNetworkState(boolean online);
> +

Is there some way we can subscribe to this notification in C++ (say, from PeerConnectionCtx)?

::: media/mtransport/nricectx.cpp:974
(Diff revision 5)
>    }
>  
>    return NS_OK;
>  }
>  
> +void NrIceCtx::UpdateNetworkState(bool online) {

Let's pick either "Change" or "Update" as the verb for this stuff.

::: media/mtransport/third_party/nICEr/src/ice/ice_component.c:1259
(Diff revision 5)
> +
> +    r_log(LOG_ICE,LOG_WARNING,"ICE(%s)/STREAM(%s)/COMP(%d): component disconnected",
> +          comp->ctx->label, comp->stream->label, comp->component_id);
> +    comp->disconnected = 1;
> +
> +    // a single disconnected component disconnects the stream

c-style comment

::: media/mtransport/third_party/nICEr/src/ice/ice_media_stream.c:595
(Diff revision 5)
> +      if((comp->state != NR_ICE_COMPONENT_DISABLED) &&
> +         (comp->local_component->state != NR_ICE_COMPONENT_DISABLED) &&
> +         comp->disconnected)
> +        nr_ice_component_refresh_consent_now(comp);

Brackets would make this read easier.

::: media/mtransport/third_party/nICEr/src/ice/ice_media_stream.c:610
(Diff revision 5)
> +      if((comp->state != NR_ICE_COMPONENT_DISABLED) &&
> +         (comp->local_component->state != NR_ICE_COMPONENT_DISABLED))
> +        comp->disconnected = 1;

Same here.

::: media/mtransport/third_party/nICEr/src/ice/ice_media_stream.c:624
(Diff revision 5)
> +    if (stream->ice_state != NR_ICE_MEDIA_STREAM_CHECKS_CONNECTED) {
> +      return;
> +    }
> +    stream->disconnected = disconnected;

So I guess you're trying to prevent this from causing disconnected state changes to fire before ICE has completed the first time? Because it won't work if some streams have completed, but others have not.
Attachment #8812080 - Flags: review?(docfaraday)
Flags: needinfo?(docfaraday)
Comment on attachment 8812080 [details]
Bug 1318180: turn network offline events into ice disconnected event.

https://reviewboard.mozilla.org/r/93984/#review97920

> Let's pick either "Change" or "Update" as the verb for this stuff.

Did you mean the log messages (I updated that)?
If not I hope I was using UpdateNetworkState throughout already.

> So I guess you're trying to prevent this from causing disconnected state changes to fire before ICE has completed the first time? Because it won't work if some streams have completed, but others have not.

Right the intention here is to prevent firing disconnected before we connected for the first time. Although the spec apparently allows the transition from checking to disconnected I'm not convinced that nICEr could actually handle the switch back to checking,connected or failed yet
Comment on attachment 8812080 [details]
Bug 1318180: turn network offline events into ice disconnected event.

https://reviewboard.mozilla.org/r/93984/#review101884

::: dom/media/PeerConnection.js:456
(Diff revisions 2 - 7)
>  
>    getConfiguration: function() {
>      return this._config;
>    },
>  
> -  _initCertificate: function(certificates) {
> +  _initCertificate: async function(certificates = []) {

Is this stuff from a rebase?

::: dom/media/PeerConnection.js:605
(Diff revisions 2 - 7)
>          throw new this._win.DOMException(msg + " - malformed URI: " + uriStr,
>                                           "SyntaxError");
>        }
>      };
>  
> -    rtcConfig.iceServers.forEach(server => {
> +    var stunServers = 0;

Same question here.

::: dom/media/PeerConnection.js:728
(Diff revisions 2 - 7)
>                                return this.setEH(name, h);
>                              }
>                            });
>    },
>  
> -  _addIdentityAssertion: function(sdpPromise, origin) {
> +  createOffer: function(optionsOrOnSucc, onErr, options) {

More rebase stuff?

::: dom/media/PeerConnection.js
(Diff revision 7)
> -      // Also kill them if "Work Offline" is selected - more can be created
> -      // while offline, but attempts to connect them should fail.
> -      for (let winId in this._list) {
> -        cleanupWinId(this._list, winId);
> -      }
> -      this._networkdown = true;

_networkdown shows up in a couple more places, should those go too?

::: dom/media/PeerConnection.js:1356
(Diff revision 7)
>    // This method is primarily responsible for updating iceConnectionState.
>    // This state is defined in the WebRTC specification as follows:
>    //
>    // iceConnectionState:
>    // -------------------
> -  //   new           The ICE Agent is gathering addresses and/or waiting for
> +  //   new           Any of the RTCIceTransport s are in the new state and none

"RTCIceTransport s" here and in a few other places.

::: media/webrtc/signaling/src/peerconnection/PeerConnectionCtx.cpp:92
(Diff revision 7)
>        // Make sure we're not deleted while still inside ::Observe()
> -      RefPtr<PeerConnectionCtxShutdown> kungFuDeathGrip(this);
> -      PeerConnectionCtx::gPeerConnectionCtxShutdown = nullptr;
> +      RefPtr<PeerConnectionCtxObserver> kungFuDeathGrip(this);
> +      PeerConnectionCtx::gPeerConnectionCtxObserver = nullptr;
> +    }
> +    if (strcmp(aTopic, NS_IOSERVICE_OFFLINE_STATUS_TOPIC) == 0) {
> +#if !defined(MOZILLA_EXTERNAL_LINKAGE)

I think you might be able to avoid some of this #ifdef by using nsStringGlue.h

::: media/webrtc/signaling/src/peerconnection/PeerConnectionCtx.cpp:330
(Diff revision 7)
> +PeerConnectionCtx::UpdateNetworkState(bool online) {
> +  auto ctx = GetInstance();
> +  if (ctx->mPeerConnections.empty()) {
> +    return;
> +  }
> +  for (auto p = ctx->mPeerConnections.begin();

Will for (auto pc : ctx->mPeerConnections) work here?
Attachment #8812080 - Flags: review?(docfaraday)
Comment on attachment 8812080 [details]
Bug 1318180: turn network offline events into ice disconnected event.

https://reviewboard.mozilla.org/r/93984/#review101884

> Is this stuff from a rebase?

Yes. I haven't touched this.

> Same question here.

Yep, from rebase.

> More rebase stuff?

Indeed.
Comment on attachment 8812080 [details]
Bug 1318180: turn network offline events into ice disconnected event.

https://reviewboard.mozilla.org/r/93984/#review101884

> _networkdown shows up in a couple more places, should those go too?

Turns out that we actually still need _networkdown (so I undeleted the code), as Necko does not return any errors if we try to access the network in offline mode.
Comment on attachment 8812080 [details]
Bug 1318180: turn network offline events into ice disconnected event.

https://reviewboard.mozilla.org/r/93984/#review102080
Attachment #8812080 - Flags: review?(docfaraday) → review+
Pushed by drno@ohlmeier.org:
https://hg.mozilla.org/integration/autoland/rev/a20283c2bbea
turn network offline events into ice disconnected event. r=bwc,jib
https://hg.mozilla.org/mozilla-central/rev/a20283c2bbea
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
Duplicate of this bug: 1276559
You need to log in before you can comment on or make changes to this bug.