Make it possible to check if a Window has a WebRTC connection

RESOLVED FIXED in Firefox 57

Status: RESOLVED FIXED (enhancement, P1, normal)

People: (Reporter: farre, Assigned: hsivonen)

Tracking: unspecified → mozilla57 (firefox57 fixed)

Attachments: (1 attachment, 1 obsolete attachment)

Blocks: 1377766
Assignee: nobody → afarre
Priority: -- → P1
Looks like we need to maintain a peer connection count on the inner window object.
Assignee: afarre → hsivonen
Status: NEW → ASSIGNED
Attachment #8893773 - Attachment is obsolete: true
IPeerConnectionManager defines an RTCPeerConnection as "active" if it exists (has been created and hasn't been garbage-collected yet). My current patch (rev 3 on MozReview) narrows that definition a bit by counting the connection as inactive once it is closed, instead of waiting for GC. (It's not quite clear to me whether there is a mechanism that ensures the connection gets closed if the JS program just drops its references to the RTCPeerConnection object without explicitly calling close().)

This appears to have the undesirable effect that a connection remains active for throttling purposes even if it has failed for practical purposes (e.g. the peer just went away). That is, arguably, the connection is "active" only if its iceConnectionState is neither "failed" nor "closed", but the current patch is unaware of "failed".

My main problem, though, is that I don't know the reason *why* an "active" RTCPeerConnection should inhibit throttling, so I don't know what definition of "active" would best fit the rationale of an "active" connection inhibiting throttling. 

What's the rationale for an "active" RTCPeerConnection needing to inhibit throttling?
Flags: needinfo?(overholt)
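The narrower definition discussed in the comment above can be sketched as an illustrative JS predicate (the helper name isActiveForThrottling is hypothetical, not an actual Gecko API):

```javascript
// Illustrative only: models the proposed "active" definition in which a
// connection stops inhibiting throttling once it is "failed" or "closed".
function isActiveForThrottling(pc) {
  return pc.iceConnectionState !== "failed" &&
         pc.iceConnectionState !== "closed";
}

// A stub standing in for an RTCPeerConnection, for demonstration purposes.
const stubPc = { iceConnectionState: "connected" };
console.log(isActiveForThrottling(stubPc)); // true
stubPc.iceConnectionState = "failed";
console.log(isActiveForThrottling(stubPc)); // false
```

Note this is exactly the definition whose rationale is being questioned: if "failed" is not terminal, the predicate would be too eager.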
Looks like we'll probably want to treat "failed" as "not active", but then ICE restart via pc.createOffer({iceRestart: true}) needs to be taken into account.

The case of the RTCPeerConnection getting GCed before either getting to "failed" or "closed" needs solving, too.
Flags: needinfo?(overholt)
I was going to suggest asking :jesup about these things.
What I've learned so far:

1) Currently, there exists IPeerConnectionManager (implemented in JS), which can be asked if a given inner window has any not-yet-GCed RTCPeerConnections. This is based on weak references.

2) Defining "active" for the purpose of back/forward cache and throttling as "an RTCPeerConnection exists" is rather unsatisfactory, because it means that e.g. https://www.sharedrop.io/ stays active forever after it has found a peer once, even if the peer closes its window or otherwise just disappears. Therefore, it's probably desirable to treat the "closed" and "failed" states as "not active".

3) The current mechanism that requires a call into JS to query for the existence of active connections doesn't work for reasons farre documents in https://searchfox.org/mozilla-central/source/dom/base/TimeoutManager.cpp#1248.

4) To avoid having to call into JS when we want to know if there are "active" RTCPeerConnections, the obvious remedy is to maintain the state "has 'active' RTCPeerConnections" on the inner window itself and have the implementation of RTCPeerConnection update that state as appropriate. This way, the state can be read quickly in a C++-only manner.

5) I can't find a guarantee that an RTCPeerConnection would transition to the "closed" state prior to being garbage-collected if the JS on the page just drops a reference to a non-closed RTCPeerConnection. Therefore, there needs to be some mechanism to update the state on the inner window around the time an RTCPeerConnection gets GCed.

6) According to bz on dev-platform, we don't have finalizers for JS-based XPCOM objects, so I conclude that the case of an RTCPeerConnection transitioning to "not active" due to getting GCed should be handled by the destructor of a C++ object that gets deleted around the time the RTCPeerConnection gets GCed.

7) The C++ object should, therefore, maintain knowledge of whether it is responsible for telling the inner windows that it's no longer active.

8) It *appears* that each JS-based RTCPeerConnection instance is paired with a C++-based PeerConnectionImpl instance that actually connects to the underlying connection machinery. Therefore, it seems that the PeerConnectionImpl instance should have the role of the C++ object contemplated in points 6 and 7.

9) It appears that the connection state changes on PeerConnectionImpl happen off-the-main-thread and a runnable is posted to the main thread to update the state on the JS-based RTCPeerConnection object.

Unresolved:

 * PeerConnectionImpl dispatches to its mThread for doing main-thread stuff instead of using a dispatch method that dispatches to the main thread. It's not clear to me what the significance of this is.

 * If the JS-based RTCPeerConnection had the responsibility of watching for transitions into the {"closed", "failed"} set of states in order to notify the inner window, it would have to sync back to PeerConnectionImpl the info on whether, if the PeerConnectionImpl were deleted now, its destructor would be responsible for notifying the inner window.

 * OTOH, if the notifying of the inner window lived completely in C++ in PeerConnectionImpl, it would be necessary to do one of:
   - Make the status on the inner window an atomic and update it from a non-main thread.
   - Add another set of runnables (another on top of the ones mentioned in point 9 above) for effecting the notification of the inner window from the main thread.
   - Make the runnables mentioned in point 9 carry extra info about the need to notify the inner window.

I'm trying to figure out which of these I should do. Using atomics without runnables would be less code, but I'm not sure whether that would be problematic in some other way.
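The per-window counting scheme from points 4-8 can be modeled in plain JS (names like addPeerConnection/removePeerConnection mirror the inner-window methods being contemplated; this is a sketch of the idea, not the actual Gecko code):

```javascript
// Sketch of the bookkeeping described above: the window keeps a simple
// count, and each connection remembers whether it still "owns" a count,
// so that close() followed by GC-time teardown decrements only once.
class InnerWindowModel {
  constructor() { this.peerConnectionCount = 0; }
  addPeerConnection() { this.peerConnectionCount++; }
  removePeerConnection() { this.peerConnectionCount--; }
  hasActivePeerConnections() { return this.peerConnectionCount > 0; }
}

class PeerConnectionModel {
  constructor(window) {
    this.window = window;
    this.activeOnWindow = true; // models the "responsible for notifying" flag from point 7
    window.addPeerConnection();
  }
  // Called on the connection reaching a terminal state.
  close() { this.deactivate(); }
  // Called from the destructor-equivalent (point 6) if the object is torn
  // down without ever reaching a terminal state.
  finalize() { this.deactivate(); }
  deactivate() {
    if (this.activeOnWindow) {
      this.activeOnWindow = false;
      this.window.removePeerConnection();
    }
  }
}

const win = new InnerWindowModel();
const pc = new PeerConnectionModel(win);
console.log(win.hasActivePeerConnections()); // true
pc.close();
pc.finalize(); // idempotent: the count is not decremented twice
console.log(win.hasActivePeerConnections()); // false
```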
jesup, do you have guidance on what to do about the issues mentioned in the previous comment?
Flags: needinfo?(rjesup)
" PeerConnectionImpl dispatches to its mThread for doing main-thread stuff instead of using a dispatch method that dispatches to the main thread. It's not clear to me what the significance of this is."

The answer to this question is that we have (or at least had) tests which didn't really run the main Gecko event loop and so needed to be provided with a thread that "main" things could run on.
(from IRC):
mThread goes back to the very beginning of the PeerConnectionImpl code, or close.  Had to do archaeology on the mothballed copy of alder that it was developed on, but the mThread var was added by Anant in July of 2012, so that they could pass nullptr to Initialize() for some fakemediastream tests in signaling_unittests.cpp (and also since at that time, the gtest tests might have been running with a non-MainThread mThread).  Nowadays, signaling_unittests.cpp calls NS_GetMainThread() and passes that in, so I think all support for passing in null threads has been removed (and even asserted against), though in theory one could pass a non-MainThread thread in to use for all the MainThread dispatches -- but likely this would break a ton.

Since those tests now just pass NS_GetMainThread() into Initialize(), I think we can remove the entire mThread machinery (it may in theory have very small perf wins, but that's OK).

Also note that there's a CheckThread/CheckThreadInt() call that checks if we're running on mThread; that would have to become a "check we're running on a virtual MainThread" check.
Flags: needinfo?(rjesup)
> 8) It *appears* that each JS-based RTCPeerConnection instance is paired with a C++-based PeerConnectionImpl instance that actually connects to the underlying connection machinery. Therefore, it seems that the PeerConnectionImpl instance should have the role of the C++ object contemplated in points 6 and 7.

Correct

jib: thoughts on atomics vs runnables/etc?  (jib implemented most of the PeerConnection.js code)
Flags: needinfo?(jib)
Please do not remove the mThread. I would like to retain the ability to sometime break out PC from Gecko, and while many other things would be required, this is one of them, so removing it would be a step backward.
(In reply to Randell Jesup [:jesup] from comment #13)
> jib: thoughts on atomics vs runnables/etc?

I would piggyback on the PCObserverStateType::IceConnectionState runnable [1], by changing it to run a C++ method instead - e.g. PeerConnectionImpl::IceConnectionStateChange_m - on the main thread that does both your thing and calls PeerConnectionObserver::OnStateChange synchronously. That way it's all main-thread state.

Note that "failed" is not necessarily terminal.

[1] https://dxr.mozilla.org/mozilla-central/rev/5322c03f4c8587fe526172d3f87160031faa6d75/media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.cpp#3434

PS: I believe PeerConnectionImpl *should* GC whenever RTCPeerConnection is GC'ed, at least it did when I tried it in a simple test. We use a weakptr to the PeerConnectionObserver to avoid a cycle, since PeerConnectionImpl does not participate in cycle-collection (We had to pick between NS_DECL_THREADSAFE_ISUPPORTS and NS_DECL_CYCLE_COLLECTING_ISUPPORTS).
Flags: needinfo?(jib)
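The piggyback approach suggested above can be sketched as a single main-thread handler that updates the window-side state before notifying the observer (method and field names here are illustrative, not the real PeerConnectionImpl API):

```javascript
// Models running one main-thread task that (a) updates the inner window's
// active-connection count and (b) synchronously notifies the JS observer,
// so all the state stays main-thread-only with no atomics needed.
const mainThreadQueue = [];
function dispatchToMainThread(task) { mainThreadQueue.push(task); }
function drainMainThread() { while (mainThreadQueue.length) mainThreadQueue.shift()(); }

function iceConnectionStateChange_m(impl, newState) {
  impl.iceConnectionState = newState;
  if (newState === "closed" && impl.activeOnWindow) {
    impl.activeOnWindow = false;
    impl.window.peerConnectionCount--;   // update the window's state...
  }
  impl.observer.onStateChange(newState); // ...then notify the JS observer synchronously
}

// Simulate the off-main-thread ICE code posting the state change:
const impl = {
  window: { peerConnectionCount: 1 },
  activeOnWindow: true,
  iceConnectionState: "connected",
  observer: { onStateChange(s) { this.last = s; } },
};
dispatchToMainThread(() => iceConnectionStateChange_m(impl, "closed"));
drainMainThread();
console.log(impl.window.peerConnectionCount, impl.observer.last); // 0 closed
```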
jesup, ekr, jib: Thanks for the guidance.

(In reply to Jan-Ivar Bruaroey [:jib] (needinfo? me) from comment #15)
> Note that "failed" is not necessarily terminal.

Is there a need for the RTCPeerConnection to process events from the network while in the "failed" state?
Flags: needinfo?(jib)
(In reply to Henri Sivonen (:hsivonen) from comment #9)
>  * OTOH, if the notifying of the inner window lived completely in C++ in
> PeerConnectionImpl, it would be necessary to do one of:
>    - Make the status on the inner window an atomic and update it from a
> non-main thread.
>    - Add another set of runnables (another on top of the ones mentioned in
> point 9 above) for effecting the notification of the inner window from the
> main thread.
>    - Make the runnables mentioned in point 9 carry extra info about the need
> to notify the inner window.

Oops. My assumption that I'd have to start with a non-main thread seems wrong, since PeerConnectionImpl::IceConnectionStateChange runs on the main thread. I was fooled by it posting a runnable.
So this turned out to be much simpler than I thought in terms of threading, since I previously had a misunderstanding of PeerConnectionImpl threading.

It's still a bit unclear to me if "failed" should count as "not active" for throttling and back/forward cache purposes, but it looks like connections on sharedrop.io don't transition (at least not quickly) to "failed" even when the peer disappears to the point of no longer showing up in the sharedrop.io UI.

It could be that my pursuit of a solution that would make sharedrop.io count as "not active" after its peer goes away is misguided.
The patch assumes that
 1) There is no need to process network events in the "failed" state.
 2) It's misguided to pursue a definition of "active" that makes sharedrop.io go to "not active" when its peer goes away.

Tests are bug 1378402.
(In reply to Henri Sivonen (:hsivonen) from comment #16)
> Is there a need for the RTCPeerConnection to process events from the network
> while in the "failed" state?

If you include an incoming offer on the signaling-channel, then yes, I believe so.

From https://tools.ietf.org/html/rfc5245#section-2.6

  "Once ICE is concluded, it can be restarted at any time for one or all
   of the media streams by either agent.  This is done by sending an
   updated offer indicating a restart."

If the other agent has reset ICE then an incoming offer is effectively new work that takes us out of "failed", I believe.
Flags: needinfo?(jib)
When the partner disappears, a peerconnection (a connected one) will *eventually* transition to the failed state on consent/keepalive failure.  This will not be for a while, however (at least 30 seconds, perhaps more).
Flags: needinfo?(drno)
(In reply to Eric Rescorla (:ekr) from comment #14)
> Please do not remove the mThread. I would like to retain the ability to
> sometime break out PC from Gecko, and while many other things would be
> required, this is one of them, so removing it would be a step backward.

One issue with leaving mThread is that it forces sync points into the virtual MainThread pool of threads (really coroutines).  For DataChannelConnections, the pseudo-thread to use for "MainThread" is passed in from PeerConnectionImpl via 'mWindow->EventTargetFor(TaskCategory::Other)', so if we keep mThread, we should pass something like that in instead of GetMainThread().
(In reply to Jan-Ivar Bruaroey [:jib] (needinfo? me) from comment #23)
> If the other agent has reset ICE then an incoming offer is effectively new
> work that takes us out of "failed", I believe.

OK. I didn't realize that ICE reset didn't need to be initiated by the peer itself.

The latest patch doesn't try to count "failed" as active. In the new patch, only the state transitioning to "closed" or the connection getting GCed count as no longer active.
(In reply to Henri Sivonen (:hsivonen) from comment #27)
> OK. I didn't realize that ICE reset didn't need to be initiated by the peer
> itself.

Well, "by itself" depends on what you mean. Technically the signaling-channel is outside the domain of WebRTC and RTCPeerConnection, in that the JS is expected to solve this with regular web tech. 99% of the time, the content script will use a websocket connection as its signaling channel. This might matter to you, if open websockets already mark the page as "active".

For example, in a situation where the other peer initiates an ICE restart, the other side does something like this:

  await pc.setLocalDescription(await pc.createOffer({iceRestart: true}));
  webSocket.send({sdp: pc.localDescription});

which restarts ice locally on that end, and sends a new offer on the websocket signaling channel (shortly to be followed by ICE candidates on the same signaling channel), which are received by the other end e.g. like this:

  webSocket.onmessage = async ({data: {sdp, candidate}}) => {
    if (sdp) {
      await pc.setRemoteDescription(sdp); // offer w/ICE restart in this case
      ...
    } else if (candidate) await pc.addIceCandidate(candidate);
  }

But users don't HAVE to use websockets. 
 - They could short-circuit a local-loop demo with no outside influence https://jsfiddle.net/yxbLvjm6/
 - They could use localStorage for a same-browser 2-tab demo https://jsfiddle.net/jib1/80xcgwmc/
 - They could use a data channel from same (or other if "failed") connection: http://jsfiddle.net/jib1/7vxzbybp/
 - They could use clipboard cut'n'paste to another machine somehow: https://jsfiddle.net/jib1/7vv2vxtt/
 - They could rely on bluetooth, SMS, QR codes, or something else.

So technically, "failed" might mean "not active" from the viewpoint of the RTCPeerConnection, but I don't know if we care about these esoteric cases.
(In reply to Jan-Ivar Bruaroey [:jib] (needinfo? me) from comment #28)
> (In reply to Henri Sivonen (:hsivonen) from comment #27)
> > OK. I didn't realize that ICE reset didn't need to be initiated by the peer
> > itself.
> 
> Well, "by itself" depends on what you mean.

I meant the PeerConnection itself causing the WebRTC machinery to send/receive packets.

> Technically the
> signaling-channel is outside the domain of WebRTC and RTCPeerConnection, in
> that the JS is expected to solve this with regular web tech. 99% of the
> time, the content script will use a websocket connection as its signaling
> channel. This might matter to you, if open websockets already mark the page
> as "active".

Indeed, a WebSocket connection makes the page "active" for the purpose of throttling. Curiously, a WebSocket connection doesn't appear to make the page ineligible to be put in the b/f cache, but that might itself be a bug (or there might be a less obvious mechanism that I'm not seeing).

> So technically, "failed" might mean "not active" from the viewpoint of the
> RTCPeerConnection, but I don't know if we care about these esoteric cases.

I don't know if we should care.

Comment 30 (mozreview-review)
Comment on attachment 8893774 [details]
Bug 1378123 - Make inner window track whether there is an active PeerConnection.

https://reviewboard.mozilla.org/r/164874/#review174558

Missing mActiveOnWindow = false in one key place, looks good otherwise. Any way to test this?

::: media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.cpp:3410
(Diff revision 7)
> +  // Uncount this connection as active on the inner window upon close.
> +  if (mWindow && mActiveOnWindow && mIceConnectionState == PCImplIceConnectionState::Closed) {
> +    mWindow->RemovePeerConnection();
> +    mActiveOnWindow = false;
> +  }

Sorry, I gave incomplete advice earlier!

Separate from this "closed" ICE state [1], which we unfortunately do not fire yet (bug 984580), we need to also do this for the "closed" signaling state [2] a.k.a. PCImplSignalingState::SignalingClosed, which is separate, to catch when JS calls pc.close(), which I believe we want to catch.

[1] http://w3c.github.io/webrtc-pc/#dom-rtciceconnectionstate
[2] FYI we're behind the spec (bug 1265827) which has now moved the "closed" signalingState to PeerConnectionState http://w3c.github.io/webrtc-pc/#dom-rtcpeerconnectionstate
Attachment #8893774 - Flags: review?(jib) → review-
Comment 32 (mozreview-review-reply)
Comment on attachment 8893774 [details]
Bug 1378123 - Make inner window track whether there is an active PeerConnection.

https://reviewboard.mozilla.org/r/164874/#review174558

I'm hoping that farre would add tests (and show generally how throttling should be tested) in bug 1378402.

> Sorry, I gave incomplete advice earlier!
> 
> Separate from this "closed" ICE state [1], which we unfortunately do not fire yet (bug 984580), we need to also do this for the "closed" signaling state [2] a.k.a. PCImplSignalingState::SignalingClosed, which is separate, to catch when JS calls pc.close(), which I believe we want to catch.
> 
> [1] http://w3c.github.io/webrtc-pc/#dom-rtciceconnectionstate
> [2] FYI we're behind the spec (bug 1265827) which has now moved the "closed" signalingState to PeerConnectionState http://w3c.github.io/webrtc-pc/#dom-rtcpeerconnectionstate

I added code with the assumption that the way things work is that the connection becomes permanently inactive when either the signaling state or the ICE state becomes closed. Is that correct?
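The assumption stated above amounts to treating either closed state as terminal; an illustrative predicate (not the actual C++ check, and state names are the WebIDL-level strings rather than the PCImpl enum values):

```javascript
// Models the post-review check: the connection stops counting as active when
// EITHER the ICE connection state or the signaling state reaches "closed",
// which catches both the (not-yet-fired) ICE "closed" state and pc.close().
function shouldDeactivate(iceConnectionState, signalingState) {
  return iceConnectionState === "closed" || signalingState === "closed";
}

console.log(shouldDeactivate("connected", "closed")); // true  (the pc.close() case)
console.log(shouldDeactivate("closed", "stable"));    // true
console.log(shouldDeactivate("connected", "stable")); // false
```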

Comment 33 (mozreview-review)
Comment on attachment 8893774 [details]
Bug 1378123 - Make inner window track whether there is an active PeerConnection.

https://reviewboard.mozilla.org/r/164874/#review175322
Attachment #8893774 - Flags: review?(jib) → review+

Comment 35
Pushed by hsivonen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/434a7c807ffe
Make inner window track whether there is an active PeerConnection. r=jib

Comment 36 (bugherder)
https://hg.mozilla.org/mozilla-central/rev/434a7c807ffe
Status: ASSIGNED → RESOLVED
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
(In reply to Randell Jesup [:jesup] from comment #24)
> When the partner disappears, a peerconnection (a connected one) will
> *eventually* transition to the failed state on consent/keepalive failure. 
> This will not be for a while, however (at least 30 seconds, perhaps more).
Yes, a remote peer which disappears will cause the ICE connection state to switch to failed after 30s.
Flags: needinfo?(drno)
Component: DOM → DOM: Core & HTML
Product: Core → Core