Closed Bug 1106675 Opened 5 years ago Closed 5 years ago

Replace peerConnection._queueOrRun with a promise-chain

Categories

(Core :: WebRTC, defect)

34 Branch
x86
macOS
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla37

People

(Reporter: jib, Assigned: jib)

References

(Depends on 1 open bug)

Details

Attachments

(1 file, 6 obsolete files)

Attached patch promiseQueue.patch (obsolete) — Splinter Review
From an idea in Bug 1091898 comment 35.

Try - https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=2ae87732f6ba

Works, thoug I'm getting some odd failures where DOMErrors get converted to Errors. Odd.

> 25 INFO TEST-UNEXPECTED-FAIL | /tests/dom/media/tests/mochitest/test_peerConnection_close.html | createOffer fails on close: Peer connection is closed - got Error, expected InvalidStateError
> 26 INFO TEST-UNEXPECTED-FAIL | /tests/dom/media/tests/mochitest/test_peerConnection_close.html | createAnswer fails on close: Peer connection is closed - got Error, expected InvalidStateError
> 27 INFO TEST-UNEXPECTED-FAIL | /tests/dom/media/tests/mochitest/test_peerConnection_close.html | setLocalDescription fails on close: Peer connection is closed - got Error, expected InvalidStateError
> 28 INFO TEST-UNEXPECTED-FAIL | /tests/dom/media/tests/mochitest/test_peerConnection_close.html | setRemoteDescription fails on close: Peer connection is closed - got Error, expected InvalidStateError
> 29 INFO TEST-UNEXPECTED-FAIL | /tests/dom/media/tests/mochitest/test_peerConnection_close.html | addIceCandidate fails on close: Peer connection is closed - got Error, expected InvalidStateError
Attachment #8531044 - Flags: feedback?(martin.thomson)
Comment on attachment 8531044 [details] [diff] [review]
promiseQueue.patch

Review of attachment 8531044 [details] [diff] [review]:
-----------------------------------------------------------------

I think that if you can work out where the errors are being transmuted here (I'm not your guy, sorry), then this is OK.  I don't think that I'd be happy to land this without going a little further in the conversion process, but it's a decent first step.

::: dom/media/PeerConnection.js
@@ +348,5 @@
>      this.makeGetterSetterEH("onidpassertionerror");
>      this.makeGetterSetterEH("onidpvalidationerror");
>  
>      this._pc = new this._win.PeerConnectionImpl();
> +    this._tail = new this._win.Promise(y => y());

Use Promise.resolve()

And I think that you need to ensure that this is named better than _tail.  _taskQueue would be consistent with the language in my PR.

@@ +395,2 @@
>      this._checkClosed();
> +    return this._tail = this._tail.then(() => new this._win.Promise(executor));

I think that this is fine if you are looking to minimize the scope of this particular change, but you should look to having each of the constituent functions be promise-returning functions too.  Maybe in a follow-on patch if you want to partition things more.

I don't like the shape of the call pattern.  See below.

@@ +583,1 @@
>      return this.thenCB(p, onSuccess, onError);

I think that you ultimately want to get to the point where you have:

return this._queue(() => this._createOffer(options));

That means changing all of the _foo analogues, but I think that it is better to use a consistent style throughout.

@@ -757,5 @@
>        // May be null if the user didn't supply success/failure callbacks.
>        // Violation of spec, but we allow it for now
>        onSuccess();
>        isDone = true;
> -      this._executeNext();

I think that we need to look harder at setRemoteDescription than this, maybe in a *third* patch :/.

@@ +858,4 @@
>      this._localIdp.close();
>      this._remoteIdp.close();
>      this._impl.close();
> +    this._closed = true;

If you look at the PR for this, we also need to cancel the execution of further promises.  I don't know if that is easy to do, maybe _queue() can add a .then(x => this._closed && throw) or something before each new item that is added.

@@ +1025,5 @@
>  
>    connectDataConnection: function(localport, remoteport, numstreams) {
> +    this._queue(y => {
> +      this._impl.connectDataConnection(localport, remoteport,
> +                                       ((numstreams > 0)? numstreams : 16));

(numstreams || 16) perhaps.
Attachment #8531044 - Flags: feedback?(martin.thomson) → feedback+
Incorporate feedback and work around Bug 1107592.

> That means changing all of the _foo analogues, but I think that it is better
> to use a consistent style throughout.

I agree.

> I think that we need to look harder at setRemoteDescription than this, maybe
> in a *third* patch :/.

I'd be happy to give that a shot. I held off because I thought from Bug 1091898 comment 12 that you had a patch in the works?

> > +    this._closed = true;
> 
> If you look at the PR for this, we also need to cancel the execution of
> further promises.  I don't know if that is easy to do, maybe _queue() can
> add a .then(x => this._closed && throw) or something before each new item
> that is added.

Good idea. I haven't added that here yet.

> >    connectDataConnection: function(localport, remoteport, numstreams) {
> > +    this._queue(y => {
> > +      this._impl.connectDataConnection(localport, remoteport,
> > +                                       ((numstreams > 0)? numstreams : 16));
> 
> (numstreams || 16) perhaps.

That's not the same, since negative numstreams should produce 16.
Assignee: nobody → jib
Attachment #8531044 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Depends on: 1091898
(In reply to Jan-Ivar Bruaroey [:jib] from comment #2)
> > If you look at the PR for this, we also need to cancel the execution of
> > further promises.  I don't know if that is easy to do, maybe _queue() can
> > add a .then(x => this._closed && throw) or something before each new item
> > that is added.
> 
> Good idea. I haven't added that here yet.

Actually, it turns out that that's what the patch already does since this._checkClosed() throws.
The problem is I think this is actually wrong, since if we're ever in a situation where it throws, the result will be that the promise is rejected, which quickly leads to all queued promise being rejected since they'll all throw in the same place now as each queued .then() function tries to execute. I don't think that's what the spec says to do.

I think the solution is to move the this._checkClosed() out to when things are _queue()ed, and to basically not test inside the .then(). What drives the queue is ultimately the c++ code, so once things are closed, the belt should just stop moving, which I think is what the spec wants (i.e. no callbacks executed once the pc is closed). Does that sound right?
Flags: needinfo?(martin.thomson)
Depends on: 1107592
Well, I think that it is OK.

Assuming:

pc.createOffer()
.then(o => pc.setLocalDescription(o))
.then(awaitAnswer, log)
.then(a => pc.setRemoteDescription(a))
.catch(log);

If close() is called while createOffer() is out doing things, the other calls are on the train already.  Moving _checkClosed() means that it doesn't get checked.  Having the enqueued operation throw is will cause any enqueued operation to be rejected, which is fine.  In this case, it will be caught twice: once next to awaitAnswer, once at the end.

As for what the spec says: https://github.com/w3c/webrtc-pc/pull/22

(And yay, I didn't use Abort like I told you I did; I used InvalidStateError)
Flags: needinfo?(martin.thomson)
(In reply to Martin Thomson [:mt] from comment #5)
> If close() is called while createOffer() is out doing things, the other calls are on the train already.

I think this assumption is wrong, because none of setLocalDescription, awaitAnswer or setRemoteDescription have been called yet at this point. What's on the train are functions that will call those in the future, iff createOffer succeeds. The internal _queue function has been called just once at this point.

As I understand it, if close() is called while createOffer() is in c++ land, then PeerConnection.js's internal onCreateOfferSuccess and onCreateOfferFailure functions will never get called, and the returned promise is never resolved or rejected, so setLocalDescription, awaitAnswer etc. are never called and no error is thrown.

This is desired behavior I believe.

---

Exercising the *internal* _queue still requires a whacky user calling several methods in a row without waiting for an async method to complete. E.g.

pc.createOffer()
.then(o => {
  pc.setLocalDescription(o); // waiting is for whimps!
  return awaitAnswer().then(a => pc.setRemoteDescription(a));
})
.catch(log);

Here, setRemoteDescription(a) will get _queued after setLocalDescription(o) if awaitAnswer() resolves quickly. I.e. the promise returned by setRemoteDescription(a) depends on the promise returned from setLocalDescriptio(o), even though the user never connected them in JS.

Lets assume that happens.

If close() is called while setLocalDescription(o) is in c++ land, then PeerConnection.js's internal onSetLocalDescriptionSuccess and _onSetLocalDescriptionFailure should never get called, and nothing else happens. The "setLocal" promise is never resolved or rejected, so the internal _setRemoteDescription never happens and no error is thrown.

The exact same behavior as above.

---

The _checkClose() on _queue (not in the .then() function) is to catch programming errors like content calling PeerConnection methods when the connection is closed:

pc.createOffer()
.then(o => {
  pc.setLocalDescription(o); // waiting is for whimps!
  pc.close(); // causes both setLocal and setRemote to throw.
  return awaitAnswer().then(a => pc.setRemoteDescription(a));
})
.catch(log);
(In reply to Jan-Ivar Bruaroey [:jib] from comment #6)
>   pc.close(); // causes both setLocal and setRemote to throw.

   pc.close(); // causes setRemote to throw.
Yes, that sounds more correct.  All except the part where calling close() while there is an unresolved promise.  That should cause the promise to be rejected too.  Basically, I don't want to end up in a state where promises hang indefinitely.  That's going to cause surprises.  Fixing that is going to be tricky so maybe we should open another bug for that.
That's the spec behavior. See text like:

"If this RTCPeerConnection object is closed before the SDP generation process completes, the USER agent MUST suppress the result and not call any of the result callbacks."

Which is good because otherwise all versions of Firefox would be broken. I don't think seeing promises as "hanging" is necessarily the right framing. Lets talk.
I stand by my assessment of the spec, but my statement about Firefox was wrong. Until the patch in this bug, we've been queuing close (yeah), which means close() in Firefox till now has no effect on outstanding calls whatsoever. Try it out in http://jsfiddle.net/jib1/48j95b2b
Comment on attachment 8535091 [details] [diff] [review]
replace _queueOrRun with a promise-chain (3)

Review of attachment 8535091 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/PeerConnection.js
@@ +348,5 @@
>      this.makeGetterSetterEH("onidpassertionerror");
>      this.makeGetterSetterEH("onidpvalidationerror");
>  
>      this._pc = new this._win.PeerConnectionImpl();
> +    this._taskChain = new this._win.Promise(resolve => resolve());

this._taskChain = Promise.resolve();

@@ +358,5 @@
>      this._winID = this._win.QueryInterface(Ci.nsIInterfaceRequestor)
>        .getInterface(Ci.nsIDOMWindowUtils).currentInnerWindowID;
>      _globalPCList.addPC(this);
>  
> +    this._initialize(rtcConfig);

Any reason why we don't just include _initialize directly here?

@@ +392,3 @@
>     */
> +  _queue: function(func) {
> +    // TODO: Add a this._checkClosed here as well once Bug 1107592 is fixed.

Is this "add" or "move"?

@@ +614,1 @@
>      return this.thenCB(p, onSuccess, onError);

I'll leave this to your discretion, but I think that you want to change this code even more...

-   let p = this._queue(() => this._createAnswer());
-   return this.thenCB(p, onSuccess, onError);	
+   return this._queue(() => this._createAnswer(), onSuccess, onError);

... throughout.

@@ +1030,5 @@
>      );
>      return channel;
>    },
>  
>    connectDataConnection: function(localport, remoteport, numstreams) {

Have you created a bug to remove this dead function?
Attachment #8535091 - Flags: review?(martin.thomson) → review+
Incorporate feedback and change this code even more.
Attachment #8535091 - Attachment is obsolete: true
Attachment #8535201 - Flags: review?(martin.thomson)
(In reply to Martin Thomson [:mt] from comment #12)
> Have you created a bug to remove this dead function?

Filed Bug 1110478.
Comment on attachment 8535201 [details] [diff] [review]
replace _queueOrRun with a promise-chain (4)

Review of attachment 8535201 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/PeerConnection.js
@@ +397,3 @@
>    },
>  
> +  _legacy: function(func) {

_wrapLegacyCallback or just _wrapCallback ?
Attachment #8535201 - Flags: review?(martin.thomson) → review+
Incorporate feedback. Carrying forward r=mt.
Attachment #8535201 - Attachment is obsolete: true
Attachment #8535297 - Flags: review+
No longer depends on: 1107592
Fix problem on try in comment 11. Needs re-review.

An InternalStateError bled from setRemoteDescription into addIceCandidate, because I wasn't being careful about isolating these operations from each other. This isn't a dependency queue.

This was intermittent because it depended on when ICE candidates arrived, and didn't happen locally. showing how rarely the queue kicks in to actually do something.
Attachment #8535297 - Attachment is obsolete: true
Attachment #8535575 - Flags: review?(martin.thomson)
Comment on attachment 8535575 [details] [diff] [review]
replace _queueOrRun with a promise-chain (6)

Review of attachment 8535575 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/PeerConnection.js
@@ +392,5 @@
> +    let p = this._taskChain.then(() => {
> +      this._checkClosed(); // TODO: Move outside promise once Bug 1107592 is fixed.
> +      return func();
> +    });
> +    this._taskChain = p.catch(() => {}); // don't propagate errors in taskChain!

Is there any reason not to log something here?  If we don't, there is a risk that a programming error in func() is hidden.  Obviously, this will catch all the usual failure cases for func(), which aren't always our fault (checkClosed(), for instance), but when it's a real bug, it's probably not great to be hiding it.

Maybe:
  this._taskChain = p.catch(e => this.logErrorAndCallOnError(e));

@@ +649,5 @@
>        default:
>          throw new this._win.DOMError("InvalidParameterError",
>              "Invalid type " + desc.type + " provided to setRemoteDescription");
>      }
> +    this._remoteType = desc.type;

You have deferred this a little, but aren't there plenty of other failure modes than just this simple type check?

Maybe set it at the end of _setRemoteDescription, which is pretty close to where it gets set in C++ land.

(Oh yeah, and that's what you get for trying to fix things, you end up having to fix them more thoroughly than you intended ;)
Attachment #8535575 - Flags: review?(martin.thomson) → review+
(In reply to Martin Thomson [:mt] from comment #19)
> > +    this._taskChain = p.catch(() => {}); // don't propagate errors in taskChain!
> 
> Is there any reason not to log something here?  If we don't, there is a risk
> that a programming error in func() is hidden.

It's hiding a clone [1]. Essentially, this._taskChain at this point (the promise from .catch()) is a separate "branch" of the promise chain from juncture p, and cannot affect p or anyone doing p.then() or p.catch(). It's a tree, so the rejection must be cloned essentially for each branch.

I've verified it here: http://jsfiddle.net/jib1/j306ubnd

> > +    this._remoteType = desc.type;
> 
> You have deferred this a little, but aren't there plenty of other failure
> modes than just this simple type check?
> 
> Maybe set it at the end of _setRemoteDescription, which is pretty close to
> where it gets set in C++ land.
> 
> (Oh yeah, and that's what you get for trying to fix things, you end up
> having to fix them more thoroughly than you intended ;)

Will do ;)

[1] https://developer.mozilla.org/en-US/docs/Mozilla/JavaScript_code_modules/Promise.jsm/Promise#then%28%29
(In reply to Jan-Ivar Bruaroey [:jib] from comment #20)
> It's hiding a clone [1]. Essentially, this._taskChain at this point (the
> promise from .catch()) is a separate "branch" of the promise chain from
> juncture p, and cannot affect p or anyone doing p.then() or p.catch(). It's
> a tree, so the rejection must be cloned essentially for each branch.

Yeah, I got that part.  But only this branch is under our control.  The other one just went to content-land.  If they catch and ignore, we have two dead clones.  I guess for unit/mochitests at least the trick is to avoid being sloppy about error handling.  But even there, we have a few cases where errors are expected.
Corollary: we don't report caught exceptions.
(In reply to Martin Thomson [:mt] from comment #19)
> > +    this._remoteType = desc.type;

On second thought, lets defer the scope-creep to a separate bug, as there are side-effects of moving it much later. As it stands now, content can read pc.localDescription synchronously right after having called setLocalDescription and get the result they are in the process of setting (!) - This seems quite wrong, but I'm not confident it's not some spec issue.
(In reply to Jan-Ivar Bruaroey [:jib] from comment #23)
> This seems quite wrong, but I'm not confident it's not some spec issue.

And yup, there it is. http://w3c.github.io/webrtc-pc/#attributes :

"The localDescription attribute MUST return the RTCSessionDescription that was most recently passed to setLocalDescription(), plus any local candidates that have been generated by the ICE Agent since then."

How do you interpret that statement?
Flags: needinfo?(martin.thomson)
Removed discussed moving of desc.type caching code from the patch - since it looks to deserve its own bug - to unblock landing. Carrying forward r=mt.
Attachment #8535575 - Attachment is obsolete: true
Attachment #8535732 - Flags: review+
I interpret that as a spec bug.  What possible value is there in surfacing something like this ?

pc.setLocalDescription(new RTCSessionDescription("cheese", "and ham")).catch(always => log(always));

Let's make sure that we fix the spec.  My concern here is primarily that we could experience desync.  I think that I would rather move the type thing into the C++ API (on JsepSession, probably) and properly encapsulate the response.

Agree with the separate bug thing.
Flags: needinfo?(martin.thomson)
I've filed Bug 1110961 for the desc.type issue.
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/8ff55dfa3607
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla37
Depends on: 1169386
You need to log in before you can comment on or make changes to this bug.