Closed Bug 942188 Opened 6 years ago Closed 5 years ago

Firefox not handling either a=ice-lite in offer /answer or handling STUN error "487 Role conflict"

Categories

(Core :: WebRTC: Networking, defect, P1)

28 Branch
defect

Tracking

()

RESOLVED FIXED
mozilla34

People

(Reporter: mmraju, Assigned: bwc)

References

Details

Attachments

(3 files, 12 obsolete files)

597 bytes, text/plain
Details
20.65 KB, patch
abr
: review+
abr
: checkin+
Details | Diff | Splinter Review
29.78 KB, patch
abr
: review+
RyanVM
: checkin+
Details | Diff | Splinter Review
Attached file term-sdp.txt
User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36

Steps to reproduce:

Hi,

Our testing indicate that Firefox (Nightly or Aurora) do not support a=ice-lite in received SDP offer (on terminating side of the call) and switch ICE roles to "CONTROLLING". a=ice-lite is accepted at parsing level but roles are not changed per ice-lite request.
The middle box also sent "487 Role Conflict", which did not trigger Firefox role change either.

Please note that Chrome stable version does support both of these for a long time.

Procedure to reproduce:
- Send a=ice-lite in SDP offer to terminating browser (at session level).
   The middlebox expects Firefox to take ICE "CONTROLLING" role even though Firefox is on the terminating side (answerer) of the call.

term-sdp.txt has the SDP given to setLocalDescription() on term Firefox.
    I removed the actual ips in the SDP due to sensitive nature of it.
    The crucial piece of SDP is a=ice-lite.
    If I must I can share the pcap and other parts of javascript file. But, I am trying
    to how i can share the pcap without revealing the ip addresses (any tool?).
    Or you can ask for specific info on pcap STUN msgs, i can cut-paste parts of the
    file as needed. Sorry.
Firefox version use: Nightly 11-22-2013. Got similar results with Aurora.
Orig side Firefox behaved as expected.
No STUN server is used either on orig or on term.
Firefox orig to Chrome term calls were ok.
Issue is only with Firefox on the terminating side of the call.


Thanks
Raju



Actual results:

- Firefox continues to be in ICE "CONTROLLED" mode with a=ice-lite in SDP given to setRemoteDescription().
- Firefox continues to be in "ICE CONTROLLED" mode when middle box sent back
  "487 Role Conflict" to Firefox STUN binding req.
- when middle box sent STUN binding req with "CONTROLLED" mode Firefox sent SUCCESS initially followed by error (400 Bad Request).
- middle box tried not sending any STUN binding req after "487 Role Conflict" then Firefox continues to send ICE "CONTROLLED" mode binding req.


Expected results:

Firefox should switch to ICE "CONTROLLING" mode when:
- a=ice-lite is included in setLocalDescription
- or when it got 487 Role Conflict in response to CONTROLLED mode STUN req
>User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) >Chrome/31.0.1650.57 Safari/537.36

Even though the auto appended User Agent above shows the browser as Chrome (I opened ticket using Chrome) the actual browser I used for testing is Firefox Nightly.
Actual user agent is: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:28.0) Gecko/20100101 Firefox/28.0
Raju,

Could you please provide a URL and exact steps to reproduce this issue?
Flags: needinfo?(mmraju)
I guess it's an issue about WebRTC networking.
Component: Untriaged → WebRTC: Networking
Product: Firefox → Core
Alex,
>Could you please provide a URL and exact steps to reproduce this issue?
To reproduce the exact scenario you need middle box acting as an ice-lite agent towards the terminating Firefox... this middle box assumes "controlled" mode and also sends "487 Role Conflict" towards the term browser.
One way to achieve some what close to this middle box scenario without the actual middle box is.. through SDP modification reverse the roles of orig and term browsers to "controlled" and "controlling" respectively instead of default "controlling" and "controlled".
Procedure:
1. Orig/Offering side
   a. createOffer()
   b. insert a=ice-lite at session-level to the local SDP 
   c. setLocalDescription()
   Now orig should take "controlled" mode.

2. Term/Answer side
   a. setRemoteDescription() - the SDP should already have a=ice-lite from orig side.
   b. createAnswer()
   At this point this side should take "controlling" mode as orig side says it is "a=ice-lite"

3. Orig side
   setRemoteDescription() - no a=ice-lite should be present in this SDP.

Hope with this info you can reproduce the scenario!

Thanks for all the quick responses!
Raju
Flags: needinfo?(mmraju)
(In reply to Raju from comment #4)
> Alex,
> >Could you please provide a URL and exact steps to reproduce this issue?
> To reproduce the exact scenario you need middle box acting as an ice-lite
> agent towards the terminating Firefox... this middle box assumes
> "controlled" mode and also sends "487 Role Conflict" towards the term
> browser.
> One way to achieve some what close to this middle box scenario without the
> actual middle box is.. through SDP modification reverse the roles of orig
> and term browsers to "controlled" and "controlling" respectively instead of
> default "controlling" and "controlled".
> Procedure:
> 1. Orig/Offering side
>    a. createOffer()
>    b. insert a=ice-lite at session-level to the local SDP 
>    c. setLocalDescription()
>    Now orig should take "controlled" mode.

Firefox does not permit (nor does the spec currently require it to)
allow this type of SDP modification.

I'm not saying that this isn't a real bug, but this test case isn't
valid.




> 2. Term/Answer side
>    a. setRemoteDescription() - the SDP should already have a=ice-lite from
> orig side.
>    b. createAnswer()
>    At this point this side should take "controlling" mode as orig side says
> it is "a=ice-lite"
> 
> 3. Orig side
>    setRemoteDescription() - no a=ice-lite should be present in this SDP.
> 
> Hope with this info you can reproduce the scenario!
> 
> Thanks for all the quick responses!
> Raju
Eric,
Ok, I understand; in fact such modification of local sdp is not needed for our use case either.
How about the following scenario to basically push term FireFox into controlling mode and let 487 role conflict resolve the role?
Orig
   No changes
Term
   Add a=ice-lite at session level to received sdp before setRemote.

Thanks
Raju
See comment 6
Flags: needinfo?(ekr)
Yeah, this looks like a real bug. We are deciding on roles too early. Assigning to Byron.
Assignee: nobody → docfaraday
Flags: needinfo?(ekr)
Yep, definitely seeing that we do not handle 487 when we are not the controller. The code looks like it is handling the case where we are the controller. I'll patch that up first, and see if it causes the right behavior.
Initial patch. Fixed the location of the ice-lite checks to have the desired effect. Also fixed some bugs in the conflict resolution code (multiple 487 for the same role conflict would cause us to flip-flop, and the ICE-CONTROLLING/ICE-CONTROLLED attributes were not causing FROZEN/WAITING pairs to be reconfigured). Lastly, fixed a bug in ice_unittest that was causing the offerer to receive attributes before the answerer.
I should also note that there is a wrinkle around the controlled/controlled case and pre-answer request handling, but since this case is something that is currently thought to be impossible barring extreme confusion, this patch does not handle it.
Comment on attachment 8347560 [details] [diff] [review]
Various fixes to role-conflict resolution, and some test-cases.

Review of attachment 8347560 [details] [diff] [review]:
-----------------------------------------------------------------

It looks like this might handle bug 890667 too. I'll let you be the judge.
Attachment #8347560 - Flags: review?(ekr)
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Comment on attachment 8347560 [details] [diff] [review]
Various fixes to role-conflict resolution, and some test-cases.

Review of attachment 8347560 [details] [diff] [review]:
-----------------------------------------------------------------

On second thought, I'll give it one last proof-read before asking for review. Was getting kinda rushed toward the end.
Attachment #8347560 - Flags: review?(ekr) → feedback?(ekr)
Refinement after proofreading.
Attachment #8347560 - Attachment is obsolete: true
Attachment #8347560 - Flags: feedback?(ekr)
Small indentation fix.
Attachment #8347566 - Attachment is obsolete: true
Attachment #8347567 - Flags: review?(ekr)
After further thought, the handling of pre-answer requests troubles me; we will always accept such requests, so in the case of role conflict, the other agent can get some number of success responses, followed by a 487. I can see this confusing the other agent and causing interop problems. Since role-conflict can be detected without needing an answer, we probably need to perform it on pre-answer requests.
We also faced with same issue.
Our case pretty simple. FireFox connected to our media gateway (which can communicate with SIP provider). This gateway has only lite implementation of ICE. So, gateway inserts "a=ice-lite" in sdp when there is voip call to firefox from sip provider (INVITE from SIP to FireFox).
Ice lite means that agent with this mode doesn't send (actually can't send) any connectivity check and become ICE-CONTROLLED. Other side (firefox in our case) should send connectivity checks and become ICE-CONTROLLING when receive ice-lite from remote. 
So, I think that "487 role conflict" shouldn't be an issue in this case. Firefox should decide which role to chose just taking into account the presence of "a=ice-lite" attribute.
Hi all!
Are there any considerations about moment when these changes will become available in nightly/aurora?
Thank you!
Attachment #8347567 - Flags: review?(ekr) → review?(adam)
Hi!
As Anton we are also faced with the same issue and share his thoughts about the ice-lite attribute.
We are eager to try this out when it finally ends up in a build.
Thank you!
As far as I can tell there are actually two different issues here, 487 Role conflict handling and ice-lite attribute support. 

I've been testing firefox from the central and inbound mercurial repository and there really is no support of the ice-lite attribute in the SDP here. Parsing of a remote offer with ice-lite gives you a warning of "Warning: Unrecognized attribute (ice-lite)". Inbound calls from an ice-lite agent to firefox does not work since firefox always starts in controlled mode on inbound calls. 

Internally in my company we are running with a patched version of firefox to be able to support the ice-lite attribute and to start ice checks in controlling mode when the remote side is ice-lite. I've added the patch for this here, which is unrelated to the 487 Role conflict handling already fixed by Byron, and can be added independently or merged with that patch. Maybe should be a separate bug for this?
Attachment #8385475 - Flags: review?(adam)
(In reply to Anders Lund from comment #21)
> Created attachment 8385475 [details] [diff] [review]
> Flags: review?(adam@nostrum.com)

Byron already knows this, but I'm not going to have a chance to review ice patches until the week of March 17th (or maybe later). I just wanted to let you know that I've seen that you added a patch and am not ignoring it.
OS: Windows 7 → All
Hardware: x86_64 → All
Fix typos
Attachment #8385475 - Attachment is obsolete: true
Attachment #8385475 - Flags: review?(adam)
Attachment #8392695 - Flags: review?(adam)
Duplicate of this bug: 986348
Comment on attachment 8392695 [details] [diff] [review]
Added parsing of ice-lite attribute and start ice checks as controlling if peer is ice-lite

Review of attachment 8392695 [details] [diff] [review]:
-----------------------------------------------------------------

I had the same problem: A (non-browser) endpoint that runs ICE-LITE, so that - to interact with it - Firefox needs to be in a controlling ICE role.  Anders Lund's patch solved the problem for me.  Please commit this patch.
Comment on attachment 8392695 [details] [diff] [review]
Added parsing of ice-lite attribute and start ice checks as controlling if peer is ice-lite

Review of attachment 8392695 [details] [diff] [review]:
-----------------------------------------------------------------

Anders:

Sorry for the long time it took me to review this patch; things have been crazy busy for me recently.

Thanks so much for the patch! In general, it looks good. I have just a few small nits, plus a pattern change (basically, I don't want to have one-off handling for each new "flag" attribute that has no value). I'd also like to see the unit test updated to check that the controlling relationship is set up correctly (this is intended to protect against regressions). I'd like to take one more look at this before it lands.

General comment for clarity: please rename "peer_lite" to include "ice" where it appears (e.g., "ice_peer_lite" or "peer_ice_lite")

I also have a handful of formatting nits that I'd like to see handled in an updated patch.
gsm_sdp.c:7116: trailing whitespace
gsm_sdp.c:7118: Line is 86 characters long; please wrap to 80
lsm.c:3929: Line is 90 characters long; please wrap to 80
sdp.h:2052: trailing whitespace
sdp_attr.c:4761: Line is 90 characters long; please wrap to 80
sdp_attr.c:4765: Line is 87 characters long; please wrap to 80
sdp_attr.c:4767: trailing whitespace
sdp_attr_access.c:4026: trailing whitespace
sdp_attr_access.c:4037: Line is 82 characters long; please wrap to 80
sdp_private.h:216: replace tab character (+	sdp_t *sdp_p, sdp_attr_t *attr_p, flex_string *fs);)
sdp_private.h:218: replace tab character (+	sdp_t *sdp_p, sdp_attr_t *attr_p, const char *ptr);)
vcm.h:473: Line is 101 characters long; please wrap to 80
signaling_unittests.cpp:2220: Line is 89 characters long; please wrap to 80
signaling_unittests.cpp:2232: Line is 89 characters long; please wrap to 80
signaling_unittests.cpp:2237: trailing whitespace

::: media/webrtc/signaling/src/sipcc/core/sdp/sdp.h
@@ +2051,5 @@
>  sdp_result_e
> +sdp_attr_get_ice_lite_attribute (void *sdp_ptr, 
> +                                 u8 cap_num, sdp_attr_e sdp_attr, u16 inst_num,
> +                                 tinybool *ice_lite);
> +

See comments in sdp_attr_access.c

::: media/webrtc/signaling/src/sipcc/core/sdp/sdp_attr.c
@@ +4749,5 @@
>      return (SDP_SUCCESS);
>  }
>  
>  
> +sdp_result_e sdp_build_attr_ice_lite (sdp_t *sdp_p, sdp_attr_t *attr_p,

Rename this sdp_build_attr_simple_flag so it can be re-used for other value-less attributes.

@@ +4751,5 @@
>  
>  
> +sdp_result_e sdp_build_attr_ice_lite (sdp_t *sdp_p, sdp_attr_t *attr_p,
> +                                      flex_string *fs) {
> +    flex_string_append(fs, "a=ice-lite\r\n");

flex_string_sprintf(fs, "a=%s\r\n", sdp_get_attr_name(attr_p->type));

@@ +4758,5 @@
> +}
> +
> +
> +sdp_result_e sdp_parse_attr_ice_lite (sdp_t *sdp_p, sdp_attr_t *attr_p, const char *ptr) {
> +    attr_p->attr.boolean_val = TRUE;

This is a bit confusing; everywhere else it is used, the "boolean_val" flag is for attributes of the form "a=foo:0" or "a=foo:1". (I note that rtcp-mux varies from this pattern, but that change was made in error, and should have been caught in review.)

The model to follow for parsing an "is it there or not?" attribute is the one shown in sdp_parse_attr_direction (basically: don't set anything on the attribute).

Congruent with my comments above, rename this "sdp_parse_attr_simple_flag" so it can be re-used.

::: media/webrtc/signaling/src/sipcc/core/sdp/sdp_attr_access.c
@@ +4009,5 @@
>      sstrncpy(attr_p->attr.ice_attr, ice_attrib, sizeof(attr_p->attr.ice_attr));
>      return (SDP_SUCCESS);
>  }
>  
> +/* Function:    sdp_attr_get_ice_lite_attribute

For these kinds of valueless-attributes, I don't think we want to go down the path of defining a new accessor for each new attribute kind. The first observation I'll make is that the code below is functionally the same as sdp_attr_get_simple_boolean; but I don't think we should be storing a "true/false" for attributes that don't have an explicit value.

Instead, this should be something more generic that we can use for all value-less attributes going forward, loosely modeled on sdp_attr_get_simple_boolean; something like:

> tinybool sdp_attr_is_present (void *sdp_ptr, u16 level,
>                               u8 cap_num, sdp_attr_e sdp_attr);

::: media/webrtc/signaling/test/signaling_unittests.cpp
@@ +2235,5 @@
> +  std::cout << "Setting offer to:" << std::endl << indent(offer) << std::endl;
> +  a2_->SetRemote(TestObserver::OFFER, offer);
> +  
> +  std::cout << "Creating answer:" << std::endl;
> +  a2_->CreateAnswer(constraints, offer, OFFER_AUDIO | ANSWER_AUDIO);

I don't think this test checks any of the new code you've written, other than verifying that it doesn't crash or otherwise cause a catastrophic error.

Minimally, this test needs to check that the local ICE stack is controlling (check a2_->pc->media()->ice_ctx()->GetControlling(), where you'll have to add GetControlling() to NrIceCtx in nricectx.h)

I think you'll have to do a "SetLocal" on a2_ as well, in order to make sure this flag actually gets set.

Additionally, please consider adding a test to sdp_unittests.cpp for the new attribute handling.
Attachment #8392695 - Flags: review?(adam) → review-
Thank you for the feedback. Updated patch according to comments.

We have now a generic sdp_attr_is_present() and sdp_parse_attr_simple_flag() and sdp_build_attr_simple_flag(). Be aware though the the new simple_flag functions are equal to the attr_direction functions in sdp_attr.c. A later patch should update these to use the new functions.

Also, this is somewhat irrelevant for the webrtc code, but a common way of expressing boolean values in the SDP are by inclusion or exclusion only. The only code that uses the sdp_parse_attr_simple_bool or sdp_build_attr_simple_bool is in the T38Fax attribute handling. (And the TMRGwXid attribute which I have no idea what is.)

The building of the T38Fax boolean values here is really not in accordance to RFC 4612 or the ITU-T T.38 Recommendation which states that these attributes are valueless and their presence indicates that they are valid for the session.
Attachment #8392695 - Attachment is obsolete: true
Attachment #8420960 - Flags: review?(adam)
Comment on attachment 8420960 [details] [diff] [review]
Added parsing of ice-lite attribute and start ice checks as controlling if peer is ice-lite

Review of attachment 8420960 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks. Looks good. I have three nits with the new code (the one in nricectx is pretty important as nits go), after which I think it's ready to land.

::: media/mtransport/nricectx.cpp
@@ +512,5 @@
>    return NS_OK;
>  }
>  
> +NrIceCtx::Controlling NrIceCtx::GetControlling() {
> +  return peer_->controlling == 1 ? ICE_CONTROLLING : ICE_CONTROLLED;

nICEr uses integers as booleans, but evaluates them with normal truth/false handling:

return (peer_->controlling) ? ICE_CONTROLLING : ICE_CONTROLLED;

::: media/webrtc/signaling/src/sipcc/core/sdp/sdp.h
@@ +1259,5 @@
>  extern sdp_result_e sdp_attr_set_simple_boolean(void *sdp_ptr,
>                     sdp_attr_e attr_type, u16 level, u8 cap_num,
>                     u16 inst_num, u32 bool_parm);
> +extern tinybool sdp_attr_is_present (void *sdp_ptr, sdp_attr_e attr_type,
> +                                     u16 level, u8 cap_num, u16 inst_num);

inst_num doesn't make a lot of sense here, but it doesn't really cause any harm either. Consider taking it out.

::: media/webrtc/signaling/test/signaling_unittests.cpp
@@ +2537,5 @@
> +  std::cout << "Creating answer:" << std::endl;
> +  a2_->CreateAnswer(constraints, offer, OFFER_AUDIO | ANSWER_AUDIO);
> +  a2_->SetLocal(TestObserver::ANSWER, a2_->answer());
> +
> +  ASSERT_EQ(a2_->pc->media()->ice_ctx()->GetControlling(), NrIceCtx::ICE_CONTROLLING);

Consider wrapping this line to 80 columns.
Attachment #8420960 - Flags: review?(adam) → review+
Attached patch icelite.patchSplinter Review
Fixed remaining nits.
Attachment #8420960 - Attachment is obsolete: true
Attachment #8423744 - Flags: review?(adam)
Comment on attachment 8423744 [details] [diff] [review]
icelite.patch

Review of attachment 8423744 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good -- r+
Attachment #8423744 - Flags: review?(adam) → review+
(In reply to Adam Roach [:abr] from comment #30)
> Comment on attachment 8423744 [details] [diff] [review]
> icelite.patch
> 
> Review of attachment 8423744 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Looks good -- r+

I can not mark this as checkin-needed since I'm not assigned to this bug, and since this is independent of the 487 Role Conflict handling patch this should be able to land.
(In reply to Anders Lund from comment #31)
> (In reply to Adam Roach [:abr] from comment #30)
> > Comment on attachment 8423744 [details] [diff] [review]
> > icelite.patch
> > 
> > Review of attachment 8423744 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > Looks good -- r+
> 
> I can not mark this as checkin-needed since I'm not assigned to this bug,
> and since this is independent of the 487 Role Conflict handling patch this
> should be able to land.

Okay; I'll run it through its paces and then check it in.
We don't want to close this until the 487 patch lands too.
Whiteboard: [leave-open]
We want to get the rest of this into Fx 32 or Fx 33 at the latest.
Priority: -- → P1
Target Milestone: --- → mozilla33
Byron -- What else remains to land the 487 patch?  (Is there any work needed beyond landing that patch in order to close this bug?) Thanks.
Flags: needinfo?(docfaraday)
I'll need to make sure it hasn't rotted, and then we'll need to get it reviewed. So it is hard to say.
Flags: needinfo?(docfaraday)
Attachment #8347567 - Attachment is obsolete: true
Attachment #8347567 - Flags: review?(adam)
Comment on attachment 8440137 [details] [diff] [review]
Various fixes to role-conflict resolution, and some test-cases.

Review of attachment 8440137 [details] [diff] [review]:
-----------------------------------------------------------------

https://tbpl.mozilla.org/?tree=Try&rev=7494d13032a6
Attachment #8440137 - Flags: review?(adam)
Comment on attachment 8440137 [details] [diff] [review]
Various fixes to role-conflict resolution, and some test-cases.

Review of attachment 8440137 [details] [diff] [review]:
-----------------------------------------------------------------

I have a handful of nits and slightly-larger-than-nits below. The needed changes are just barely big enough that I'd like to see this again, but it's generally correct and just needs some polish.

FWIW, the unit tests have bitrotted.

I know that the nicer code is not terribly consistent about this, but it would be nice if new code followed some convention around spacing on conditionals. The following would be consistent with the rest of the moz code:

ice_candidate_pair.c:387: Space between 'if' and parenthesis
ice_candidate_pair.c:586: Space between 'if' and parenthesis
ice_media_stream.c:883: Space between 'while' and parenthesis
ice_peer_ctx.c:74: Space between 'if' and parenthesis
ice_peer_ctx.c:363: Space between 'if' and parenthesis
ice_peer_ctx.c:735: Space between 'while' and parenthesis

::: media/mtransport/test/ice_unittest.cpp
@@ +968,5 @@
>    }
>  
>    void Connect() {
> +    // Connect grabs attributes from the passed arg, and gives them to this,
> +    // making the passed arg the offerer.

I really don't understand this comment.

@@ +974,1 @@
>      p1_->Connect(p2_, TRICKLE_NONE);

If these are sensitive to ordering, please add a comment explaining why. It's not obvious to me why you needed to reorder all of the Connect calls.

@@ +1313,5 @@
> +  ASSERT_TRUE(Gather(true));
> +  p1_->SetControlling(NrIceCtx::ICE_CONTROLLING);
> +  p2_->SetControlling(NrIceCtx::ICE_CONTROLLING);
> +  Connect();
> +}

It seems we would want to also test the ConnectTrickle variations here, right?

::: media/mtransport/third_party/nICEr/src/ice/ice_candidate_pair.c
@@ +205,2 @@
>  
> +          int controlling = pair->pctx->controlling ? 0 : 1;

This is fundamentally a boolean:

  int controlling = !(pair->pctx->controlling);

@@ +340,5 @@
>    abort:
>      return;
>    }
>  
> +static void nr_ice_candidate_pair_restart(nr_ice_peer_ctx *pctx, nr_ice_cand_pair *pair)

static int ...

@@ +374,5 @@
>        /* Don't fire the CB, but schedule it to fire ASAP */
>        assert(!pair->stun_cb_timer);
>        NR_ASYNC_TIMER_SET(0,nr_ice_candidate_pair_stun_cb,pair, &pair->stun_cb_timer);
>        _status=0;
>      }

return (_status);

@@ +386,5 @@
> +    /* Register the stun ctx for when responses come in*/
> +    if(r=nr_ice_socket_register_stun_client(pair->local->isock,pair->stun_client,&pair->stun_client_handle))
> +      ABORT(r);
> +
> +    nr_ice_candidate_pair_restart(pctx, pair);

if (r=nr_ice_candidate_pair_restart(pctx, pair))
  ABORT(r);

@@ -553,2 @@
>      nr_ice_cand_pair *pair=cb_arg;
> -    int r,_status;

No, put this handling back in. The ABORT() macro isn't just a handy set/goto combination -- it also provides error logging. Without this, failures will be silent and difficult to diagnose.

@@ +574,2 @@
>  
> +    nr_ice_candidate_pair_restart(pair->pctx, pair);

And, of course, check the return code here.

@@ -562,5 @@
> -
> -    if(r=nr_stun_client_start(pair->stun_client,NR_ICE_CLIENT_MODE_BINDING_REQUEST,nr_ice_candidate_pair_stun_cb,pair))
> -      ABORT(r);
> -
> -    if(r=nr_ice_ctx_remember_id(pair->pctx->ctx, pair->stun_client->request))

Why is this no longer necessary?

@@ +581,5 @@
>  
> +    if (pair->state == NR_ICE_PAIR_STATE_IN_PROGRESS) {
> +      /* We could try only restarting in-progress pairs when they receive their
> +       * 487, but this ends up being simpler, because any extra 487 are dropped.
> +       * */

nit: "*/", not "* */"

::: media/mtransport/third_party/nICEr/src/ice/ice_media_stream.h
@@ +95,5 @@
>  int
>  nr_ice_peer_ctx_parse_media_stream_attribute(nr_ice_peer_ctx *pctx, nr_ice_media_stream *stream, char *attr);
>  int nr_ice_media_stream_disable_component(nr_ice_media_stream *stream, int component_id);
>  int nr_ice_media_stream_pair_new_trickle_candidate(nr_ice_peer_ctx *pctx, nr_ice_media_stream *pstream, nr_ice_candidate *cand);
> +void nr_ice_media_stream_set_role(nr_ice_media_stream *stream);

This name is quite confusing, since it's not setting the role per se; it's reacting to a change. Suggest renaming something like nr_ice_media_stream_role_change()

::: media/mtransport/third_party/nICEr/src/ice/ice_peer_ctx.c
@@ +363,5 @@
> +      if(pctx->ctx->flags & NR_ICE_CTX_FLAGS_LITE){
> +        r_log(LOG_ICE,LOG_ERR,"Both sides are ICE-Lite");
> +        ABORT(R_BAD_DATA);
> +      }
> +      pctx->controlling = 1;

Just as a matter of form, shouldn't this be nr_ice_peer_ctx_set_controlling()? I know we're not in a state that will trigger any special handling, but it does seem more tidy to treat this as being encapsulated *everywhere* if you're treating it as encapsulated anywhere.

::: media/mtransport/third_party/nICEr/src/ice/ice_peer_ctx.h
@@ +86,5 @@
>  int nr_ice_peer_ctx_deliver_packet_maybe(nr_ice_peer_ctx *pctx, nr_ice_component *comp, nr_transport_addr *source_addr, UCHAR *data, int len);
>  int nr_ice_peer_ctx_disable_component(nr_ice_peer_ctx *pctx, nr_ice_media_stream *lstream, int component_id);
>  int nr_ice_peer_ctx_pair_new_trickle_candidate(nr_ice_ctx *ctx, nr_ice_peer_ctx *pctx, nr_ice_candidate *cand);
> +void nr_ice_peer_ctx_set_controlling(nr_ice_peer_ctx *pctx, int controlling);
> +int nr_ice_peer_ctx_set_ice_lite(nr_ice_peer_ctx *pctx);

I find no implementation for this function -- please remove it.
Attachment #8440137 - Flags: review?(adam) → review-
Target Milestone: mozilla33 → mozilla34
Incorporate feedback. Will go over the patches one more time since it has been so long.
Attachment #8463630 - Attachment is obsolete: true
Simplifying/consolidating some logic, and improving a function name.
Attachment #8464077 - Attachment is obsolete: true
The stuff I tried uploading in the last patch.
Attachment #8464247 - Attachment is obsolete: true
Missed a return statement.
Attachment #8464256 - Attachment is obsolete: true
Attachment #8464259 - Attachment is obsolete: true
Comment on attachment 8464265 [details] [diff] [review]
Various fixes to role-conflict resolution, and some test-cases.

Review of attachment 8464265 [details] [diff] [review]:
-----------------------------------------------------------------

Interdiff: https://bugzilla.mozilla.org/attachment.cgi?oldid=8463630&action=interdiff&newid=8464265&headers=1

Simplified and consolidated some code, added code to break a loop when the peer flip-flops between controlling and controlled, and some renaming to make the code easier to read.
Attachment #8464265 - Flags: review?(adam)
Comment on attachment 8464265 [details] [diff] [review]
Various fixes to role-conflict resolution, and some test-cases.

Review of attachment 8464265 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good. Thanks.
Attachment #8464265 - Flags: review?(adam) → review+
Attachment #8440137 - Attachment is obsolete: true
https://tbpl.mozilla.org/?tree=Try&rev=9ec336527676
Flags: needinfo?(docfaraday)
Try looks good, although the win8 tests still haven't run.
Flags: needinfo?(docfaraday)
Keywords: checkin-needed
Whiteboard: [leave-open]
Attachment #8464265 - Flags: checkin+
https://hg.mozilla.org/mozilla-central/rev/9488dace0865
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Duplicate of this bug: 890667
See Also: 1184540
You need to log in before you can comment on or make changes to this bug.