Closed Bug 891551 Opened 11 years ago Closed 9 years ago

WebRTC: Add support for TCP ICE candidates

Categories

(Core :: WebRTC: Networking, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla41
Tracking Status
firefox31 --- affected
firefox32 --- affected

People

(Reporter: tom, Assigned: drno)

References

(Blocks 3 open bugs)

Details

(Keywords: dev-doc-complete)

Attachments

(18 files, 49 obsolete files)

1.60 KB, patch
bwc
: review+
RyanVM
: checkin+
Details | Diff | Splinter Review
14.94 KB, patch
bwc
: review+
jesup
: checkin+
Details | Diff | Splinter Review
1.07 KB, patch
bwc
: review+
RyanVM
: checkin+
Details | Diff | Splinter Review
8.52 KB, patch
wtc
: review+
wtc
: checkin+
Details | Diff | Splinter Review
886 bytes, patch
Details | Diff | Splinter Review
5.13 KB, patch
bwc
: review+
Details | Diff | Splinter Review
8.44 KB, patch
bwc
: review+
Details | Diff | Splinter Review
5.92 KB, patch
drno
: review+
Details | Diff | Splinter Review
3.34 KB, patch
Details | Diff | Splinter Review
19.04 KB, patch
Details | Diff | Splinter Review
157.87 KB, patch
drno
: review+
Details | Diff | Splinter Review
40 bytes, text/x-review-board-request
jesup
: review+
Details
40 bytes, text/x-review-board-request
Details
40 bytes, text/x-review-board-request
Details
40 bytes, text/x-review-board-request
Details
40 bytes, text/x-review-board-request
jesup
: review+
Details
40 bytes, text/x-review-board-request
Details
40 bytes, text/x-review-board-request
Details
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.71 Safari/537.36

Steps to reproduce:

Browsing the code it appears that Firefox doesn't support TCP ICE candidates. These are useful for environments that block UDP. See https://groups.google.com/d/msg/mozilla.dev.media/ow6KhnBVUKA/Kzl7RuH6QUQJ.
Component: Untriaged → WebRTC: Networking
OS: Mac OS X → All
Product: Firefox → Core
Hardware: x86 → All
Whiteboard: [WebRTC]
Plus it for v1.3 since it block the webRTC committed feature on audio P2P
blocking-b2g: --- → 1.3+
Whiteboard: [WebRTC] → [WebRTC][ft:webrtc]
Ivan, this feature is not scheduled for 1.3 nor required for audio p2p. We are doing TURN TCP instead.
Hi Eric,

Thank you for the clarification. Remove it from 1.3 blocker then How about 906968? It seems it won't be the blocker of 1.3 either.
blocking-b2g: 1.3+ → ---
Attached patch ICE-TCP WIP patch (obsolete) — — Splinter Review
WIP for ICE-TCP
Flags: sec-review?
Flags: sec-review? → sec-review?(cdiehl)
Attached patch ICE-TCP patch (obsolete) — — Splinter Review
This patch implements ICE-TCP, few TCP related tests and a testing STUN TCP server.  Current default STUN server used in tests(23.21.150.121) doesn't support TCP, therefore tests last longer. It works fine with these environment settings:
  export STUN_SERVER_ADDRESS=107.23.150.92
  export STUN_SERVER_HOSTNAME=stun.stunprotocol.org

Changes:
  - added accept and listen functions to nr_socket interface
  - increased nr_socket_vtbl version to 2
  - added a new nr_socket implementation  nr_multi_tcp_socket
      - simulates UDP socket behavior to upper layer
      - enables determining srflx addresses for passive candidates,
        where one socket listens on the same address as it is used in sockets connecting to stun servers as source address.
  - added a new nr_socket implementation nr_socket_tcp_framed 
       - implements framing mechanism on tcp socket - this is used when communicating between candidates
       - nr_socket_buffered_stun is used in communication with STUN servers - it doesn't use framing mechanism in this case
   - multi_tcp_unittests - unit tests for nr_multi_tcp_socket and nr_socket_tcp_framed 
   - added nr_socket_tcp_type to nr_ice_candidate 
       - ACTIVE, PASSIVE, SO (Simultaneous open)
   - candidate priority computation
       - added direction priority based on tcp_type
   - TCP candidates pairing ACTIVE - PASSIVE, SO - SO
   - encoding and parsing of tcptype in a candidate attribute
   - Added TCP variant of tests into ice_unittests
   - Added usage of TCP together with UDP STUN server into signaling_unittests
Attachment #8380049 - Attachment is obsolete: true
Attachment #8391222 - Flags: review?(ekr)
Review?
Attachment #8391222 - Flags: review?(ekr) → review?(docfaraday)
Comment on attachment 8391222 [details] [diff] [review]
ICE-TCP patch

Review of attachment 8391222 [details] [diff] [review]:
-----------------------------------------------------------------

Broadly speaking, the amount of copy/paste code needs to be reduced, there are a number of places where error handling needs to be improved, and a handful of other relatively serious problems. Please fix this up, and I can do a re-review.

::: media/mtransport/nr_socket_prsock.cpp
@@ +408,5 @@
>  
>    PRStatus status;
>    PRNetAddr naddr;
>  
> +  PRSocketOptionData option;

We're reusing this in the TCP case; need to determine if this is risky now. (At any rate, this opens the door to problems later)

@@ +716,5 @@
> +
> +  if((r=nr_praddr_to_transport_addr(&nfrom,addrp,my_addr_.protocol,0)))
> +    ABORT(r);
> +
> +  sock = new NrSocket();

Pretty sure this will leak on errors below.

@@ +727,5 @@
> +  option.value.non_blocking = PR_TRUE;
> +  status = PR_SetSocketOption(ret, &option);
> +  if (status != PR_SUCCESS) {
> +    r_log(LOG_GENERIC, LOG_CRIT, "Couldn't make socket nonblocking");
> +    ABORT(R_INTERNAL);

We leak sock.

@@ +732,5 @@
> +  }
> +
> +  // Remember our thread.
> +  sock->ststhread_ = do_QueryInterface(stservice, &rv);
> +  if (!NS_SUCCEEDED(rv))

NS_FAILED? (Lots of others)

@@ +733,5 @@
> +
> +  // Remember our thread.
> +  sock->ststhread_ = do_QueryInterface(stservice, &rv);
> +  if (!NS_SUCCEEDED(rv))
> +    ABORT(R_INTERNAL);

We leak sock.

@@ +738,5 @@
> +
> +  // Finally, register with the STS
> +  rv = stservice->AttachSocket(ret, sock);
> +  if (!NS_SUCCEEDED(rv)) {
> +    ABORT(R_INTERNAL);

We probably leak sock.

@@ +744,5 @@
> +
> +  sock->connect_invoked_ = true;
> +
> +  r = nr_socket_create_int(static_cast<void *>(sock),
> +                           sock->vtbl(), sockp);

Here's our ownership transfer for sock, right?

@@ +751,5 @@
> +
> +  // Add a reference so that we can delete it in destroy()
> +  sock->AddRef();
> +    _status=0;
> +  abort:

Weird enough indent to impact readability here.

@@ +929,5 @@
>      ABORT(R_INTERNAL);
>    }
>  
> +  if (addr->protocol != IPPROTO_UDP)
> +    ABORT(R_FAILED);

Better error code than this?

@@ +1095,5 @@
> +  MOZ_ASSERT(false);
> +  return R_INTERNAL;
> +}
> +
> +int NrSocketIpc::accept( nr_transport_addr *addrp, nr_socket **sockp) {

Weird ws

@@ +1181,5 @@
>  static int nr_socket_local_write(void *obj,const void *msg, size_t len,
>                                   size_t *written);
>  static int nr_socket_local_read(void *obj,void * restrict buf, size_t maxlen,
>                                  size_t *len);
> +static int nr_socket_local_listen(void *obj, int backlog);

Should backlog be a size_t?

::: media/mtransport/nr_socket_prsock.h
@@ +163,5 @@
>    virtual int connect(nr_transport_addr *addr);
>    virtual int write(const void *msg, size_t len, size_t *written);
>    virtual int read(void* buf, size_t maxlen, size_t *len);
> +  virtual int listen(int backlog);
> +  virtual int accept( nr_transport_addr *addrp, nr_socket **sockp);

Weird ws

@@ +217,5 @@
>    virtual int connect(nr_transport_addr *addr);
>    virtual int write(const void *msg, size_t len, size_t *written);
>    virtual int read(void* buf, size_t maxlen, size_t *len);
> +  virtual int listen(int backlog);
> +  virtual int accept( nr_transport_addr *addrp, nr_socket **sockp);

Weird ws

::: media/mtransport/nricectx.cpp
@@ +397,5 @@
>        NR_reg_set_uchar((char *)"ice.pref.interface.wlan0", 232);
>      }
>  
>      NR_reg_set_uint4((char *)"stun.client.maximum_transmits",4);
> +    NR_reg_set2_int4((char *)NR_ICE_REG_ICE_TCP_PRFX, NR_ICE_REG_ICE_TCP_SO_SOCK_COUNT, 3);

We cast one but not the other. What's up with that?

::: media/mtransport/nricectx.h
@@ +107,3 @@
>  
>   protected:
> +  NrIceStunServer(const char *transport) : addr_(), transport_(transport) {}

Make this explicit.

::: media/mtransport/nricemediastream.h
@@ +90,4 @@
>    NrIceAddr cand_addr;
>    NrIceAddr local_addr;
>    Type type;
> +  TcpType tcp_type;

You'll need to make the comparator in ice_unittest sensitive to this (grep for "so we can use stl containers/algorithms that need a comparator")

::: media/mtransport/test/buffered_stun_socket_unittest.cpp
@@ +86,5 @@
> +  virtual int listen(int backlog) {
> +    return 0;
> +  }
> +
> +  virtual int accept( nr_transport_addr *addrp, nr_socket **sockp) {

Weird ws

::: media/mtransport/test/ice_unittest.cpp
@@ +194,2 @@
>      stun_servers.push_back(*server);
>      ASSERT_TRUE(NS_SUCCEEDED(ice_ctx_->SetStunServers(stun_servers)));

Call function defined below?

@@ +398,5 @@
>      };
>  
> +    switch(cand.tcp_type) {
> +      case NrIceCandidate::ICE_NONE:
> +        tcp_type = "";

Redundant

@@ +1107,5 @@
> +    kNrIceTransportTcp);
> +  peer_->SetDNSResolver();
> +  Gather();
> +}
> +

Maybe we should have a gather test for both UDP and TCP?

@@ +1155,5 @@
>  
> +TEST_F(IceGatherTest, VerifyTestStunTcpServer) {
> +  UseFakeStunTcpServerWithResponse("192.0.2.233", 3333);
> +  Gather();
> +  ASSERT_TRUE(StreamHasMatchingCandidate(0, " 192.0.2.233 3333 "));

We should verify that the candidate is TCP.

@@ +1193,5 @@
> +  Gather(false);
> +  ASSERT_FALSE(StreamHasMatchingCandidate(0, "192.0.3.1"));
> +  TestStunTcpServer::GetInstance()->SetActive(true);
> +  WaitForGather();
> +  ASSERT_TRUE(StreamHasMatchingCandidate(0, "192.0.3.1"));

Verify that candidate is TCP

@@ +1196,5 @@
> +  WaitForGather();
> +  ASSERT_TRUE(StreamHasMatchingCandidate(0, "192.0.3.1"));
> +}
> +
> +

We probably want some sort of trickle test with both UDP and TCP.

@@ +1224,5 @@
> +  Connect();
> +}
> +
> +//TCP SO tests works on localhost only with delay applied:
> +//  tc qdisc add dev lo root netem delay 10ms

Why? Is there a way we can fix this?

@@ +1338,5 @@
> +  SendReceive();
> +}
> +
> +//TCP SO tests works on localhost only with delay applied:
> +//  tc qdisc add dev lo root netem delay 10ms

Why?

::: media/mtransport/test/multi_tcp_socket_unittest.cpp
@@ +3,5 @@
> +/* This Source Code Form is subject to the terms of the Mozilla Public
> + * License, v. 2.0. If a copy of the MPL was not distributed with this file,
> + * You can obtain one at http://mozilla.org/MPL/2.0/. */
> +
> +// Original author: ekr@rtfm.com

Are you sure about that?

@@ +38,5 @@
> +namespace {
> +
> +class MultiTcpSocketTest : public ::testing::Test {
> + public:
> +  MultiTcpSocketTest()

Does ::testing::Test disable copy c'tor and assignment operator already? If not, we should probably do so just in case.

@@ +52,5 @@
> +  }
> +
> +  static void SockReadable(NR_SOCKET s, int how, void *arg) {
> +    MultiTcpSocketTest *obj=static_cast<MultiTcpSocketTest *>(arg);
> +    obj->SetReadable(true);

This looks potentially racy (the so-called "benign" data race); it is probably easiest to use mozilla's implementation of Atomic<bool>

@@ +55,5 @@
> +    MultiTcpSocketTest *obj=static_cast<MultiTcpSocketTest *>(arg);
> +    obj->SetReadable(true);
> +  }
> +
> +  void Shutdown() {

Probably should call this Shutdown_s

@@ +64,5 @@
> +  }
> +
> +  void Create_s(nr_socket_tcp_type tcp_type, nr_socket **sock) {
> +    nr_transport_addr local;
> +    static unsigned short port_s=4000;

What happens if we hit a port that is in use?

@@ +77,5 @@
> +        &MultiTcpSocketTest::SockReadable, this);
> +    ASSERT_EQ(0, r);
> +  }
> +
> +  void Create(nr_socket *&sock, nr_socket_tcp_type tcp_type) {

Just pass a nr_socket* here.

@@ +126,5 @@
> +    ASSERT_EQ(0, r);
> +  }
> +
> +  void SendData(nr_socket *from, nr_socket *to, const char *data, size_t len) {
> +    SetReadable(false);

Why does SendData need to have this side-effect? Wouldn't it make a little more sense to put this at the end of RecvData?

@@ +133,5 @@
> +                this, &MultiTcpSocketTest::SendData_s, from, to, data, len),
> +            NS_DISPATCH_SYNC);
> +  }
> +
> +  void RecvData_s(nr_socket *from, nr_socket *to, const char *data,

Might be more readable to prefix from, data and len with "expected_", and rename data2 to something like "received_data".

@@ +144,5 @@
> +    int r=nr_socket_getaddr(to, &addr_to);
> +    ASSERT_EQ(0, r);
> +    r=nr_socket_getaddr(from, &addr_from);
> +    ASSERT_EQ(0, r);
> +    printf("Recieving %s <- %s\n", addr_to.as_string, addr_from.as_string);

NIT: Typo

@@ +145,5 @@
> +    ASSERT_EQ(0, r);
> +    r=nr_socket_getaddr(from, &addr_from);
> +    ASSERT_EQ(0, r);
> +    printf("Recieving %s <- %s\n", addr_to.as_string, addr_from.as_string);
> +    r=nr_socket_recvfrom(to, data2, len, &retlen, 0, &retaddr);

This will fail to detect cases where recvfrom would have handed us more data than expected. Might be the desired behavior.

@@ +167,5 @@
> +  void TransferData(nr_socket *from, nr_socket *to) {
> +    const char data[]="Test data 1234567890";
> +
> +    SendData(from, to, data, sizeof(data));
> +    RecvData(from, to, data, sizeof(data));

I don't see any place where we call SendData twice, and then RecvData twice, in that order. Probably worth testing somewhere.
Also, let's vary this data, so we don't miss stuff like replaying the same buffer.

@@ +171,5 @@
> +    RecvData(from, to, data, sizeof(data));
> +  }
> +
> + protected:
> +  bool IsReadable() {

const

@@ +197,5 @@
> +  Connect(socks[1], socks[0]);
> +  Connect(socks[2], socks[0]);
> +}
> +
> +TEST_F(MultiTcpSocketTest, TestTransferData1) {

Let's pick a more descriptive name, say TestListenerTransmits

@@ +202,5 @@
> +  Create(socks[0], TCP_TYPE_ACTIVE);
> +  Create(socks[1], TCP_TYPE_PASSIVE);
> +  Listen(socks[1]);
> +  Connect(socks[0], socks[1]);
> +//  PR_Sleep(PR_MillisecondsToInterval(200)); // wait for connect

Do we need this or not? Same question in a few other places.

@@ +207,5 @@
> +  TransferData(socks[1], socks[0]);
> +}
> +
> +
> +TEST_F(MultiTcpSocketTest, TestTransferData2) {

Need a better name, suggest TestTwoActiveBidirectionalTransmit

@@ +252,5 @@
> +int main(int argc, char **argv)
> +{
> +  test_utils = new MtransportTestUtils();
> +  NSS_NoDB_Init(nullptr);
> +  NSS_SetDomesticPolicy();

Is there a reason we're doing this in a non-TLS unit-test?

::: media/mtransport/test/stunserver.cpp
@@ +475,5 @@
> +  instance = nullptr;
> +}
> +
> +
> +int TestStunTcpServer::TryOpenListenSocket(nr_local_addr* addr, uint16_t port) {

A little more copy/paste than I like, might be useful to have a common function that fills out the nr_local_addr.

@@ +503,5 @@
> +
> +    return 0;
> +}
> +
> +TestStunTcpServer* TestStunTcpServer::Create() {

This is way too much copy/paste. Please refactor and add virtual functions as necessary to make this better.

::: media/mtransport/test/stunserver.h
@@ +60,4 @@
>  
> +  static void readable_cb(NR_SOCKET sock, int how, void *cb_arg);
> +
> + private:

Thanks for going through the effort to get some encapsulation here.

@@ +95,5 @@
>  
> +  static TestStunTcpServer* instance;
> +  static uint16_t instance_port;
> +};
> +};

Should be
} // End of namespace mozilla

::: media/mtransport/third_party/nICEr/src/ice/ice_candidate.c
@@ +180,5 @@
>          ABORT(R_BAD_ARGS);
>      }
> +
> +    if (tcp_type)
> +      strncat(label, nr_tcp_type_name(tcp_type), sizeof(label)-1);

Pretty sure there's no whitespace between this token and the stuff preceding it.

@@ +369,5 @@
>        case HOST:
>          if(r=NR_reg_get_uchar(NR_ICE_REG_PREF_TYPE_HOST,&type_preference))
>            ABORT(r);
> +        if(cand->base.protocol == IPPROTO_TCP){
> +          type_preference--;

Magic constant, only less readable. Prefer making this configurable, but I can live with it just being a constant. Same for the other, similar, cases.

@@ +370,5 @@
>          if(r=NR_reg_get_uchar(NR_ICE_REG_PREF_TYPE_HOST,&type_preference))
>            ABORT(r);
> +        if(cand->base.protocol == IPPROTO_TCP){
> +          type_preference--;
> +          if(cand->tcp_type == TCP_TYPE_ACTIVE){

Copy/paste. I think it makes sense to move the code that sets direction_priority out of the switch statement, since it will be a no-op for TCP_TYPE_NONE

@@ +397,5 @@
>            r_log(LOG_ICE,LOG_ERR,"Unknown protocol type %u",
>                  (unsigned int)cand->base.protocol);
>            ABORT(R_INTERNAL);
>          }
> +        stun_priority=31-cand->stun_server->index;

Hmm. If we are going to revise the maximum index for stun_servers, there is code in ice_ctx.c that needs to be modified (look for the string "255 is the max for our priority algorithm"). That said, I wonder if we're going to see services that want to use this many. If so, we could maybe steal a bit from the interface priority. I'm inclined to say we can cross that bridge when we reach it, but opinions may differ. (We could also do a better job of pruning duplicates from the list of servers)

@@ +415,5 @@
> +          else if(cand->tcp_type == TCP_TYPE_SO){
> +            direction_priority=6;
> +          }
> +        }
> +        stun_priority=31-cand->stun_server->index;

Redundant.

@@ +592,5 @@
>      if(r=nr_transport_addr_copy(&cand->stun_server_addr,addr))
>        ABORT(r);
>  
> +    if (cand->tcp_type == TCP_TYPE_PASSIVE || cand->tcp_type == TCP_TYPE_SO){
> +      if (r=nr_socket_multi_tcp_store_stun_server(cand->osock, addr))

It is not totally obvious that this causes a connection to be formed. Recommend renaming.

Also, it is not clear to me why we must call this only for stun servers that we needed to do a DNS lookup for; presumably sendto is doing the same work for servers with an IP address, why can't it do that for these too?

@@ +755,5 @@
>        cand->u.srvrflx.stun_handle=0;
>      }
>  
> +    if (cand->tcp_type){
> +      if ((r=nr_socket_multiple_tcp_remove_stun_server(cand->osock, &cand->stun_server_addr)))

We aren't supposed to be disconnecting here (see section 4.1, search for "Once the candidates have been obtained"). We're supposed to wait until ICE is complete.

Also, the naming of this function could make it more clear that we are disconnecting.

Lastly, if this function fails, I think it would be appropriate to assert instead of ABORT. Probably should keep that in mind when moving it.

::: media/mtransport/third_party/nICEr/src/ice/ice_candidate_pair.c
@@ +213,5 @@
>            return;
>          }
>          /* Fall through */
>        case NR_STUN_CLIENT_STATE_TIMED_OUT:
> +        r_log(LOG_ICE,LOG_DEBUG,"ICE-PEER(%s)/CAND-PAIR(%s): Timed out",pair->pctx->label,pair->codeword);

This isn't always true, given the fallthrough above.

@@ +263,5 @@
>  
>            /* OK, nothing found, must be peer reflexive */
>            if(!cand){
> +            nr_socket_tcp_type tcptype = pair->local->tcp_type;
> +            switch (tcptype){

This looks iffy to me.
-How would we ever get a response on a passive candidate pair?
-For SO, is there any reason to make this peer reflexive pair any type other than SO?
-Same question for active.

(The language in the base ICE spec says that the new peer reflexive candidate is paired only with the remote candidate from the original pair, meaning it would have to have the same type as the original)

::: media/mtransport/third_party/nICEr/src/ice/ice_component.c
@@ +305,5 @@
> +  {
> +    int r,_status;
> +    nr_ice_candidate *cand=0;
> +    char label[256];
> +    int i=3;

Call this "tries" or something.

@@ +309,5 @@
> +    int i=3;
> +
> +    do{
> +      if (!i--)
> +        ABORT(R_FAILED);

Wouldn't it make more sense to ABORT(r), since the value of r is why we tried again?

@@ +318,5 @@
> +
> +      if((r=nr_transport_addr_fmt_addr_string(addr)))
> +        ABORT(r);
> +
> +      r=nr_socket_multi_tcp_create(ctx->stun_servers,ctx->stun_server_ct,addr,

Looking at the implementation, there are lots of reasons other than "the port wasn't available" that it could fail, most of which should definitely cause us to ABORT immediately. I don't know whether it is easy to tell the difference, though.

@@ +319,5 @@
> +      if((r=nr_transport_addr_fmt_addr_string(addr)))
> +        ABORT(r);
> +
> +      r=nr_socket_multi_tcp_create(ctx->stun_servers,ctx->stun_server_ct,addr,
> +        tcp_type,so_sock_ct,1,NR_MAX_FRAME_SIZE<<1,nrsock);

It looks like a subsequent ABORT will leave the caller with the nrsock out param set. We probably want to assign this to a temp variable, and only assign to the out param once we know everything worked. We also need to clean up on failure.

@@ +325,5 @@
> +    } while(r);
> +
> +    if((tcp_type == TCP_TYPE_PASSIVE) && (r=nr_socket_listen(*nrsock,1)))
> +      ABORT(r);
> +    if((r=nr_ice_socket_create(ctx,component,*nrsock,NR_ICE_SOCKET_TYPE_STREAM_TCP,isock)))

Subsequent ABORT leaves caller with the isock out param set.

@@ +331,5 @@
> +    if((r=nr_ice_candidate_create(ctx,component,*isock,*nrsock,HOST,tcp_type,0,
> +      component->component_id,&cand)))
> +      ABORT(r);
> +
> +    TAILQ_INSERT_TAIL(&component->candidates,cand,entry_comp);

We probably want to do this insert last, in case something after fails.

@@ +334,5 @@
> +
> +    TAILQ_INSERT_TAIL(&component->candidates,cand,entry_comp);
> +    component->candidate_ct++;
> +
> +    /* Create a STUN server context for this socket */

There are now three basically identical copies of this code. Time to break it out into its own function.

@@ +401,5 @@
> +        TCP_TYPE_PASSIVE, 0, lufrag, pwd, &sock_psv, &isock_psv)))
> +        ABORT(r);
> +
> +      /* simultaneous-open host candidate */
> +      if ((r=nr_ice_component_create_tcp_host_candidate(ctx, component, &addr,

This call leaves the ephemeral port in addr to be used by nr_socket_local_create below for the TURN TCP candidates. Is this intended? Maybe we want to do do this stuff last?

@@ +405,5 @@
> +      if ((r=nr_ice_component_create_tcp_host_candidate(ctx, component, &addr,
> +        TCP_TYPE_SO, so_sock_ct, lufrag, pwd, &sock_so, &isock_so)))
> +        ABORT(r);
> +
> +      /* And srvrflx candidates for each STUN server */

Don't we also want to do this for each TURN server, similar to the logic in nr_ice_component_initialize_udp (grep for "both a srvrflx and relayed candidate")? Also, lots of code duplication here, maybe there is some way to reduce that doesn't involve macros.

@@ +480,5 @@
>            ABORT(r);
>  
>          STAILQ_INSERT_TAIL(&component->sockets,isock,entry);
>        }
> +#endif

Wow, this wasn't in the right place before. Good catch.

@@ +745,5 @@
>  
> +        /* Try to find matching peer active tcp candidate */
> +        peer_cand=TAILQ_FIRST(&comp->candidates);
> +        while(peer_cand){
> +          if(/*peer_cand->state == NR_ICE_CAND_PEER_CANDIDATE_UNPAIRED &&*/

This comment needs to go away, right?

@@ +760,5 @@
> +          if(r=nr_ice_peer_peer_rflx_candidate_create(comp->stream->pctx->ctx,"prflx",comp,&req->src_addr,&pcand)) {
> +            *error=(r==R_NO_MEMORY)?500:400;
> +            ABORT(r);
> +          }
> +          peer_cand=pcand;

The naming was a little unclear before, but now it is downright confusing. Perhaps rename "pcand" to "new_peer_cand"? Or maybe just use peer_cand and have an int created_peer_cand that determines whether we perform an insert (or destroy on abort)?

::: media/mtransport/third_party/nICEr/src/ice/ice_parser.c
@@ +177,5 @@
>      /* Protocol */
> +    if (!strncasecmp(str, "UDP", 3))
> +      transport=IPPROTO_UDP;
> +    else {
> +      if (!strncasecmp(str, "TCP", 3))

Just use "else if(...) {" here for readability.

@@ +330,5 @@
>  
>      skip_whitespace(&str);
>  
> +    /* Parse tcptype extension per RFC 6544 S 4.5 */
> +    if (transport == IPPROTO_TCP){

None of this is valid if we're talking about TCP relayed.

@@ +335,5 @@
> +      if (strncasecmp("tcptype", str, 7))
> +        ABORT(R_BAD_DATA);
> +
> +      fast_forward(&str, 7);
> +      if (*str == '\0')

skip_whitespace() does the right thing if we're on a null, so this is not necessary.

@@ +338,5 @@
> +      fast_forward(&str, 7);
> +      if (*str == '\0')
> +        ABORT(R_BAD_DATA);
> +
> +      skip_whitespace(&str);

Seems to me that "tcptypeso" would parse successfully here, but is probably not something we care about. Just be aware.

@@ +342,5 @@
> +      skip_whitespace(&str);
> +      if (*str == '\0')
> +          ABORT(R_BAD_DATA);
> +
> +      assert(nr_ice_candidate_tcp_type_names[0] == 0);

The code below doesn't depend on this being true.

@@ +344,5 @@
> +          ABORT(R_BAD_DATA);
> +
> +      assert(nr_ice_candidate_tcp_type_names[0] == 0);
> +      for (i = 1; nr_ice_candidate_tcp_type_names[i]; ++i) {
> +          if(!strncasecmp(nr_ice_candidate_tcp_type_names[i], str, strlen(nr_ice_candidate_tcp_type_names[i]))) {

Seems like "tcptype soylentgreenispeople" would parse successfully here. We almost certainly won't see extensions that add new tcptypes, so maybe not a big deal.

@@ +350,5 @@
> +              break;
> +          }
> +      }
> +
> +      if (nr_ice_candidate_tcp_type_names[i] == 0)

Instead of allowing the scope of 'i' to spill down here, why not just use cand->tcp_type == 0, and do the fast_forward before the break?

::: media/mtransport/third_party/nICEr/src/ice/ice_socket.c
@@ +225,5 @@
> +        ABORT(r);
> +      NR_ASYNC_WAIT(fd,NR_ASYNC_WAIT_READ,nr_ice_socket_readable_cb,sock);
> +    }
> +    else {
> +      if((r=nr_socket_multi_tcp_set_readable_cb(nsock,nr_ice_socket_readable_cb,sock)))

Would be nice to put a comment as to why nr_ice_socket_readable_cb isn't hooked up in the usual way.

::: media/mtransport/third_party/nICEr/src/net/nr_socket.c
@@ +48,5 @@
>      if(!(sock=RCALLOC(sizeof(nr_socket))))
>        ABORT(R_NO_MEMORY);
>  
> +    assert(vtbl->version >= 1 && vtbl->version <= 2);
> +    if (vtbl->version < 1 && vtbl->version > 2)

Gonna be pretty tough for both of these to be true.

::: media/mtransport/third_party/nICEr/src/net/nr_socket.h
@@ +50,5 @@
>  #elif defined(WIN32)
>  #define restrict __restrict
>  #endif
>  
> +typedef enum {TCP_TYPE_NONE=0, TCP_TYPE_ACTIVE, TCP_TYPE_PASSIVE, TCP_TYPE_SO, TCP_TYPE_MAX} nr_socket_tcp_type;

Please break up over multiple lines.

::: media/mtransport/third_party/nICEr/src/net/nr_socket_framed_tcp.c
@@ +1,1 @@
> +/*

This file is almost completely a copy/paste of nr_socket_buffered_stun.c. Please refactor this to reduce the amount of copied code. Also, Adobe was on the original license block, so it needs to stay for anything substantially similar. Apply the same logic to other files as needed.

::: media/mtransport/third_party/nICEr/src/net/nr_socket_multi_tcp.c
@@ +252,5 @@
> +
> +    return 0;
> +  }
> +
> +int nr_socket_multi_tcp_store_stun_server(nr_socket *sock,

Lots of overlap between this and nr_socket_multi_tcp_connect. Please refactor.
Attachment #8391222 - Flags: review?(docfaraday) → review-
Comment on attachment 8391222 [details] [diff] [review]
ICE-TCP patch

Review of attachment 8391222 [details] [diff] [review]:
-----------------------------------------------------------------

::: media/mtransport/nr_socket_prsock.cpp
@@ +1181,5 @@
>  static int nr_socket_local_write(void *obj,const void *msg, size_t len,
>                                   size_t *written);
>  static int nr_socket_local_read(void *obj,void * restrict buf, size_t maxlen,
>                                  size_t *len);
> +static int nr_socket_local_listen(void *obj, int backlog);

Could be, but backlog in PR_Listen and also in listen function is an int

::: media/mtransport/nricectx.h
@@ +107,3 @@
>  
>   protected:
> +  NrIceStunServer(const char *transport) : addr_(), transport_(transport) {}

Why? this way it is consistent with the original code.

::: media/mtransport/test/ice_unittest.cpp
@@ +1224,5 @@
> +  Connect();
> +}
> +
> +//TCP SO tests works on localhost only with delay applied:
> +//  tc qdisc add dev lo root netem delay 10ms

It's because connect to closed port on localhost ends up immediately with Connection refused error. The second connect has to be called before the SYN packet from the first connect is delivered.

::: media/mtransport/third_party/nICEr/src/ice/ice_candidate.c
@@ +592,5 @@
>      if(r=nr_transport_addr_copy(&cand->stun_server_addr,addr))
>        ABORT(r);
>  
> +    if (cand->tcp_type == TCP_TYPE_PASSIVE || cand->tcp_type == TCP_TYPE_SO){
> +      if (r=nr_socket_multi_tcp_store_stun_server(cand->osock, addr))

No, sendto is not doing that work, create does. It's because it pre-creates sockets for all stun servers and sets to not use framing mechanism on them. For STUN servers with IP addresses it can immediately connect, but for ones with host name, it has to wait to resolve it.
It can't be moved to sendto, because sendto doesn't know whether it sends to stun server (do not use framing) or to peer ICE candidate (use framing).

@@ +755,5 @@
>        cand->u.srvrflx.stun_handle=0;
>      }
>  
> +    if (cand->tcp_type){
> +      if ((r=nr_socket_multiple_tcp_remove_stun_server(cand->osock, &cand->stun_server_addr)))

I don't see a conflict here. Passive candidate's socket is still listening and SO candidate still has socket connected, only sockets used to get srflx from STUN are closed, when they are not needed anymore. 
Section 11.2, says that connections to STUN servers SHOULD be closed.

::: media/mtransport/third_party/nICEr/src/ice/ice_component.c
@@ +405,5 @@
> +      if ((r=nr_ice_component_create_tcp_host_candidate(ctx, component, &addr,
> +        TCP_TYPE_SO, so_sock_ct, lufrag, pwd, &sock_so, &isock_so)))
> +        ABORT(r);
> +
> +      /* And srvrflx candidates for each STUN server */

Yes, TURN server could be used to get srflx candidates derived from one of tcp host candidates (probably passive), but it's not trivial and I don't have access to any TURN server to check if I didn't break existing TURN functionality.
Attached patch ICE-TCP patch (obsolete) — — Splinter Review
Attachment #8391222 - Attachment is obsolete: true
Attachment #8397059 - Flags: review?(docfaraday)
(In reply to Peter Tatrai from comment #8)
> Comment on attachment 8391222 [details] [diff] [review]
> ICE-TCP patch
> 
> Review of attachment 8391222 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: media/mtransport/nr_socket_prsock.cpp
> @@ +1181,5 @@
> >  static int nr_socket_local_write(void *obj,const void *msg, size_t len,
> >                                   size_t *written);
> >  static int nr_socket_local_read(void *obj,void * restrict buf, size_t maxlen,
> >                                  size_t *len);
> > +static int nr_socket_local_listen(void *obj, int backlog);
> 
> Could be, but backlog in PR_Listen and also in listen function is an int
> 
> ::: media/mtransport/nricectx.h
> @@ +107,3 @@
> >  
> >   protected:
> > +  NrIceStunServer(const char *transport) : addr_(), transport_(transport) {}
> 
> Why? this way it is consistent with the original code.
>

   Then the original code probably should be modified too. I can look into that.

> ::: media/mtransport/test/ice_unittest.cpp
> @@ +1224,5 @@
> > +  Connect();
> > +}
> > +
> > +//TCP SO tests works on localhost only with delay applied:
> > +//  tc qdisc add dev lo root netem delay 10ms
> 
> It's because connect to closed port on localhost ends up immediately with
> Connection refused error. The second connect has to be called before the SYN
> packet from the first connect is delivered.
>

   Is this going to be racy in practice, I wonder?

> ::: media/mtransport/third_party/nICEr/src/ice/ice_candidate.c
> @@ +592,5 @@
> >      if(r=nr_transport_addr_copy(&cand->stun_server_addr,addr))
> >        ABORT(r);
> >  
> > +    if (cand->tcp_type == TCP_TYPE_PASSIVE || cand->tcp_type == TCP_TYPE_SO){
> > +      if (r=nr_socket_multi_tcp_store_stun_server(cand->osock, addr))
> 
> No, sendto is not doing that work, create does. It's because it pre-creates
> sockets for all stun servers and sets to not use framing mechanism on them.
> For STUN servers with IP addresses it can immediately connect, but for ones
> with host name, it has to wait to resolve it.
> It can't be moved to sendto, because sendto doesn't know whether it sends to
> stun server (do not use framing) or to peer ICE candidate (use framing).
> 

   Ok, gotcha.

> @@ +755,5 @@
> >        cand->u.srvrflx.stun_handle=0;
> >      }
> >  
> > +    if (cand->tcp_type){
> > +      if ((r=nr_socket_multiple_tcp_remove_stun_server(cand->osock, &cand->stun_server_addr)))
> 
> I don't see a conflict here. Passive candidate's socket is still listening
> and SO candidate still has socket connected, only sockets used to get srflx
> from STUN are closed, when they are not needed anymore. 
> Section 11.2, says that connections to STUN servers SHOULD be closed.
> 

   11.2 specifies behavior once ICE is completed, but that hasn't happened yet here; we've only just gathered a candidate. There's no way the other side knows about this yet, and we're already closing the connection. This is likely to result in the NAT destroying the port mapping, correct? (Also, what is the NAT going to do with the port mapping to our peer when it sees us close the connection to the STUN server from the same port?)

> ::: media/mtransport/third_party/nICEr/src/ice/ice_component.c
> @@ +405,5 @@
> > +      if ((r=nr_ice_component_create_tcp_host_candidate(ctx, component, &addr,
> > +        TCP_TYPE_SO, so_sock_ct, lufrag, pwd, &sock_so, &isock_so)))
> > +        ABORT(r);
> > +
> > +      /* And srvrflx candidates for each STUN server */
> 
> Yes, TURN server could be used to get srflx candidates derived from one of
> tcp host candidates (probably passive), but it's not trivial and I don't
> have access to any TURN server to check if I didn't break existing TURN
> functionality.

   You can get some temporary TURN credentials from apprtc like this, and set them using the TURN_SERVER_USER, TURN_SERVER_PASSWORD, and TURN_SERVER_ADDRESS (this takes only the IP address/hostname, no port) environment variables on ice_unittest:

curl -H 'origin:https://apprtc.appspot.com' -H 'user-agent:Mozilla/5.0' "https://computeengineondemand.appspot.com/turn?username=74912086&key=4080218913"
(In reply to Byron Campen [:bwc] from comment #10)

> > ::: media/mtransport/test/ice_unittest.cpp
> > @@ +1224,5 @@
> > > +  Connect();
> > > +}
> > > +
> > > +//TCP SO tests works on localhost only with delay applied:
> > > +//  tc qdisc add dev lo root netem delay 10ms
> > 
> > It's because connect to closed port on localhost ends up immediately with
> > Connection refused error. The second connect has to be called before the SYN
> > packet from the first connect is delivered.
> >
> 
>    Is this going to be racy in practice, I wonder?
> 
Yes.

> > @@ +755,5 @@
> > >        cand->u.srvrflx.stun_handle=0;
> > >      }
> > >  
> > > +    if (cand->tcp_type){
> > > +      if ((r=nr_socket_multiple_tcp_remove_stun_server(cand->osock, &cand->stun_server_addr)))
> > 
> > I don't see a conflict here. Passive candidate's socket is still listening
> > and SO candidate still has socket connected, only sockets used to get srflx
> > from STUN are closed, when they are not needed anymore. 
> > Section 11.2, says that connections to STUN servers SHOULD be closed.
> > 
> 
>    11.2 specifies behavior once ICE is completed, but that hasn't happened
> yet here; we've only just gathered a candidate. There's no way the other
> side knows about this yet, and we're already closing the connection. This is
> likely to result in the NAT destroying the port mapping, correct? (Also,
> what is the NAT going to do with the port mapping to our peer when it sees
> us close the connection to the STUN server from the same port?)
> 

You are right. I'm removing this. It will close STUN server connections together with the others in candidate's destroy function.
Attached patch ICE-TCP patch (obsolete) — — Splinter Review
- removed closing of STUN server connection in stun_finished_cb
- modified TURN TCP according review comment
Attachment #8397059 - Attachment is obsolete: true
Attachment #8397059 - Flags: review?(docfaraday)
Attachment #8397818 - Flags: review?(docfaraday)
Comment on attachment 8397818 [details] [diff] [review]
ICE-TCP patch

Review of attachment 8397818 [details] [diff] [review]:
-----------------------------------------------------------------

This doesn't compile:

/home/bcampen/checkouts/mozilla-central/media/mtransport/third_party/nICEr/src/net/nr_socket_multi_tcp.c:166:7: error: 'continue' statement not in loop statement
Attachment #8397818 - Flags: review?(docfaraday) → review-
Attached patch ICE TCP patch (obsolete) — — Splinter Review
This one compiles.
Attachment #8397818 - Attachment is obsolete: true
Attachment #8398368 - Flags: review?(docfaraday)
Just making it clear that I am looking at this, I'll probably be done by tomorrow.
Comment on attachment 8398368 [details] [diff] [review]
ICE TCP patch

Review of attachment 8398368 [details] [diff] [review]:
-----------------------------------------------------------------

This is a lot closer to ready, although there are still some things that I don't think will work correctly, particularly regarding the handling of pre-allocated TCP sockets. Most of the refactoring I asked for in the previous review is good, although it needs some fixup.

Also, there is another compile error, but it is not your fault. I am uploading a patch to handle that issue in a moment (verified that it works on linux, have a try run to check other platforms).

Unfortunately, someone has regressed signaling_unittests, so I can't tell if that test works with your changes. The other unit-tests seem fine.

::: media/mtransport/nr_socket_prsock.cpp
@@ +408,5 @@
>  
>    PRStatus status;
>    PRNetAddr naddr;
>  
> +  PRSocketOptionData opt_nonblock;

Since this is c++ code, there is nothing forcing us to put all of the variable declarations up top. The best practice is generally to put variables where they are used.

@@ +434,5 @@
>        if (!(fd_ = PR_NewTCPSocket())) {
>          r_log(LOG_GENERIC,LOG_CRIT,"Couldn't create socket");
>          ABORT(R_INTERNAL);
>        }
> +      // Set ReuseAddr for TCP sockets to enable having several

I wonder if we should make this configurable? How likely is it that this option will fail? If it is likely, we probably don't want it to cause TURN TCP failures where it isn't needed. Security-wise, I guess there probably isn't any harm done, anything that a malicious process can do to mess us up it could probably already do externally...

@@ +473,5 @@
>    }
>  
>  
>    // Set nonblocking
> +  opt_nonblock.option = PR_SockOpt_Nonblocking;

Just leave the opt_nonblock variable declaration down here.

@@ +636,5 @@
>    status = PR_Write(fd_, msg, len);
>    if (status < 0) {
>      if (PR_GetError() == PR_WOULD_BLOCK_ERROR)
>        ABORT(R_WOULDBLOCK);
> +    r_log(LOG_GENERIC, LOG_INFO, "Error in write");

Why INFO here but ERR for recvfrom?

@@ +659,5 @@
>    status = PR_Read(fd_, buf, maxlen);
>    if (status < 0) {
>      if (PR_GetError() == PR_WOULD_BLOCK_ERROR)
>        ABORT(R_WOULDBLOCK);
> +    r_log(LOG_GENERIC, LOG_INFO, "Error in read");

Similar question.

@@ +932,5 @@
>    if (state_ != NR_INIT) {
>      ABORT(R_INTERNAL);
>    }
>  
> +  if (addr->protocol != IPPROTO_UDP)

Maybe MOZ_ASSERT here instead, since the value will just be ignored otherwise (ie; will not interfere with operation, just won't do what the developer expected)? If we want to ABORT still, R_BAD_ARGS is probably what we want.

@@ +1099,5 @@
> +  MOZ_ASSERT(false);
> +  return R_INTERNAL;
> +}
> +
> +int NrSocketIpc::accept( nr_transport_addr *addrp, nr_socket **sockp) {

I hate to complain about whitespace, but if I don't someone will give me grief.

::: media/mtransport/nr_socket_prsock.h
@@ +130,5 @@
>                   public nsASocketHandler {
>  public:
>    NrSocket() : fd_(nullptr) {}
>    virtual ~NrSocket() {
> +    if (fd_)

Good catch.

::: media/mtransport/nricectx.h
@@ +107,3 @@
>  
>   protected:
> +  NrIceStunServer(const char *transport) : addr_(), transport_(transport) {}

So, the original code did not have an argument, now it is a single-arg constructor. A good rule of thumb is that single-arg c'tors should be declared explicit to avoid promiscuous type conversion, unless this is something you actually need.

::: media/mtransport/test/multi_tcp_socket_unittest.cpp
@@ +135,5 @@
> +                this, &MultiTcpSocketTest::SendData_s, from, to, data, len),
> +            NS_DISPATCH_SYNC);
> +  }
> +
> +  void RecvData_s(nr_socket *expected_from, nr_socket *expected_to,

We don't have any expectations on expected_to.

::: media/mtransport/test/stunserver.cpp
@@ +316,2 @@
>  
>    r = nr_stun_find_local_addresses(addrs, 100, &addr_ct);

Looks like this stuff can go away now, since Initialize handles this, right? The only thing we need is the getfd call now, I think.

::: media/mtransport/third_party/nICEr/src/ice/ice_candidate.c
@@ +180,5 @@
>          ABORT(R_BAD_ARGS);
>      }
> +
> +    if (tcp_type) {
> +      size_t slen=strlen(label)+1; /* plus space going to be added*/

I'd be perfectly happy with two calls to strncat, but this works too.

@@ +445,5 @@
> +            direction_priority=2;
> +          else
> +            direction_priority=6;
> +          break;
> +      }

We probably need a default (no-op) case if only to keep the compiler from complaining.

::: media/mtransport/third_party/nICEr/src/ice/ice_component.c
@@ +339,5 @@
> +
> +      if((r=nr_transport_addr_fmt_addr_string(&addr)))
> +        ABORT(r);
> +
> +      r=nr_socket_multi_tcp_create(ctx->stun_servers,ctx->stun_server_ct,

I looked into it, and it is going to require significant work to determine whether this is because the port was in use, or if there was some other failure. Just leave a comment explaining this caveat.

@@ +351,5 @@
> +
> +    if((r=nr_ice_socket_create(ctx,component,nrsock,NR_ICE_SOCKET_TYPE_STREAM_TCP,&isock_tmp)))
> +      ABORT(r);
> +
> +    /* nr_ice_socket took ownership of nrsock_tmp */

s/nrsock_tmp/nrsock/

@@ +373,5 @@
> +
> +    _status=0;
> +abort:
> +    if (_status) {
> +      nr_ice_candidate_destroy(&cand);

I _think_ this will be ok, since the ice_socket is destroyed too, but I worry that since this is the only place where we destroy a candidate we just created, it is likely that someone might change nr_ice_candidate in a way that makes this unsafe. It looks safe to create the stun server context, and then the candidate (last).

@@ +397,3 @@
>      for(i=0;i<addr_ct;i++){
>        char suppress;
> +      nr_ice_socket *isock_psv=0;

Ok, I like this. Much more clear to me what is going on.

@@ +462,4 @@
>  
> +        /* srvrflx */
> +        if(r=nr_ice_candidate_create(ctx,component,
> +            isock_psv,isock_psv->sock,SERVER_REFLEXIVE,TCP_TYPE_PASSIVE,

Hmm. I'm not sure we want to be using the same socket for both TURN TCP and passive ICE TCP. I think that we probably just want to establish server reflexive ICE TCP candidates on the TURN servers exactly like we do for the STUN server loop above, and keep that logic isolated from TURN TCP.

@@ +751,5 @@
>  
> +        /* Try to find matching peer active tcp candidate */
> +        pcand=TAILQ_FIRST(&comp->candidates);
> +        while(pcand){
> +          if(/*peer_cand->state == NR_ICE_CAND_PEER_CANDIDATE_UNPAIRED &&*/

Does this comment need to go?

@@ +790,5 @@
>            ABORT(r);
>          }
>  
>          /* Do this last, since any call to ABORT will destroy pcand */
> +        if (pcand && new_pcand_created){

I don't think we need to check pcand, since we dereference above.

::: media/mtransport/third_party/nICEr/src/ice/ice_parser.c
@@ +328,5 @@
>  
>      skip_whitespace(&str);
>  
> +    /* Parse tcptype extension per RFC 6544 S 4.5 */
> +    if (!strncasecmp("tcptype ", str, 8)) {

Would this successfully parse a non-relayed TCP candidate without a tcptype? (ie; should we be doing something like "if (transport == IPPROTO_TCP && cand->type != RELAYED)"?)

@@ +333,5 @@
> +      fast_forward(&str, 8);
> +
> +      skip_whitespace(&str);
> +      if (*str == '\0')
> +          ABORT(R_BAD_DATA);

Do we need this check? (won't we end up with cand->tcp_type == 0, and fail later on anyway?)

::: media/mtransport/third_party/nICEr/src/net/nr_socket_multi_tcp.c
@@ +58,5 @@
> +    if (!objp || !*objp)
> +      return(0);
> +
> +    nr_socket_destroy(&(*objp)->inner);
> +    RFREE(*objp);

Let's assert(!destroy_timer) in here somewhere (or maybe even make this a no-op if that happens?)

@@ +78,5 @@
> +    nr_tcp_socket_ctx *sock = 0;
> +    nr_socket *tcpsock;
> +
> +    if (!(sock = RCALLOC(sizeof(nr_tcp_socket_ctx))))
> +      ABORT(R_NO_MEMORY);

If we are taking ownership of nrsock, we should destroy it if this happens.

@@ +88,5 @@
> +
> +    sock->inner=tcpsock;
> +    sock->is_framed=is_framed;
> +
> +    if ((r=nr_ip4_port_to_transport_addr(INADDR_ANY, 0, IPPROTO_TCP, &sock->remote_addr)))

It probably doesn't matter for INADDR_ANY on pretty much any platform, but this function takes host byte order; it is probably a good idea to use ntohl just to make this clear.

@@ +167,5 @@
> +
> +    if ((r=nr_socket_local_create(addr, &nrsock)))
> +      ABORT(r);
> +
> +    if ((r=nr_tcp_socket_ctx_create(nrsock, 0, max_pending, &tcp_socket_ctx))){

Probably useful to comment that this takes ownership of nrsock whether it fails or not.

@@ +174,5 @@
> +
> +    TAILQ_INSERT_TAIL(&sock->sockets, tcp_socket_ctx, entry);
> +
> +    if (stun_server->type == NR_ICE_STUN_SERVER_TYPE_ADDR){
> +      nr_transport_addr addr;

Shadowing, please use a different (ideally more descriptive) name.

@@ +234,5 @@
> +      for (i=0; i<precreated_so_count; ++i) {
> +
> +        if ((r=nr_socket_local_create(addr, &nrsock)))
> +          ABORT(r);
> +        if ((r=nr_tcp_socket_ctx_create(nrsock, use_framing, max_pending, &tcp_socket_ctx))){

Might be useful to comment that this takes ownership of nrsock even when errors occur.

@@ +247,5 @@
> +
> +    _status=0;
> +  abort:
> +    if (_status) {
> +      void *sock_v = sock;

Why do we need this temp variable?

@@ +278,5 @@
> +    TAILQ_FOREACH(tcp_sock_ctx, &sock->sockets, entry) {
> +      if (stun_only && tcp_sock_ctx->is_framed)
> +        continue;
> +
> +      if (!nr_transport_addr_is_wildcard(&tcp_sock_ctx->remote_addr)) {

What happens if we have a STUN server with an FQDN, followed by a STUN server with an IP address? Don't we end up with INADDR_ANY, followed by a initted IP address? Will a subsequent ...get_sock_connected_to for the STUN server with the IP address (as a result of a sendto) lay claim to the INADDR_ANY socket reserved for the one with the FQDN? I don't think we can assume that INADDR_ANY always comes last, so we probably need to do two passes. And even then, we have to be very careful what we lay claim to; for instance, we don't want a stun-only socket reserved for a STUN server with FQDN to be claimed for some other purpose, leaving it with no stun-only sockets to work with. Similarly, I see a high probability of something that expects to get a framed socket getting a non-framed one. We probably want to have three modes of behavior: 

1. We are looking only for framed, and will connect if necessary, and maybe even create new sockets in the ACTIVE case (this is nr_socket_multi_tcp_connect)
2. We are looking only for non-framed, and will connect if necessary (this is nr_socket_multi_tcp_stun_server_connect)
3. We don't know whether we're looking for framed, but we assume that we're already connected and will not try to lay claim to an INADDR_ANY (this is nr_socket_multi_tcp_sendto)

@@ +306,5 @@
> +
> +    if ((r=nr_socket_local_create(&sock->addr, &nrsock)))
> +      ABORT(r);
> +
> +    if ((r=nr_tcp_socket_ctx_create(nrsock, sock->use_framing, sock->max_pending, &tcp_sock_ctx))){

Should we ever be willing to create a non-framed socket for an ACTIVE? This feels underconstrained.

@@ +336,5 @@
> +    nr_socket_multi_tcp *mtcp_sock = (nr_socket_multi_tcp *)sock->obj;
> +    nr_socket *nrsock;
> +
> +    if (mtcp_sock->tcp_type == TCP_TYPE_ACTIVE)
> +      ABORT(R_INTERNAL);

This feels like a good place to assert.

@@ +418,5 @@
> +
> +      if (r!=R_WOULDBLOCK) {
> +        NR_SOCKET fd;
> +
> +        nr_socket_close(tcpsock->inner);

It seems like we should do this after we cancel our read/write callbacks, and I'm pretty sure destroying the socket will do this work too.

@@ +426,5 @@
> +          NR_ASYNC_CANCEL(fd, NR_ASYNC_WAIT_WRITE);
> +        }
> +
> +        TAILQ_REMOVE(&sock->sockets, tcpsock, entry);
> +        NR_ASYNC_TIMER_SET(0,nr_tcp_socket_ctx_destroy_cb, tcpsock, &tcpsock->destroy_timer);

Why do this async? If it is unnecessary, we can remove a lot of code. If you like, I can try making this modification and running it under asan.

@@ +499,5 @@
> +    /* accept and add to socket_list */
> +    if (nr_socket_accept(sock->listen_socket, &remote_addr, &newsock))
> +      return;
> +
> +    if ((r=nr_tcp_socket_ctx_create(newsock, sock->use_framing, sock->max_pending, &tcp_sock_ctx)))

Might want to comment somewhere that use_framing is only 0 in test code (for the listen socket on the test STUN TCP server).

@@ +504,5 @@
> +      return;
> +
> +    /* connect will return error, but it is used to set remote address in
> +       nr_socket_buffered_stun */
> +    nr_socket_connect(tcp_sock_ctx->inner, &remote_addr);

This is kinda hacky, and I'm not 100% sure that this will either fail, or fail in the way we want/expect. It is probably better to add a param to nr_tcp_socket_ctx_create and init directly in there.

::: media/mtransport/third_party/nICEr/src/stun/nr_socket_buffered_stun.c
@@ +69,5 @@
>    nr_p_buf_ctx *p_bufs;
>    nr_p_buf_head pending_writes;
>    size_t pending;
>    size_t max_pending;
> +  int    use_framing;

I feel like the code would be more clear to someone unfamiliar with this stuff if this was "framing_type", with possible values TURN_TCP_FRAMING and ICE_TCP_FRAMING; everything this class handles is framed in some way.

@@ +115,5 @@
>    if ((r=nr_ip4_port_to_transport_addr(INADDR_ANY, 0, IPPROTO_UDP, &sock->remote_addr)))
>      ABORT(r);
>  
>    /* TODO(ekr@rtfm.com): Check this */
> +  if (!(sock->buffer = RMALLOC(frame_size + NR_STUN_MAX_MESSAGE_SIZE)))

Shouldn't this be sizeof(nr_frame_header) + NR_MAX_FRAME_SIZE in the ICE TCP framing case, since that's the only real limit on the size of buffers containing media?

@@ +186,5 @@
>    }
>  
> +  if (sock->use_framing) {
> +    if (len>NR_MAX_FRAME_SIZE)
> +      ABORT(R_FAILED);

This is probably worth an assert, also.

@@ +246,3 @@
>    if (sock->read_state == NR_ICE_SOCKET_READ_NONE) {
> +    if (sock->use_framing) {
> +      sock->bytes_needed = frame->frame_length = ntohs(frame->frame_length);

Why are we assigning the host byte order back to frame_length?

@@ +316,5 @@
>    nr_socket_buffered_stun *sock = (nr_socket_buffered_stun *)obj;
> +  NR_SOCKET fd;
> +
> +  /* Cancel waiting on the socket */
> +  if (sock->inner && !nr_socket_getfd(sock->inner, &fd)) {

Good catch.

@@ +429,5 @@
>    assert(!sock->pending);
>    _status=0;
>  abort:
>    if (_status && _status != R_WOULDBLOCK) {
> +    nr_socket_buffered_stun_failed(sock);

I wonder why this was a TODO...

::: media/mtransport/third_party/nICEr/src/stun/stun_client_ctx.c
@@ +701,5 @@
>              ABORT(R_BAD_DATA);
>  
> +        // STUN doesn't distinguish protocol in mapped address, therefore
> +        // assign used protocol from peer_addr
> +        if (mapped_addr->protocol!=peer_addr->protocol){

Good catch.

::: media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.cpp
@@ +687,5 @@
>          return NS_ERROR_FAILURE;
>        }
>      } else {
> +      if (!aDst->addStunServer(host.get(), port, transport.IsEmpty() ?
> +                                kNrIceTransportUdp : transport.get())) {

Wrap this ternary in parens.

::: media/webrtc/signaling/test/signaling_unittests.cpp
@@ +692,1 @@
>  

The v2 patch had a TCP stun server here. Why was it removed?

@@ +700,1 @@
>  

Same question as above.
Attachment #8398368 - Flags: review?(docfaraday) → review-
Attached patch enable_signaling_unittests.patch (obsolete) — — Splinter Review
(In reply to Byron Campen [:bwc] from comment #16)
> Unfortunately, someone has regressed signaling_unittests, so I can't tell if
> that test works with your changes. The other unit-tests seem fine.
> 
I'm using this temporary patch to enable running of signaling_unittests
Comment on attachment 8398368 [details] [diff] [review]
ICE TCP patch

Review of attachment 8398368 [details] [diff] [review]:
-----------------------------------------------------------------

::: media/mtransport/nr_socket_prsock.cpp
@@ +434,5 @@
>        if (!(fd_ = PR_NewTCPSocket())) {
>          r_log(LOG_GENERIC,LOG_CRIT,"Couldn't create socket");
>          ABORT(R_INTERNAL);
>        }
> +      // Set ReuseAddr for TCP sockets to enable having several

Personally, I think, it's not likely to fail, and I didn't want to change socket API if it is not necessary.

@@ +567,5 @@
>    status = PR_RecvFrom(fd_, buf, maxlen, flags, &nfrom, PR_INTERVAL_NO_WAIT);
>    if (status <= 0) {
> +    if (PR_GetError() == PR_WOULD_BLOCK_ERROR)
> +      ABORT(R_WOULDBLOCK);
> +    r_log(LOG_GENERIC, LOG_ERR, "Error in recvfrom");

Use INFO level as it is in sendto.

@@ +636,5 @@
>    status = PR_Write(fd_, msg, len);
>    if (status < 0) {
>      if (PR_GetError() == PR_WOULD_BLOCK_ERROR)
>        ABORT(R_WOULDBLOCK);
> +    r_log(LOG_GENERIC, LOG_INFO, "Error in write");

There is INFO level used in sendto. To be consistent with original code INFO should be used also in recvfrom.

::: media/mtransport/third_party/nICEr/src/ice/ice_component.c
@@ +418,5 @@
> +        TCP_TYPE_PASSIVE, 0, lufrag, pwd, &isock_psv)))
> +        ABORT(r);
> +
> +      /* simultaneous-open host candidate */
> +      if ((r=nr_ice_component_create_tcp_host_candidate(ctx, component, &addrs[i].addr,

Create SO candidate only when so_sock_ct>0

@@ +462,4 @@
>  
> +        /* srvrflx */
> +        if(r=nr_ice_candidate_create(ctx,component,
> +            isock_psv,isock_psv->sock,SERVER_REFLEXIVE,TCP_TYPE_PASSIVE,

I don't understand. Do you want to create srflx candidate not based on any host candidate?  It would require to use nr_socket_multi_tcp, as it is now, to be able to accept connection from peer candidates. Therefore I don't think that TURN TCP will be more isolated as it is now.
To isolate that, it is better to return to previous version of the patch. But there, the srflx candidate is not and cannot be created.

::: media/mtransport/third_party/nICEr/src/net/nr_socket_multi_tcp.c
@@ +247,5 @@
> +
> +    _status=0;
> +  abort:
> +    if (_status) {
> +      void *sock_v = sock;

Because nr_socket_multi_tcp_destroy accepts void** as argument. It's retyped in a same way in nr_socket_buffered_stun.
(In reply to Byron Campen [:bwc] from comment #16)
> @@ +434,5 @@
> >        if (!(fd_ = PR_NewTCPSocket())) {
> >          r_log(LOG_GENERIC,LOG_CRIT,"Couldn't create socket");
> >          ABORT(R_INTERNAL);
> >        }
> > +      // Set ReuseAddr for TCP sockets to enable having several
> 
> I wonder if we should make this configurable? How likely is it that this
> option will fail? If it is likely, we probably don't want it to cause TURN
> TCP failures where it isn't needed. Security-wise, I guess there probably
> isn't any harm done, anything that a malicious process can do to mess us up
> it could probably already do externally...
Personally, I think, it's not likely to fail, and I didn't want to change socket API if it is not necessary.

 
> Why INFO here but ERR for recvfrom?
> 
There is INFO level used in sendto. To be consistent with original code INFO should be used also in recvfrom.

> @@ +462,4 @@
> >  
> > +        /* srvrflx */
> > +        if(r=nr_ice_candidate_create(ctx,component,
> > +            isock_psv,isock_psv->sock,SERVER_REFLEXIVE,TCP_TYPE_PASSIVE,
> 
> Hmm. I'm not sure we want to be using the same socket for both TURN TCP and
> passive ICE TCP. I think that we probably just want to establish server
> reflexive ICE TCP candidates on the TURN servers exactly like we do for the
> STUN server loop above, and keep that logic isolated from TURN TCP.
> 
I don't understand. Do you want to create srflx candidate not based on any host candidate?  It would require to use nr_socket_multi_tcp, as it is now, to be able to accept connection from peer candidates. Therefore I don't think that TURN TCP will be more isolated as it is now.
To isolate that, it is better to return to previous version of the patch. But there, the srflx candidate is not and cannot be created.

> @@ +247,5 @@
> > +
> > +    _status=0;
> > +  abort:
> > +    if (_status) {
> > +      void *sock_v = sock;
> 
> Why do we need this temp variable?
> 
Because nr_socket_multi_tcp_destroy accepts void** as argument. It's retyped in a same way in nr_socket_buffered_stun.
Attached patch ICE TCP part 1 patch (obsolete) — — Splinter Review
Changes against ICE TCP patch applied. Tell me if you prefer it as a one merged patch.
(In reply to Peter Tatrai from comment #21)
> Created attachment 8401357 [details] [diff] [review]
> ICE TCP part 1 patch
> 
> Changes against ICE TCP patch applied. Tell me if you prefer it as a one
> merged patch.

Having the interdiff will help a lot, thanks. Eventually we'll need to land it as one patch; if you want to go ahead and fold everything up into one, request review, and mark your interdiff as not for checkin (just put something like "interdiff: not for checkin" in the commit message), I can try to give it a look today. I have a lot on my plate though, so it may be next week before I get to it.
Comment on attachment 8398368 [details] [diff] [review]
ICE TCP patch

Review of attachment 8398368 [details] [diff] [review]:
-----------------------------------------------------------------

::: media/mtransport/nr_socket_prsock.cpp
@@ +434,5 @@
>        if (!(fd_ = PR_NewTCPSocket())) {
>          r_log(LOG_GENERIC,LOG_CRIT,"Couldn't create socket");
>          ABORT(R_INTERNAL);
>        }
> +      // Set ReuseAddr for TCP sockets to enable having several

Upon further reflection, I think the main thing that might bite us is test code; the pattern we use for our test stun/turn/what-have-you servers is we try opening the standard port, but if it is in use, we try the next one. I could see funny things happening in tests that require operating two distinct test servers, for example (because they'd bind to the same port). I don't think you'd have to change the API to make this optional; there could be a NrSocket::create_reuseaddr function and a separate vtable, for instance (although this is still pretty heavy for something as simple as setting a sockopt).

::: media/mtransport/third_party/nICEr/src/ice/ice_component.c
@@ +418,5 @@
> +        TCP_TYPE_PASSIVE, 0, lufrag, pwd, &isock_psv)))
> +        ABORT(r);
> +
> +      /* simultaneous-open host candidate */
> +      if ((r=nr_ice_component_create_tcp_host_candidate(ctx, component, &addrs[i].addr,

Sounds reasonable to me.

@@ +462,4 @@
>  
> +        /* srvrflx */
> +        if(r=nr_ice_candidate_create(ctx,component,
> +            isock_psv,isock_psv->sock,SERVER_REFLEXIVE,TCP_TYPE_PASSIVE,

So, what this code does right now is for each TURN server, we create a server reflexive (passive type), chain it to a relayed (TURN TCP), and get both the mapped addr for the reflexive and the allocation for TURN at the same time. This is an optimization that makes sense for UDP, but I am less convinced that it makes sense in the TCP case. I think that in the TCP case, what we want to be doing is, for each TURN server:

1) Go through exactly the same logic as we did for each STUN server; basically we just treat each TURN server as an additional STUN server.
2) Completely separately, set up a TURN TCP candidate that is not chained to any server reflexive.

::: media/mtransport/third_party/nICEr/src/net/nr_socket_multi_tcp.c
@@ +247,5 @@
> +
> +    _status=0;
> +  abort:
> +    if (_status) {
> +      void *sock_v = sock;

Passing just &sock seems to work fine for me.
Also, I've noticed some new compiler warnings; one about the lack of a virtual d'tor in TestStunServer, and another about how we're forward typedefing nr_socket (it seems that the way out of this is to forward declare the struct before the forward typedef, and just define the struct without any typedef stuff around it; eg struct nr_socket_; typedef struct nr_socket_ nr_socket; ... struct nr_socket { ... };)
Here are the new warnings:

> ../../../../../../media/mtransport/third_party/nICEr/src/ice/ice_candidate.c:386:17: warning: variable 'type_preference' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
> ../../../../../../media/mtransport/third_party/nICEr/src/ice/ice_candidate.c:397:17: warning: variable 'type_preference' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
> ../../../../../../media/mtransport/third_party/nICEr/src/ice/ice_candidate.c:419:19: warning: variable 'type_preference' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
> ../../../../../../media/mtransport/third_party/nICEr/src/ice/ice_candidate.c:430:15: warning: enumeration values 'TCP_TYPE_NONE' and 'TCP_TYPE_MAX' not handled in switch [-Wswitch]
> ../../../../../../media/mtransport/third_party/nICEr/src/ice/ice_candidate.c:767:17: warning: unused variable 'r' [-Wunused-variable]
7a13
> ../../../../../../media/mtransport/third_party/nICEr/src/net/nr_socket.h:88:3: warning: redefinition of typedef 'nr_socket' is a C11 feature [-Wtypedef-redefinition]
42a49,50
> ../../../../../media/mtransport/test/stunserver.cpp:353:3: warning: delete called on 'mozilla::TestStunServer' that has virtual functions but non-virtual destructor [-Wdelete-non-virtual-dtor]
> ../../../../../media/mtransport/test/stunserver.cpp:504:3: warning: delete called on 'mozilla::TestStunTcpServer' that has virtual functions but non-virtual destructor [-Wdelete-non-virtual-dtor]
102a111,114
> ../../../../dist/include/mozilla/Scoped.h:244:35: warning: delete called on 'mozilla::TestStunServer' that has virtual functions but non-virtual destructor [-Wdelete-non-virtual-dtor]
> ../../../../dist/include/mozilla/Scoped.h:244:35: warning: delete called on 'mozilla::TestStunTcpServer' that has virtual functions but non-virtual destructor [-Wdelete-non-virtual-dtor]
> ../../../../media/mtransport/test/stunserver.cpp:353:3: warning: delete called on 'mozilla::TestStunServer' that has virtual functions but non-virtual destructor [-Wdelete-non-virtual-dtor]
> ../../../../media/mtransport/test/stunserver.cpp:504:3: warning: delete called on 'mozilla::TestStunTcpServer' that has virtual functions but non-virtual destructor [-Wdelete-non-virtual-dtor]
158a171,172
> ../../../dist/include/mozilla/Scoped.h:244:35: warning: delete called on 'mozilla::TestStunServer' that has virtual functions but non-virtual destructor [-Wdelete-non-virtual-dtor]
> ../../../dist/include/mozilla/Scoped.h:244:35: warning: delete called on 'mozilla::TestStunTcpServer' that has virtual functions but non-virtual destructor [-Wdelete-non-virtual-dtor]
Comment on attachment 8401357 [details] [diff] [review]
ICE TCP part 1 patch

Review of attachment 8401357 [details] [diff] [review]:
-----------------------------------------------------------------

Getting pretty close now, although we still aren't utilizing TURN servers to create server reflexive candidates. Also, found a typo bug that needs fixing, and the warnings (in comment 25) need to be cleaned up (this patch does not seem to introduce any new ones).

::: media/mtransport/third_party/nICEr/src/net/nr_socket_multi_tcp.c
@@ -221,5 @@
>            if ((r=nr_socket_multi_tcp_create_stun_server_socket(sock, stun_servers+i, addr, max_pending)))
>              ABORT(r);
>          }
>        }
> -      if (turn_servers) {

Why did we get rid of this? Isn't this what we want to be doing (treating TURN servers as just another STUN server for establishing server reflexive candidates)?

@@ +285,5 @@
> +        ABORT(R_FAILED);
> +
> +      /* find free preallocated socket and connect */
> +      TAILQ_FOREACH(tcp_sock_ctx, &sock->sockets, entry) {
> +        if (!nr_transport_addr_is_wildcard(&tcp_sock_ctx->remote_addr)) {

Pretty sure we don't want the '!' here.

@@ +291,5 @@
> +            continue;
> +          if (preallocated_connect_mode == PREALLOC_CONNECT_FRAMED && !tcp_sock_ctx->is_framed)
> +            continue;
> +
> +          if ((r=nr_socket_connect(tcp_sock_ctx->inner, to))){

Hmm. Let's say this fails in an unrecoverable way; we probably don't want to try using it again on the next call to this function. Should we set remote_addr in all cases? Or maybe we should remove from the list in the failure case?
Attachment #8401357 - Flags: review-
Attached patch ICE-TCP interdiff 2 (obsolete) — — Splinter Review
To be applied on top of ICE TCP part 1 patch. It contains an attempt to fix reported compilation warnings, although I can't test it because my gcc version 4.6.3 doesn't support them. Also, all other remaining issues are addressed, too.
Attachment #8404498 - Flags: review?(docfaraday)
Attachment #8404498 - Flags: review?(docfaraday) → review?(martin.thomson)
Comment on attachment 8404498 [details] [diff] [review]
ICE-TCP interdiff 2

Review of attachment 8404498 [details] [diff] [review]:
-----------------------------------------------------------------

I'm not really familiar enough with this code to comment here.  I think that Byron really needs to look at this.

From back where I'm standing, what concerns me most about this patch is the lack of tests.  I'd like to see some tests that checked that this feature is working.  Tests might also help someone understand the intent of the code and place less burden on review (which - despite a heavy institutional reliance on reviewing - I don't believe in as a method for catching bugs).  It looks relatively easy to expand the existing tests.
Attachment #8404498 - Flags: review?(martin.thomson) → review?(docfaraday)
Hi,

I would like to set expectation that any outstanding reviews need to be fully done by this week instead of trickling in through a long period of time. We will then incorporate what is appropriate for the last patch to be delivered next week.

thanks,
Alright, now that I'm back from vacation, I can give this a look. I think I'll be able to squeeze this in today.
Comment on attachment 8404498 [details] [diff] [review]
ICE-TCP interdiff 2

Review of attachment 8404498 [details] [diff] [review]:
-----------------------------------------------------------------

r+ assuming a nit is fixed. It is now time to fold these interdiffs into the original, re-upload, mark the interdiffs as obsolete, and kick off a try run.

::: media/mtransport/third_party/nICEr/src/ice/ice_candidate.c
@@ +453,1 @@
>          default:

This is a little confusing to me; we assert if TCP_TYPE_MAX, but not if > TCP_TYPE_MAX? Wouldn't it make more sense to break on TCP_TYPE_NONE, assert and ABORT in the default case, and not have a separate case for TCP_TYPE_MAX?
Attachment #8404498 - Flags: review?(docfaraday) → review+
Attached patch ICE-TCP patch (obsolete) — — Splinter Review
I folded both interdiffs into the original and fixed the mentioned nit. I don't have commit level 1 access, therefore someone else has to push it to Try tree.
Attachment #8398368 - Attachment is obsolete: true
Attachment #8401320 - Attachment is obsolete: true
Attachment #8401357 - Attachment is obsolete: true
Attachment #8404498 - Attachment is obsolete: true
Fix wrong includes causing compile errors on B2G Desktop OS X Opt and B2G Desktop Windows Opt
Attached patch Part 3: Fix failing test_ipc from Mochitest3 (obsolete) — — Splinter Review
This one fixes failure of test_ipc.html from Mochitest 3. It failed on Assertion failure: addr->protocol == IPPROTO_UDP at nr_socket_prsock.cpp:935. This assert was added by ICE-TCP.patch because NrSocketIpc is UDP aware only. 

Now, instead of assert it fails with R_FAILED, which is propagated back to nr_ice_component_initialize. Where it is ignored, and resulting in not creating TCP candidates.
On OS X all tests fail because OS X requires to set SO_REUSEPORT option.  The problem is that NSPR doesn't implement this option. This patch includes its implementation but I think it would be better to split this into two patches and to create a new bug for NSPR part.
I'd like to ask someone to push this patches to Try.
IMPORTANT: Please submit NSPR patches separately. They need to be reviewed by different people.

See:
http://www-archive.mozilla.org/owners.html
Attachment #8411835 - Attachment is obsolete: true
Fixed 2 missing cases.
Attachment #8412104 - Attachment is obsolete: true
Attached patch Part 6: Add tests to multi_tcp_socket_unittest (obsolete) — — Splinter Review
Two new tests added.
Attachment #8400225 - Flags: review?(ekr)
Attached patch Part 1: Add support for TCP ICE candidates. (obsolete) — — Splinter Review
Fix commit log
Attachment #8410908 - Attachment is obsolete: true
Comment on attachment 8411815 [details] [diff] [review]
Part 2: Fix build errors on B2G Desktop OS X/Windows Opt

Review of attachment 8411815 [details] [diff] [review]:
-----------------------------------------------------------------

Patch is good, but there are two more similar fixes that must be made in here.

::: media/mtransport/third_party/nICEr/src/stun/nr_socket_buffered_stun.c
@@ +96,5 @@
>    0,
>    0,
> +  nr_socket_buffered_stun_close,
> +  0,
> +  0

Good catch, but there are two more:
media/mtransport/test/stunserver.cpp:157
media/mtransport/third_party/nICEr/src/stun/nr_socket_turn.c:66
Attachment #8411815 - Flags: review-
Comment on attachment 8400225 [details] [diff] [review]
Part 0: A couple of include-what-you-use fixes to the pre-existing code.

Review of attachment 8400225 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm
Attachment #8400225 - Flags: review?(ekr) → review+
(In reply to Byron Campen [:bwc] from comment #45)
> Comment on attachment 8411815 [details] [diff] [review]
> Part 2: Fix build errors on B2G Desktop OS X/Windows Opt
> 
> Review of attachment 8411815 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Patch is good, but there are two more similar fixes that must be made in
> here.
> 
> ::: media/mtransport/third_party/nICEr/src/stun/nr_socket_buffered_stun.c
> @@ +96,5 @@
> >    0,
> >    0,
> > +  nr_socket_buffered_stun_close,
> > +  0,
> > +  0
> 
> Good catch, but there are two more:
> media/mtransport/test/stunserver.cpp:157
> media/mtransport/third_party/nICEr/src/stun/nr_socket_turn.c:66

It was there only to remove warning, the important part was about the includes. But you are right I'll fix the others, too. Also I forgot to increase vtbl version number to 2.
Attachment #8411815 - Attachment is obsolete: true
Comment on attachment 8412839 [details] [diff] [review]
Part 2: Fix build errors on B2G Desktop OS X/Windows Opt

Review of attachment 8412839 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm
Attachment #8412839 - Flags: review+
Comment on attachment 8411828 [details] [diff] [review]
Part 3: Fix failing test_ipc from Mochitest3

Review of attachment 8411828 [details] [diff] [review]:
-----------------------------------------------------------------

Why did we switch the order of ACTIVE and PASSIVE? (I think this change is fine, just don't understand why it was made)

As for indicating "We don't support TCP, sorry" with R_FAILED, this feels hacky to me, but I don't really see a better alternative as far as error codes go. I think the "right" way to do this is to allow ICE TCP to be preffed off, and disable it in the IPC tests. In the meantime, switch this over to using R_REJECTED (or, if you like, add an R_UNSUPPORTED error code and use that), and leave a TODO comment. I can file a separate ticket for preffing ICE TCP on/off (since this is something we want anyway).
Attachment #8411828 - Flags: review-
I switched the order because ACTIVE candidate doesn't create real sockets during its initialization. It creates sockets when it connects to peer candidates. On the other side the PASSIVE one creates listening socket during initialization, which will fail immediately.
Comment on attachment 8412105 [details] [diff] [review]
Part 5: Set PR_SockOpt_Reuseport to fix OS X test failures

Review of attachment 8412105 [details] [diff] [review]:
-----------------------------------------------------------------

Wow. I guess you learn something new every day. This SO_REUSEADDR/SO_REUSEPORT business is really messy. I think we get the correct behavior by setting both on BSD, and SO_REUSEADDR on everything else (although, technically speaking, SO_REUSEADDR on Linux doesn't exactly do what we want; it is only supposed to let us reuse a TCP addr/port when the socket is in TIME_WAIT, but I guess on linux whatever state it is in prior to calling connect is similar enough that it works?).

All that said, this code seems reasonable to me. I think I'll play around with these options on a few platforms to see whether an erratum needs to be filed against http://tools.ietf.org/html/rfc6544#appendix-B
Attachment #8412105 - Flags: review+
Attached patch Part 3: Fix failing test_ipc from Mochitest3 (obsolete) — — Splinter Review
I used R_REJECTED as it is only temporary.
Attachment #8411828 - Attachment is obsolete: true
(In reply to Byron Campen [:bwc] from comment #52)

There's very good answer on stackoverflow about this topic. But you may have already found it.
http://stackoverflow.com/questions/14388706/socket-options-so-reuseaddr-and-so-reuseport-how-do-they-differ-do-they-mean-t
Comment on attachment 8412500 [details] [diff] [review]
Part 6: Add tests to multi_tcp_socket_unittest

Review of attachment 8412500 [details] [diff] [review]:
-----------------------------------------------------------------

Nothing too serious here, but slightly more serious than nits.

::: media/mtransport/test/multi_tcp_socket_unittest.cpp
@@ +70,3 @@
>  
> +    if (stun_server_socket) {
> +      static nr_ice_stun_server stun_server_static;

I'm not sure making this a static is necessary here; I don't think anything actually retains a reference to it. We should be able to just scope it to the function, right? (Please double-check my analysis here, since if I'm wrong we will get undefined behavior)

@@ +86,5 @@
>          (char *)"127.0.0.1", port_s++, IPPROTO_TCP, &local);
>        ASSERT_EQ(0, r);
>  
> +      r = nr_socket_multi_tcp_create(stun_server, stun_server_ct, NULL, 0,
> +          &local, tcp_type, 0, use_framing, 2048, sock);

Hmm. Come to think of it, I'd like to see at least a creation test for TCP_TYPE_SO (even if timing limitations make it difficult to actually use them).

@@ +292,5 @@
> +    '\x00', '\x01', '\x00', '\x08', '\x21', '\x12', '\xa4', '\x42',
> +    '\x9b', '\x90', '\xbe', '\x2c', '\xae', '\x1a', '\x0c', '\xa8',
> +    '\xa0', '\xd6', '\x8b', '\x08', '\x80', '\x28', '\x00', '\x04',
> +    '\xdb', '\x35', '\x5f', '\xaa'
> +  };

This gives the impression that we're actually trying to talk to a STUN server, instead of just going through the motions. Maybe make this obviously fake STUN data, and append "Mockup" to the end of the test name?
Attachment #8412500 - Flags: review-
(In reply to Peter Tatrai from comment #54)
> (In reply to Byron Campen [:bwc] from comment #52)
> 
> There's very good answer on stackoverflow about this topic. But you may have
> already found it.
> http://stackoverflow.com/questions/14388706/socket-options-so-reuseaddr-and-
> so-reuseport-how-do-they-differ-do-they-mean-t

Yeah, that's the one I was looking at.
Comment on attachment 8412936 [details] [diff] [review]
Part 3: Fix failing test_ipc from Mochitest3

Review of attachment 8412936 [details] [diff] [review]:
-----------------------------------------------------------------

Just a nit.

::: media/mtransport/nr_socket_prsock.cpp
@@ +935,1 @@
>    if (addr->protocol != IPPROTO_UDP)

Let's leave a comment here too.
Attachment #8412936 - Flags: review+
Comment on attachment 8412750 [details] [diff] [review]
Part 1: Add support for TCP ICE candidates.

Review of attachment 8412750 [details] [diff] [review]:
-----------------------------------------------------------------

Ok, looks good to me now.
Attachment #8412750 - Flags: review+
Attachment #8412493 - Flags: review?(ted)
Attached patch Part 1: Add support for TCP ICE candidates. (obsolete) — — Splinter Review
Unrotten. I can't make a previous patch version obsolete as I'm not its owner.
Attached patch Part 3: Fix failing test_ipc from Mochitest3 (obsolete) — — Splinter Review
Added requested todo comment in nr_socket_prsock.cpp
Attachment #8412936 - Attachment is obsolete: true
Attached patch Part 6: Add tests to multi_tcp_socket_unittest (obsolete) — — Splinter Review
Incorporated review comments.
Attachment #8412500 - Attachment is obsolete: true
Testing with SO candidates revealed that they stopped working. This patch fixes that.
When testing with "tc qdisc add dev lo root netem delay 5ms" ice_unittest occasionally crashed. The issue was that nr_socket_buffered_stun_writable_cb was called after it had been destroyed, even there is NR_ASYNC_CANCEL(NR_ASYNC_WRITE) in its destroy function.
Attachment #8412750 - Attachment is obsolete: true
Attachment #8413680 - Flags: review+
Attachment #8413683 - Flags: review+
Comment on attachment 8413684 [details] [diff] [review]
Part 6: Add tests to multi_tcp_socket_unittest

Review of attachment 8413684 [details] [diff] [review]:
-----------------------------------------------------------------

::: media/mtransport/test/multi_tcp_socket_unittest.cpp
@@ +335,5 @@
> +
> +TEST_F(MultiTcpSocketTest,  TestConnectTwoSo) {
> +  socks[0] = Create(TCP_TYPE_SO);
> +  socks[1] = Create(TCP_TYPE_SO);
> +  ConnectSo(socks[0], socks[1]);

If this actually works reliably (without fiddling with firewall rules), that's great. I'd be happy with just a test that tries creating them and checks for errors. I'll try taking this for a spin and see.
Attachment #8413684 - Flags: review?(docfaraday)
This test works on my Linux machine. It actually only initiate connections and won't connect. There's still a timing issue and that's why the second test is disabled. It'd be great to schedule a Try run, to see how it behaves on other platforms.
Comment on attachment 8413687 [details] [diff] [review]
Part 7: Fix TCP SO not connecting in nr_socket_multi_tcp_sendto

Review of attachment 8413687 [details] [diff] [review]:
-----------------------------------------------------------------

I think this is functional, although I have a readability nit. I've suggested alternate code that might be more clear, but a comment could do the same job too.

::: media/mtransport/third_party/nICEr/src/net/nr_socket_multi_tcp.c
@@ +409,5 @@
>      int r, _status;
>      nr_socket_multi_tcp *sock = (nr_socket_multi_tcp *)obj;
>      nr_socket *nrsock;
>  
> +    if ((r=nr_socket_multi_tcp_get_sock_connected_to(sock, to, PREALLOC_CONNECT_FRAMED, &nrsock)))

Won't this be called when talking to a TURN server sometimes? I _think_ the logic ends up doing the right thing, but PREALLOC_CONNECT_FRAMED is doubly misleading in this case, since neither are we connecting on a prealloc, nor are we using ICE TCP framing. Maybe what we want instead is something more like PREALLOC_DONT_CONNECT_UNLESS_SO?
Attachment #8413687 - Flags: review+
Comment on attachment 8413688 [details] [diff] [review]
Part 8: Fix socket readable/writable callback called even it was already canceled.

Review of attachment 8413688 [details] [diff] [review]:
-----------------------------------------------------------------

Yikes. Good catch.
Attachment #8413688 - Flags: review+
(In reply to Byron Campen [:bwc] from comment #55)
> Comment on attachment 8412500 [details] [diff] [review]
> Part 6: Add tests to multi_tcp_socket_unittest
> 
> Review of attachment 8412500 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: media/mtransport/test/multi_tcp_socket_unittest.cpp
> @@ +70,3 @@
> >  
> > +    if (stun_server_socket) {
> > +      static nr_ice_stun_server stun_server_static;
> 
> I'm not sure making this a static is necessary here; I don't think anything
> actually retains a reference to it. We should be able to just scope it to
> the function, right? (Please double-check my analysis here, since if I'm
> wrong we will get undefined behavior)
> 

Aah I forgot about, this. I'm going to check it and maybe to attach an another version.
Unit-tests look unhappy. mach cppunittest seems to crash in transport_unittests locally as well.
Attached patch Part 6: Add tests to multi_tcp_socket_unittest (obsolete) — — Splinter Review
Removed static variable.
Attachment #8413684 - Attachment is obsolete: true
Attachment #8413684 - Flags: review?(docfaraday)
Comment on attachment 8413941 [details] [diff] [review]
Part 6: Add tests to multi_tcp_socket_unittest

Review of attachment 8413941 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm
Attachment #8413941 - Flags: review+
Attachment #8412493 - Flags: review?(ted) → review?(wtc)
(In reply to Byron Campen [:bwc] from comment #70)
> Unit-tests look unhappy. mach cppunittest seems to crash in
> transport_unittests locally as well.

This is strange. There is no crash in transport_unittests on TPBL, only 1 failed test on OS X from ice_unittests (IceConnectTest.TestSendReceiveTcp). Anything else looks not related to these patches. Could you give me more info about it?
It seems to me that the issue on OS X is that, when the PR_Write is called before a TCP socket is successfuly connected, it won't return PR_WOULD_BLOCK_ERROR, but some other (eg. PR_NOT_CONNECTED_ERROR or PR_IN_PROGRESS_ERROR). 

I could fix this by checking this error in NrSocket::write and return R_WOULDBLOCK error, or by adding a wait for NR_ASYNC_WRITE after connect in nr_socket_buffered_stun and caching writes while connecting.

I'll prepare a patch for option 2, as I think it's cleaner.
This should fix ice_unittest failure on OS X. I don't have an access to any mac, therefore I'd like to ask someone to push it to Try, or test it locally on Mac. All tests finished successfully on my local Linux.
Attachment #8414399 - Flags: review?(docfaraday)
Renamed PREALLOC_CONNECT_FRAMED to PREALLOC_DONT_CONNECT_UNLESS_SO and fixed conditions to match the name.
Attachment #8413687 - Attachment is obsolete: true
Attachment #8414507 - Attachment is obsolete: true
Comment on attachment 8414526 [details] [diff] [review]
Part 7: Fix TCP SO not connecting in nr_socket_multi_tcp_sendto

Review of attachment 8414526 [details] [diff] [review]:
-----------------------------------------------------------------

I think I was unclear; now we have confusing code in nr_socket_multi_tcp_connect. I was proposing that DONT_CONNECT_UNLESS_SO would be a third value. But, if we want to keep it to two, I think they are STUN_ONLY and ANY_EXCEPT_UNCONNECTED_STUN (pretty sure that is what this boils down to, in plain language, but feel free to double-check me).
Attachment #8414526 - Flags: review-
(In reply to Peter Tatrai from comment #75)
> Created attachment 8414399 [details] [diff] [review]
> Part 9: Add a wait for TCP socket got connected into nr_socket_buffered_stun
> 
> This should fix ice_unittest failure on OS X. I don't have an access to any
> mac, therefore I'd like to ask someone to push it to Try, or test it locally
> on Mac. All tests finished successfully on my local Linux.

I'm still seeing failures on OS X, trying to get a good log of what is happening now.
Ok, there is a failure in BufferedStunSocketTest.TestSendTo that seems to be due to the wait until connected behavior added in patch 9. Let me try the other approach.
Oh sorry, I didn't try buffered_stun_unittests. I'm starting tests manually and forgot about this one. This is failing on Linux too. I fixed this crash (see bellow), but I'm gonna sleep and can't test the rest.

diff --git a/media/mtransport/third_party/nICEr/src/stun/nr_socket_buffered_stun.c b/media/mtransport/third_party/nICEr/src/stun/nr_socket_buffered_stun.c
--- a/media/mtransport/third_party/nICEr/src/stun/nr_socket_buffered_stun.c
+++ b/media/mtransport/third_party/nICEr/src/stun/nr_socket_buffered_stun.c
@@ -385,16 +385,18 @@ static int nr_socket_buffered_stun_conne
 
       if ((r=nr_socket_getfd(sock->inner, &fd)))
         ABORT(r);
 
       NR_ASYNC_WAIT(fd, NR_ASYNC_WAIT_WRITE, nr_socket_buffered_stun_connected_cb, sock);
       ABORT(R_WOULDBLOCK);
     }
     ABORT(r);
+  } else {
+    sock->connected = 1;
   }
 
   _status=0;
 abort:
   return(_status);
 }
 
 static int nr_socket_buffered_stun_write(void *obj,const void *msg, size_t len, size_t *written)
I returned to three values. I think it's more readable.
Attachment #8414526 - Attachment is obsolete: true
Attachment #8415147 - Flags: review?(docfaraday)
Attachment #8415147 - Attachment is obsolete: true
Attachment #8415147 - Flags: review?(docfaraday)
Attachment #8415162 - Flags: review?(docfaraday)
Fixed failing BufferedStunSocketTest.
Attachment #8414399 - Attachment is obsolete: true
Attachment #8414399 - Flags: review?(docfaraday)
Attachment #8415168 - Flags: review?(docfaraday)
Attachment #8415162 - Flags: review?(docfaraday) → review+
(In reply to Peter Tatrai from comment #81)
> Oh sorry, I didn't try buffered_stun_unittests. I'm starting tests manually
> and forgot about this one. This is failing on Linux too. I fixed this crash
> (see bellow), but I'm gonna sleep and can't test the rest.
>      ABORT(r);
> +  } else {
> +    sock->connected = 1;
>    }

Huh. We're managing to connect synchronously in this test? I'll give it a whirl.
Comment on attachment 8414399 [details] [diff] [review]
Part 9: Add a wait for TCP socket got connected into nr_socket_buffered_stun

Review of attachment 8414399 [details] [diff] [review]:
-----------------------------------------------------------------

This looks pretty good to me, although it is a lot of refactoring. The tests seem ok though. Will push to try.

::: media/mtransport/third_party/nICEr/src/stun/turn_client_ctx.c
@@ -406,5 @@
>  
>    return(0);
>  }
>  
> -static int nr_turn_client_connect(nr_turn_client_ctx *ctx)

Wow. If this continues to work, awesome! Lots of deleted code makes me happy.
Attachment #8414399 - Attachment is obsolete: false
Attachment #8414399 - Attachment is obsolete: true
Comment on attachment 8415168 [details] [diff] [review]
Part 9: Add a wait for TCP socket got connected into nr_socket_buffered_stun

Review of attachment 8415168 [details] [diff] [review]:
-----------------------------------------------------------------

A nit and a minor concern.

::: media/mtransport/test/multi_tcp_socket_unittest.cpp
@@ +249,5 @@
>    Connect(socks[1], socks[0]);
>    Connect(socks[2], socks[0]);
>  }
>  
> +TEST_F(MultiTcpSocketTest, TestTransmit) {

Does the PASSIVE socket sending data first work anymore? I guess this won't happen in the field between active and passive, although we do need to make sure that a PASSIVE socket can send data so it can talk to a STUN server (I see you did this in an earlier patch).

::: media/mtransport/third_party/nICEr/src/stun/nr_socket_buffered_stun.c
@@ +152,5 @@
>    if (!(sock->buffer = RMALLOC(sock->buffer_size)))
>      ABORT(R_NO_MEMORY);
>  
>    sock->read_state = NR_ICE_SOCKET_READ_NONE;
> +  sock->connected = 0;

RCALLOC should have zeroed this out already.
Attachment #8415168 - Flags: review?(docfaraday) → review+
Lots of intermittent orange on that try push, but none of it appears to be related to these patches.
Since we still have some reviews to wait on, and since we're going to have to do it for uplift anyway, I'm going to cherry-pick the high priority bug fixes from these patches, and put them in their own patch first in the patch queue, so we can land them sooner rather than later.
(In reply to Byron Campen [:bwc] from comment #87)

> Does the PASSIVE socket sending data first work anymore? I guess this won't
> happen in the field between active and passive, although we do need to make
> sure that a PASSIVE socket can send data so it can talk to a STUN server (I
> see you did this in an earlier patch).

It worked because there was sleep in Connect_s, which I removed in this version. Now, without the sleep, it would try to send data from PASSIVE socket which didn't get connected yet, and it doesn't have buffered_stun socket created for this remote addr to cache the data. 

I removed this direction of tests because this case is not used in ICE-TCP. But if you want this test, I'll add one with a delay between connect and send from PASSIVE.
I forgot to mention, that there is already a test for a PASSIVE socket connecting to STUN server (MultiTcpSocketTest.TestActivePassiveWithStunServerMockup)
Update commit message numbering.
Attachment #8400225 - Attachment is obsolete: true
Assignee: nobody → docfaraday
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Break a null-pointer deref fix out into a separate patch.
Attached patch Part 5: Add support for TCP ICE candidates. (obsolete) — — Splinter Review
Roll remaining patches (except for the NSPR stuff) into one patch.
Attachment #8413680 - Attachment is obsolete: true
Attachment #8415168 - Attachment is obsolete: true
Attachment #8415162 - Attachment is obsolete: true
Attachment #8413941 - Attachment is obsolete: true
Attachment #8413688 - Attachment is obsolete: true
Attachment #8413683 - Attachment is obsolete: true
Attachment #8412839 - Attachment is obsolete: true
Attachment #8412105 - Attachment is obsolete: true
Attachment #8416622 - Flags: review+
Comment on attachment 8416623 [details] [diff] [review]
Part 2: (Upliftable) Fix bugs where PR_WOULD_BLOCK_ERROR (or, in some cases, PR_NOT_CONNECTED_ERROR while a TCP socket was connecting) would cause sockets to be abandoned for no good reason (see also bug 985493 and 1001671).

I need to ask Nils to try out this patch and see if it works well with TURN TCP, before landing it.
Attachment #8416623 - Flags: review+
Comment on attachment 8416624 [details] [diff] [review]
Part 3: (Upliftable) Fix bug where we weren't performing a null check in dtor for NrSocket (see also bug 987380).

[Approval Request Comment]
Bug caused by (feature/regressing bug #): 

   Pretty sure this has been around since webrtc landed.

User impact if declined:

   Null pointer deref in some rare cases when starting a WebRTC call.

Testing completed (on m-c, etc.): 

   The usual CI tests.

Risk to taking this patch (and alternatives if risky): 

   None.

String or IDL/UUID changes made by this patch:

   None.
Attachment #8416624 - Flags: review+
Attachment #8416624 - Flags: checkin?
Attachment #8416624 - Flags: approval-mozilla-aurora?
Attachment #8416625 - Flags: review+
Attachment #8416622 - Flags: checkin?
Assignee: docfaraday → ptatrai
Comment on attachment 8416622 [details] [diff] [review]
Part 1: A couple of include-what-you-use fixes to the pre-existing code.

https://hg.mozilla.org/integration/mozilla-inbound/rev/44790b5e1976
Attachment #8416622 - Flags: checkin? → checkin+
Comment on attachment 8416624 [details] [diff] [review]
Part 3: (Upliftable) Fix bug where we weren't performing a null check in dtor for NrSocket (see also bug 987380).

https://hg.mozilla.org/integration/mozilla-inbound/rev/1c40e473da4c
Attachment #8416624 - Flags: checkin? → checkin+
https://hg.mozilla.org/mozilla-central/rev/44790b5e1976
https://hg.mozilla.org/mozilla-central/rev/1c40e473da4c
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla32
Attachment #8416624 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Keywords: leave-open
Comment on attachment 8412493 [details] [diff] [review]
Part 4: Add implementation of socket option PR_SockOpt_Reuseport into NSPR

Review of attachment 8412493 [details] [diff] [review]:
-----------------------------------------------------------------

Thank you for the patch. I have some questions.

::: nsprpub/pr/include/prio.h
@@ +211,5 @@
>      PR_SockOpt_NoDelay,         /* don't delay send to coalesce packets */
>      PR_SockOpt_MaxSegment,      /* maximum segment size */
>      PR_SockOpt_Broadcast,       /* enable broadcast */
> +    PR_SockOpt_Reuseport,       /* allow duplicate bindings on
> +                                   platforms that support it */

The second comment line should start with a '*'. I'll take care of this
when I check in this patch.

Could you provide a better description than "allow duplicate bindings"?
At least, we should use "allow local port reuse", which you used in
pr/tests/sockopt.c. Ideally, the description should explain the difference
from SO_REUSEADDR. Is this socket FAQ page accurate?
http://www.unixguide.net/network/socketfaq/4.11.shtml

@@ +247,5 @@
>          PRMcastRequest add_member;  /* add an IP group membership */
>          PRMcastRequest drop_member; /* Drop an IP group membership */
>          PRNetAddr mcast_if;         /* multicast interface address */
> +        PRBool reuse_port;          /* Allow duplicate bindings on
> +                                       platforms that support it */

The ordering of union members doesn't matter, so we should list this
member with the other PRBool members. I suggest listing it right after
reuse_addr.  I'll take care of this when I check in this patch.

::: nsprpub/pr/tests/sockopt.c
@@ +130,5 @@
>                      fd = udp; 
>                      data.value.broadcast = PR_TRUE;         
>                      break;    
>  #endif
> +#if defined(DARWIN) || defined(FREEBSD) || defined(NETBSD) || defined(OPENBSD)

Can we include <sys/socket.h> and just test

    #ifdef SO_REUSEPORT

here?

This change also points out a problem with PR_SockOpt_Reuseport:
how would an NSPR user know if PR_SockOpt_Reuseport is supported?
And what should the user do if PR_SockOpt_Reuseport is not supported?
Attached patch Part 5: Add support for TCP ICE candidates. (obsolete) — — Splinter Review
Unrot.
Attachment #8416625 - Attachment is obsolete: true
Assignee: ptatrai → docfaraday
Status: REOPENED → ASSIGNED
Comment on attachment 8419086 [details] [diff] [review]
Part 5: Add support for TCP ICE candidates.

Carry forward r+
Attachment #8419086 - Flags: review+
(In reply to Wan-Teh Chang from comment #103)
> Comment on attachment 8412493 [details] [diff] [review]
> Part 4: Add implementation of socket option PR_SockOpt_Reuseport into NSPR
> 
> Review of attachment 8412493 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Thank you for the patch. I have some questions.
> 
> ::: nsprpub/pr/include/prio.h
> @@ +211,5 @@
> >      PR_SockOpt_NoDelay,         /* don't delay send to coalesce packets */
> >      PR_SockOpt_MaxSegment,      /* maximum segment size */
> >      PR_SockOpt_Broadcast,       /* enable broadcast */
> > +    PR_SockOpt_Reuseport,       /* allow duplicate bindings on
> > +                                   platforms that support it */
> 
> The second comment line should start with a '*'. I'll take care of this
> when I check in this patch.
> 
> Could you provide a better description than "allow duplicate bindings"?
> At least, we should use "allow local port reuse", which you used in

We can use a same one as it is used for SO_REUSEPORT macro in sys/socket.h:

#define	SO_REUSEPORT	0x0200		/* allow local address & port reuse */

> pr/tests/sockopt.c. Ideally, the description should explain the difference
> from SO_REUSEADDR. Is this socket FAQ page accurate?
> http://www.unixguide.net/network/socketfaq/4.11.shtml
More accurate explanation is this one
http://stackoverflow.com/questions/14388706/socket-options-so-reuseaddr-and-so-reuseport-how-do-they-differ-do-they-mean-t

Generally, setting a flag SO_REUSEADDR on platforms without SO_REUSEPORT support, behaves similarly to a setting both SO_REUSEADDR and SO_REUSEPORT flags on the other platforms.

> 
> @@ +247,5 @@
> >          PRMcastRequest add_member;  /* add an IP group membership */
> >          PRMcastRequest drop_member; /* Drop an IP group membership */
> >          PRNetAddr mcast_if;         /* multicast interface address */
> > +        PRBool reuse_port;          /* Allow duplicate bindings on
> > +                                       platforms that support it */
> 
> The ordering of union members doesn't matter, so we should list this
> member with the other PRBool members. I suggest listing it right after
> reuse_addr.  I'll take care of this when I check in this patch.
> 
> ::: nsprpub/pr/tests/sockopt.c
> @@ +130,5 @@
> >                      fd = udp; 
> >                      data.value.broadcast = PR_TRUE;         
> >                      break;    
> >  #endif
> > +#if defined(DARWIN) || defined(FREEBSD) || defined(NETBSD) || defined(OPENBSD)
> 
> Can we include <sys/socket.h> and just test
> 
>     #ifdef SO_REUSEPORT
> 
> here?
> 

Yes, it'd be better.

> This change also points out a problem with PR_SockOpt_Reuseport:
> how would an NSPR user know if PR_SockOpt_Reuseport is supported?
> And what should the user do if PR_SockOpt_Reuseport is not supported?

He can try to set this option and if it is not supported PR_GetError will return PR_OPERATION_NOT_SUPPORTED_ERROR after PR_SetSocketOption call. I think that typical use case is that he sets PR_SockOpt_Reuseaddr first, then PR_SockOpt_Reuseport and ignores PR_OPERATION_NOT_SUPPORTED_ERROR on the last one.
Any updates on this?
I've done some manual TURN TCP testing with part 2 applied on OS X and built ASAN, and it seems to be behaving itself, although I have not succeeded in eliciting a WOULDBLOCK error. I'm going to do one final try push, and request checkin of part 2 if this all works out.

https://tbpl.mozilla.org/?tree=Try&rev=628c9ca7c2e8
Hmm, seeing a failure in multi_tcp_socket_unittest. I also see some weird mochitest failures that don't look like our fault, but need to check. Another try run, with just the wouldblock fixes. FWIW, I've run the entire patch queue with some ipfw nastiness on OS X ASAN, and it seems to work ok.

https://tbpl.mozilla.org/?tree=Try&rev=82c274d430b5
Comment on attachment 8412493 [details] [diff] [review]
Part 4: Add implementation of socket option PR_SockOpt_Reuseport into NSPR

Review of attachment 8412493 [details] [diff] [review]:
-----------------------------------------------------------------

::: nsprpub/pr/tests/sockopt.c
@@ +132,5 @@
>                      break;    
>  #endif
> +#if defined(DARWIN) || defined(FREEBSD) || defined(NETBSD) || defined(OPENBSD)
> +                case PR_SockOpt_Reuseport:
> +                    data.value.reuse_addr = PR_TRUE;

This should say data.value.reuse_port. I will fix this
when I check this in.
I edited Peter Tatrai's patch and checked it in to the NSPR upstream
repository: https://hg.mozilla.org/projects/nspr/rev/3db759f28449
Attachment #8412493 - Attachment is obsolete: true
Attachment #8412493 - Flags: review?(wtc)
Attachment #8427300 - Flags: review+
Attachment #8427300 - Flags: checkin+
Comment on attachment 8427300 [details] [diff] [review]
Part 4: Add implementation of socket option PR_SockOpt_Reuseport into NSPR, v2, by Peter Tatrai

Patch pushed to mozilla-inbound as part of NSPR_4_10_6_BETA1:
https://hg.mozilla.org/integration/mozilla-inbound/rev/fbab35dfe7e3
Comment on attachment 8416623 [details] [diff] [review]
Part 2: (Upliftable) Fix bugs where PR_WOULD_BLOCK_ERROR (or, in some cases, PR_NOT_CONNECTED_ERROR while a TCP socket was connecting) would cause sockets to be abandoned for no good reason (see also bug 985493 and 1001671).

Review of attachment 8416623 [details] [diff] [review]:
-----------------------------------------------------------------

I've done quite a bit of manual testing on this patch on ASAN, and I'm pretty sure it's good, although I don't know if it makes a large enough difference to uplift (I've had no luck reproducing the error condition it guards against).
Attachment #8416623 - Flags: checkin?
I'm ready to check this in once the tree opens
Attached patch Part 5: Add support for TCP ICE candidates. (obsolete) — — Splinter Review
Unrot.
Attachment #8419086 - Attachment is obsolete: true
Attachment #8416623 - Flags: checkin? → checkin+
Any idea what is going on here? This will need to be fixed before this lands.

https://tbpl.mozilla.org/php/getParsedLog.php?id=40162266&tree=Try
Flags: needinfo?(ptatrai)
(In reply to Byron Campen [:bwc] from comment #118)
> Any idea what is going on here? This will need to be fixed before this lands.
> 
> https://tbpl.mozilla.org/php/getParsedLog.php?id=40162266&tree=Try

The test failure is occasional. I'm trying to reproduce failure on TPBL cause I don't have access to OS X:
https://tbpl.mozilla.org/?tree=Try&rev=d8a0a4057442

Also I'll add a patch with some more logging to get deeper info about it. I consider using gtest_ting_buffer_dumper, but I'm doing something wrong because it doesn't dump anything. I manually caused test failure and nothing was dumped.
Flags: needinfo?(ptatrai)
Added a log dumper and moved a call to SetReadable(false) just behind a ASSERT_TRUE_WAIT(IsReadable()).
Attachment #8434058 - Flags: review?(docfaraday)
Try run with the latest patches:
https://tbpl.mozilla.org/?tree=Try&rev=16b3510d390b

Everything is green. I was trying to get a test failure for a week and got nothing.
Comment on attachment 8434058 [details] [diff] [review]
Part 6: Add log dumper to multi_tcp_socket_unittest

Review of attachment 8434058 [details] [diff] [review]:
-----------------------------------------------------------------

Mostly logging nits, but I definitely want to see the call to SetReadable() moved.

::: media/mtransport/test/multi_tcp_socket_unittest.cpp
@@ +210,5 @@
>  
>    void RecvData(nr_socket *expected_from, nr_socket *sent_to,
>                  const char *expected_data, size_t expected_len) {
>      ASSERT_TRUE_WAIT(IsReadable(), 1000);
> +    SetReadable(false);

Hmm. We should probably only set this on STS, so move this to RecvData_s. Although, if the polling implementation is edge-triggered, even this won't work. Let me try to find out.

::: media/mtransport/third_party/nICEr/src/net/nr_socket_multi_tcp.c
@@ +113,5 @@
>  
>      _status=0;
>    abort:
> +    if (_status)
> +      r_log(LOG_ICE,LOG_DEBUG,"%s:%d function %s failed with error %d",__FILE__,__LINE__,__FUNCTION__,_status);

Couldn't hurt to log addr->as_string here.

@@ +186,5 @@
>      }
>      _status=0;
>    abort:
> +    if (_status)
> +      r_log(LOG_ICE,LOG_DEBUG,"%s:%d function %s failed with error %d",__FILE__,__LINE__,__FUNCTION__,_status);

Couldn't hurt to log addr->as_string. Might also be useful to log stun_server_addr if nr_socket_connect fails above.

@@ +349,5 @@
>  
>      _status=0;
>    abort:
>      if (_status && tcp_sock_ctx) {
> +      r_log(LOG_ICE,LOG_DEBUG,"%s:%d function %s failed with error %d",__FILE__,__LINE__,__FUNCTION__,_status);

We probably want to log something if tcp_sock_ctx is not set, and if it is set, we probably want to log its addr.

@@ +374,5 @@
>  
>      _status=0;
>    abort:
> +    if (_status)
> +      r_log(LOG_ICE,LOG_DEBUG,"%s:%d function %s failed with error %d",__FILE__,__LINE__,__FUNCTION__,_status);

Let's log addr->as_string too

@@ +430,5 @@
>  
>      _status=0;
>    abort:
> +    if (_status)
> +      r_log(LOG_ICE,LOG_DEBUG,"%s:%d function %s failed with error %d",__FILE__,__LINE__,__FUNCTION__,_status);

Log to->as_string here.

@@ +459,5 @@
>          }
>  
>          TAILQ_REMOVE(&sock->sockets, tcpsock, entry);
>          nr_tcp_socket_ctx_destroy(&tcpsock);
> +        r_log(LOG_ICE,LOG_DEBUG,"%s:%d function %s failed with error %d",__FILE__,__LINE__,__FUNCTION__,_status);

Let's log from->as_string and sock->addr.as_string here.

@@ +514,5 @@
>  
>      _status=0;
>  abort:
> +    if (_status)
> +      r_log(LOG_ICE,LOG_DEBUG,"%s:%d function %s failed with error %d",__FILE__,__LINE__,__FUNCTION__,_status);

Log addr->as_string

@@ +568,5 @@
>  
>      _status=0;
>    abort:
> +    if (_status)
> +      r_log(LOG_ICE,LOG_DEBUG,"%s:%d function %s failed with error %d",__FILE__,__LINE__,__FUNCTION__,_status);

Let's log sock->addr.as_string too.
Attachment #8434058 - Flags: review?(docfaraday) → review-
Trying this test out on my machine, I'm seeing very frequent failures. Let me try to figure out why.
I'm seeing very strange behavior here. SYN going unanswered (almost every time in MultiTcpSocketTest.TestTwoActiveBidirectionalTransmit, on the second attempted connection to the same listen port), SYN followed immediately by a RST in the same direction (occasionally in MultiTcpSocketTest.TestTwoPassiveBidirectionalTransmit on the second connection attempt from the same port), and lots of errors when the test is run twice in rapid succession (maybe this has to do with reuseaddr/port?).
I've pared things down to a test-case involving only NrSocket. And trying to connect two TCP clients to a single TCP server socket seems to fail every time; only a single readable callback is ever fired on the server socket (for client 2). Strangely, both of the client sockets see writable callbacks. The pcap shows both TCP handshakes, but the server sends RST to client 1 as soon as the ACK for client 2 comes in. I've attempted to find places in the codebase that actually use PR_Listen/PR_Accept, and have only found a handful of places:

Some mochitest-related stuff here: http://dxr.mozilla.org/mozilla-central/source/testing/mochitest/ssltunnel/ssltunnel.cpp#1364

A server socket class (the only place I can find that uses this is some js-based unit-tests, looking into these to see if they're easy to tweak) here: http://dxr.mozilla.org/mozilla-central/source/netwerk/base/src/nsServerSocket.cpp#362

A unit-test related class here: http://dxr.mozilla.org/mozilla-central/source/security/manager/ssl/tests/unit/tlsserver/lib/TLSServer.cpp#280

I get the sinking feeling that back-to-back connections has not really been tested.
The JS-based unit-tests seem to only do a single connection at a time.
Ok, so we definitely need to increase the backlog size, but I'm still having trouble with address in use errors raised when we try to _connect_ (not when we try to _bind_). I suspect that we're running into a conflict on the 5-tuple, which no amount of SO_REUSEADDR/SO_REUSEPORT will save us from, although looking at pcaps both sides have sent FIN, so I'm not sure why the state would still be around. We might be able to work around this by picking a random starting port instead of 40000. Let me try.
Let's see if this works (needinfo self to check back on try push):

https://tbpl.mozilla.org/?tree=Try&rev=b6a371045b43
Flags: needinfo?(docfaraday)
(In reply to Byron Campen [:bwc] from comment #129)
> Ok, so we definitely need to increase the backlog size, but I'm still having
> trouble with address in use errors raised when we try to _connect_ (not when
> we try to _bind_). I suspect that we're running into a conflict on the
> 5-tuple, which no amount of SO_REUSEADDR/SO_REUSEPORT will save us from,
> although looking at pcaps both sides have sent FIN, so I'm not sure why the
> state would still be around. We might be able to work around this by picking
> a random starting port instead of 40000. Let me try.

Which platform are you testing on?
If it is Mac: are you aware that Apple has a bug in the Mavericks TCP/IP stack where it leaves closed connection hanging around?
(In reply to Nils Ohlmeier [:drno] from comment #131)
> (In reply to Byron Campen [:bwc] from comment #129)
> > Ok, so we definitely need to increase the backlog size, but I'm still having
> > trouble with address in use errors raised when we try to _connect_ (not when
> > we try to _bind_). I suspect that we're running into a conflict on the
> > 5-tuple, which no amount of SO_REUSEADDR/SO_REUSEPORT will save us from,
> > although looking at pcaps both sides have sent FIN, so I'm not sure why the
> > state would still be around. We might be able to work around this by picking
> > a random starting port instead of 40000. Let me try.
> 
> Which platform are you testing on?
> If it is Mac: are you aware that Apple has a bug in the Mavericks TCP/IP
> stack where it leaves closed connection hanging around?

I'm on 10.8 still.
Comment on attachment 8416624 [details] [diff] [review]
Part 3: (Upliftable) Fix bug where we weren't performing a null check in dtor for NrSocket (see also bug 987380).

The uplift is on Monday, so it doesn't look likely that this bug is going to make it onto Aurora in time. Please request beta approval on this patch (and aurora/beta approval on other patches as needed) once everything in this bug is ready for uplift.
Attachment #8416624 - Flags: approval-mozilla-aurora+
Fix some intermittent failures.
Comment on attachment 8437004 [details] [diff] [review]
Part 7: Fix some intermittent failures in multi_tcp_socket_unittest.

Review of attachment 8437004 [details] [diff] [review]:
-----------------------------------------------------------------

::: media/mtransport/test/multi_tcp_socket_unittest.cpp
@@ +76,5 @@
>    void Create_s(nr_socket_tcp_type tcp_type, nr_socket *stun_server_socket,
>                  int use_framing, nr_socket **sock) {
>      nr_transport_addr local;
> +    // Get start of port range for test
> +    static unsigned short port_s = GetRandomPort();

Subsequent runs of this test case failed because of EADDRINUSE in _connect_. This is probably because of conflicts in the 5-tuple, which SO_REUSEADDR/SO_RUESPORT do nothing about. So, we pick a random starting point, and use subsequent ports in the ephemeral range.

@@ +126,5 @@
>      nr_transport_addr addr;
>      int r=nr_socket_getaddr(sock, &addr);
>      ASSERT_EQ(0, r);
>      printf("Listen on %s\n", addr.as_string);
> +    r = nr_socket_listen(sock, 5);

Lots of places in this test where we need a larger backlog.

@@ +199,5 @@
>    }
>  
>    void RecvData_s(nr_socket *expected_from, nr_socket *sent_to,
>                    const char *expected_data, size_t expected_len) {
> +    SetReadable(false);

Make this less racy.
Attachment #8437004 - Flags: review?(ekr)
I think the one remaining thing we need to fix is the backlog used in nr_ice_component_create_tcp_host_candidate; 1 is probably too low.
Flags: needinfo?(docfaraday) → needinfo?(ptatrai)
Comment on attachment 8437004 [details] [diff] [review]
Part 7: Fix some intermittent failures in multi_tcp_socket_unittest.

Review of attachment 8437004 [details] [diff] [review]:
-----------------------------------------------------------------

I have not reviewed the rest of the patch but this should be fine with the change
below.

::: media/mtransport/test/multi_tcp_socket_unittest.cpp
@@ +64,5 @@
> +  static uint16_t GetRandomPort() {
> +    struct timeval now;
> +    gettimeofday(&now, NULL);
> +    srand(now.tv_usec);
> +    return rand() % 65536;

Why invent new PRNG stuff. NSS is live and you can just get random bytes out of that. Sure it's overkill, but it should be fine
Attachment #8437004 - Flags: review?(ekr) → review+
Target Milestone: mozilla32 → mozilla33
Use NSS to choose a random starting port instead of rand()
Attachment #8437004 - Attachment is obsolete: true
Comment on attachment 8437261 [details] [diff] [review]
Part 7: Fix some intermittent failures in multi_tcp_socket_unittest.

Review of attachment 8437261 [details] [diff] [review]:
-----------------------------------------------------------------

Carry forward r=ekr
Attachment #8437261 - Flags: review+
Unrot. Please make previous patch obsolete.
Attachment #8439210 - Attachment description: Bug 891551 - Part 5: Add support for TCP ICE candidates. r=bwc → Part 5: Add support for TCP ICE candidates. r=bwc
- review comments addressed
- removed change of SetReadable as it is correctly addressed in Part 7 commmit
Attachment #8434058 - Attachment is obsolete: true
Flags: needinfo?(ptatrai)
unrot, please make previous version obsolete
Attachment #8439219 - Flags: review?(docfaraday)
Comment on attachment 8439219 [details] [diff] [review]
Part 8: Increase and make backlog value configureable.

Review of attachment 8439219 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me.
Attachment #8439219 - Flags: review?(docfaraday) → review+
Attachment #8430232 - Attachment is obsolete: true
Comment on attachment 8439210 [details] [diff] [review]
Part 5: Add support for TCP ICE candidates. r=bwc

Review of attachment 8439210 [details] [diff] [review]:
-----------------------------------------------------------------

Carry forward r+
Attachment #8439210 - Flags: review+
Comment on attachment 8439213 [details] [diff] [review]
Part 6: Add log dumper to multi_tcp_socket_unittest

Review of attachment 8439213 [details] [diff] [review]:
-----------------------------------------------------------------

Just typos, but they're pretty serious.

::: media/mtransport/third_party/nICEr/src/net/nr_socket_multi_tcp.c
@@ +177,5 @@
>        nr_transport_addr stun_server_addr;
>  
>        nr_transport_addr_copy(&stun_server_addr, &stun_server->u.addr);
>        r=nr_socket_connect(tcp_socket_ctx->inner, &stun_server_addr);
> +      if (!r && r!=R_WOULDBLOCK) {

This shouldn't be '!r', completely missed this first time around, sorry about that...

@@ +380,5 @@
>  
>      _status=0;
>    abort:
> +    if (_status)
> +      r_log(LOG_ICE,LOG_DEBUG,"%s:%d function %s(addr:%) failed with error %d",__FILE__,__LINE__,__FUNCTION__,addr->as_string,_status);

This should be 'addr:%s'

@@ +436,5 @@
>  
>      _status=0;
>    abort:
> +    if (_status)
> +      r_log(LOG_ICE,LOG_DEBUG,"%s:%d function %s(to:%) failed with error %d",__FILE__,__LINE__,__FUNCTION__,to->as_string,_status);

This should be 'to:%s'

@@ +465,5 @@
>          }
>  
>          TAILQ_REMOVE(&sock->sockets, tcpsock, entry);
>          nr_tcp_socket_ctx_destroy(&tcpsock);
> +        r_log(LOG_ICE,LOG_DEBUG,"%s:%d function %s(from:%) failed with error %d",__FILE__,__LINE__,__FUNCTION__,from->as_string,_status);

Another similar typo here.

@@ +520,5 @@
>  
>      _status=0;
>  abort:
> +    if (_status)
> +      r_log(LOG_ICE,LOG_DEBUG,"%s:%d function %s(addr:%) failed with error %d",__FILE__,__LINE__,__FUNCTION__,addr->as_string,_status);

And here.
Attachment #8439213 - Flags: review-
Attachment #8437261 - Attachment is obsolete: true
Attachment #8439215 - Attachment is obsolete: true
Comment on attachment 8439265 [details] [diff] [review]
Part 7: Fix some intermittent failures in multi_tcp_socket_unittest.

Review of attachment 8439265 [details] [diff] [review]:
-----------------------------------------------------------------

Carry forward r=ekr
Attachment #8439265 - Flags: review+
(In reply to Byron Campen [:bwc] from comment #147)
> Comment on attachment 8439213 [details] [diff] [review]
> Part 6: Add log dumper to multi_tcp_socket_unittest
> 
> Review of attachment 8439213 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Just typos, but they're pretty serious.
> 

Oops, I'm really sorry. I was in hurry and didn't double-check it first.
Attachment #8439213 - Attachment is obsolete: true
Attachment #8439275 - Flags: review?(docfaraday)
Comment on attachment 8439275 [details] [diff] [review]
Part 6: Add log dumper to multi_tcp_socket_unittest

Review of attachment 8439275 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm
Attachment #8439275 - Flags: review?(docfaraday) → review+
Ok, going to do one last try push now.

https://tbpl.mozilla.org/?tree=Try&rev=96176709dc69
Lots of known intermittents, but no failures that look new.
Priority: -- → P1
So what do we need to do wrt sec review here?
Flags: needinfo?(cdiehl)
We did not set up a separate feature sec-review for this feature. The sec-review flag mainly indicates that fuzzing will be a sufficient task for this feature. 
My pipeline is currently stuffed with NFC related work and other fuzzing projects. I did some early fuzzing some weeks ago and adjusted the fuzzing module for the SDP but a more detailed look will only happen in the next quarter.
Flags: needinfo?(cdiehl)
(In reply to Christoph Diehl [:cdiehl] from comment #156)
> We did not set up a separate feature sec-review for this feature. The
> sec-review flag mainly indicates that fuzzing will be a sufficient task for
> this feature. 
> My pipeline is currently stuffed with NFC related work and other fuzzing
> projects. I did some early fuzzing some weeks ago and adjusted the fuzzing
> module for the SDP but a more detailed look will only happen in the next
> quarter.

   So, does that mean we can go ahead and land this?
^
Flags: needinfo?(cdiehl)
That's what we did in the past with other features. Does not mean I feel comfortable with that decision.
Flags: needinfo?(cdiehl)
(In reply to Christoph Diehl [:cdiehl] from comment #159)
> That's what we did in the past with other features. Does not mean I feel
> comfortable with that decision.

Do you have another proposal? It seems impractical to have features wait
3 months for landing because security can't get to them.
Flags: needinfo?(cdiehl)
Yes, developers can use our tools to test their features before landing them.
From this bug ekr and jesup have access to Framboise and Peach which we would use to test such features.
Flags: needinfo?(cdiehl)
Hmm... I'm not sure how realistic that is.

Do you have an example program which demonstrates fuzzing ICE with
these tools?
Flags: needinfo?(cdiehl)
The fuzzer is available here:
https://github.com/posidron/peach/tree/experimental

The specific pit we used till now is located here:
https://github.com/posidron/pits/blob/master/Protocols/SDP/sdp.xml

Samples are located here:
(Please use master for the "Resources" in case the jgit step described in the README.md does not work)
https://github.com/posidron/peach/tree/master/Resources/Samples/network/sdp

A target configuration file for Firefox is here:
https://github.com/posidron/pits/blob/master/Targets/Firefox/firefox.xml

A typical command to run the SDP module would be:
./peach.py -pit Pits/Protocols/SDP.xml -target Pits/Targets/firefox.xml -run Browser

I admit that it requires a small learning curve but it is worth the time in my opinion, especially because developers have a more in depth knowledge of the underlying feature than we have.
Flags: needinfo?(cdiehl)
This appears to only fuzz SDP. What's required here is fuzzing STUN and ICE, which is
a different problem. Do you have anything that does that?
Byron, please take a look at this and see if it's useful.
Flags: needinfo?(docfaraday)
All of those github links are giving me a 404, and searching for "posidron" yields no results.
Flags: needinfo?(docfaraday)
Ok, found your user page, but not seeing pits or peach there.
Looks like I need permission to use these repos?
Flags: needinfo?(cdiehl)
My github name is docfaraday.
I have added you to both repositories "peach" and "pits". The pits are confidential and are not allowed to be shared.
Flags: needinfo?(cdiehl)
Attached patch Part 5: Add support for TCP ICE candidates. (obsolete) — — Splinter Review
Unrot
Attachment #8439210 - Attachment is obsolete: true
Attachment #8439219 - Attachment is obsolete: true
Attachment #8445270 - Flags: review+
Attachment #8445272 - Flags: review+
Target Milestone: mozilla33 → mozilla34
We still want to land this in Fx 34, but other critical bugs are competing for priority.
Priority: P1 → P2
@Maire Reavy, I have checked with Fx 34, this bug is still not fixed. When the patch will be landed?
Assignee: docfaraday → drno
Attached patch Part 5: Add support for TCP ICE candidates. (obsolete) — — Splinter Review
Un-bitrotted.

Note: with this patch applied ice loopback seems to be broken and I get intermittent SIGPIPE errors when running the ice unittest. I still need to solve these.
Attachment #8445270 - Attachment is obsolete: true
un-bitrotted.
Attachment #8445272 - Attachment is obsolete: true
Attachment #8574449 - Flags: review+
Attachment #8574448 - Flags: review+
Depends on: 1136051
(In reply to Nils Ohlmeier [:drno] from comment #177)
> Note: with this patch applied ice loopback seems to be broken and I get
> intermittent SIGPIPE errors when running the ice unittest. I still need to
> solve these.

Intermittent error will get solved through bug 1136051.
Loopback support is broken as the TCP sockets try to connect to STUN server from the loopback interface. Unfortunately the signaling_unittest now these days utilize ICE over loopback. So this needs to get fixed.
ipc tests are hitting an assertion:

[Child 66985] WARNING: 'addr->protocol != IPPROTO_UDP', file /Users/nohlmeier/src/mozilla-central/media/mtransport/nr_socket_prsock.cpp, line 1004
Assertion failure: false (NrSocket over TCP is not e10s ready, see Bug 950660), at /Users/nohlmeier/src/mozilla-central/media/mtransport/nr_socket_prsock.cpp:1005
I have a patch for dealing with loopback, but as it turns out the code in general here is not happy if I have my VMware running, because it opens sockets on the network interfaces for VMware and fails to connect these to the given STUN server on the Internet. So I probably need to adjust my patch so that connection failures to a given STUN server are not treated as final errors during gathering.

Another interesting observation: Mozilla's STUN server does not support TCP, which means Firefox will fail to gather TCP server reflexive TCP candidates with the current default STUN server.
(In reply to Nils Ohlmeier [:drno] from comment #181)
> I have a patch for dealing with loopback, but as it turns out the code in
> general here is not happy if I have my VMware running, because it opens
> sockets on the network interfaces for VMware and fails to connect these to
> the given STUN server on the Internet. So I probably need to adjust my patch
> so that connection failures to a given STUN server are not treated as final
> errors during gathering.

You will likely need to open one socket per local address per STUN server


> Another interesting observation: Mozilla's STUN server does not support TCP,
> which means Firefox will fail to gather TCP server reflexive TCP candidates
> with the current default STUN server.

Did't we disable the default STUN server.
(In reply to Eric Rescorla (:ekr) from comment #182)
> (In reply to Nils Ohlmeier [:drno] from comment #181)
> > I have a patch for dealing with loopback, but as it turns out the code in
> > general here is not happy if I have my VMware running, because it opens
> > sockets on the network interfaces for VMware and fails to connect these to
> > the given STUN server on the Internet. So I probably need to adjust my patch
> > so that connection failures to a given STUN server are not treated as final
> > errors during gathering.
> 
> You will likely need to open one socket per local address per STUN server

The current patch does that already. I meant that each of these a connection failure needs to be a soft failure, which does not result a failure for gathering.
 
> > Another interesting observation: Mozilla's STUN server does not support TCP,
> > which means Firefox will fail to gather TCP server reflexive TCP candidates
> > with the current default STUN server.
> 
> Did't we disable the default STUN server.

We disabled them in the mochitests. Does not look like they are disabled in the unit tests. (And I think this combination makes sense as the mochitests do not test specific behaviors with STUN servers (they should work without STUN servers), but some of the unit tests are specifically written for testing behavior with STUN servers.)

Just checked with a fresh profile on latest Nightly: it still has 'stun.services.mozilla.com' set as its default stun server.
See Also: → 1143827
Attached patch Part 5: Add support for TCP ICE candidates. (obsolete) — — Splinter Review
Unrot.
Attachment #8574448 - Attachment is obsolete: true
Attachment #8578213 - Flags: review+
Attached patch Part 5: Add support for TCP candiates (obsolete) — — Splinter Review
un-bitrot
Attachment #8578213 - Attachment is obsolete: true
Attachment #8587032 - Flags: review+
Attachment #8587032 - Attachment description: p1_ICE_TCP.patch → Part 5: Add support for TCP candiates
Attached patch Part 5: Add support for TCP candiates (obsolete) — — Splinter Review
un-rot
Attachment #8587032 - Attachment is obsolete: true
Attachment #8596421 - Flags: review+
Attached patch Part 10: ICE TCP user pref (obsolete) — — Splinter Review
Added user pref media.peerconnection.ice.tcp to allow turning off ICE TCP
Attached patch Part 10: ICE TCP user pref — — Splinter Review
Fixed and added ice unit tests.
Attachment #8596423 - Attachment is obsolete: true
Un-bitrot.
Attachment #8596421 - Attachment is obsolete: true
Attachment #8605465 - Flags: review+
Had a successful Hello call between Chrome and Fx with ICE TCP in use (note: for unknown reasons apprtc seems to drop ICE TCP candidates).
backlog: --- → webRTC+
Rank: 25
Whiteboard: [WebRTC][ft:webrtc]
Attachment #8587033 - Flags: review?(docfaraday)
Attachment #8597087 - Flags: review?(docfaraday)
Comment on attachment 8587033 [details] [diff] [review]
Part 9: ignore connection failures during gathering

Review of attachment 8587033 [details] [diff] [review]:
-----------------------------------------------------------------

A couple of minor things.

::: media/mtransport/test/ice_unittest.cpp
@@ +61,3 @@
>  const std::string kDefaultStunServerHostname(
> +    (char *)"stun.stunprotocol.org");
> +//    (char *)"stun.services.mozilla.com");

We either need to revert this, or turn it into a single test-case. I prefer the latter.

::: media/mtransport/third_party/nICEr/src/net/nr_socket_multi_tcp.c
@@ +187,1 @@
>          r_log(LOG_ICE,LOG_DEBUG,"%s:%d function %s connect to STUN server(addr:%s) failed with error %d",__FILE__,__LINE__,__FUNCTION__,stun_server_addr.as_string,_status);

Let's make this a warning.
Attachment #8587033 - Flags: review?(docfaraday)
Comment on attachment 8597087 [details] [diff] [review]
Part 10: ICE TCP user pref

Review of attachment 8597087 [details] [diff] [review]:
-----------------------------------------------------------------

Mostly minor stuff, although there was a test-case that looked weird to me.

::: media/mtransport/nricectx.cpp
@@ +377,5 @@
>  RefPtr<NrIceCtx> NrIceCtx::Create(const std::string& name,
>                                    bool offerer,
>                                    bool set_interface_priorities,
> +                                  bool allow_loopback,
> +                                  bool tcp_enabled) {

Heads-up, this will rot against the IPV6 patch. Just so you know.

::: media/mtransport/test/ice_unittest.cpp
@@ +65,4 @@
>  //const std::string kDefaultStunServerAddress((char *)"23.21.150.121");
>  const std::string kDefaultStunServerHostname(
> +//    (char *)"stun.stunprotocol.org");
> +    (char *)"naboo.rfc3261.net");

More stuff that needs to be cleaned up.

@@ +255,2 @@
>        name_(name),
> +      ice_ctx_(NrIceCtx::Create(name, offerer, set_priorities, allow_loopback, enable_tcp)),

Wrap to 80

@@ +1205,3 @@
>      if (!initted_) {
> +      p1_ = new IceTestPeer("P1", true, set_priorities, allow_loopback, enable_tcp);
> +      p2_ = new IceTestPeer("P2", false, set_priorities, allow_loopback, enable_tcp);

Wrap to 80

@@ +1784,5 @@
> +  SetCandidateFilter(IsTcpCandidate);
> +  SetExpectedTypes(NrIceCandidate::Type::ICE_HOST,
> +    NrIceCandidate::Type::ICE_HOST, kNrIceTransportTcp);
> +  ASSERT_TRUE_WAIT(p1_->ready_ct() == 0 && p2_->ready_ct() == 0,
> +                   kDefaultTimeout);

Doesn't ready_ct() start out at 0?

::: media/mtransport/third_party/nICEr/src/ice/ice_component.c
@@ +421,5 @@
>        }
>  
> +      if ((r=NR_reg_get_char(NR_ICE_REG_ICE_TCP_ENABLE, &tcp_enabled))) {
> +        if (r == R_NOT_FOUND)
> +          tcp_enabled = 1;

I feel like this (and the code in NrIceCtx) would read better if the setting was NR_ICE_REG_ICE_TCP_DISABLE.

@@ +427,4 @@
>            ABORT(r);
>        }
>  
> +      if (tcp_enabled) {

Can you push this patchset to reviewboard? Maybe it will do a better job of displaying the diff.

::: modules/libpref/init/all.js
@@ -375,4 @@
>  pref("media.peerconnection.use_document_iceservers", true);
>  pref("media.peerconnection.identity.enabled", true);
>  pref("media.peerconnection.identity.timeout", 10000);
> -pref("media.peerconnection.ice.loopback", false); // Set only for testing in offline environments.

You noticed this dupe too.
Attachment #8597087 - Flags: review?(docfaraday)
Bug 891551 - Part 5: Add support for TCP ICE candidates. r=bwc
Bug 891551 - Part 6: Add log dumper to multi_tcp_socket_unittest
Bug 891551 - Part 7: Fix some intermittent failures in multi_tcp_socket_unittest. r=ekr
Bug 891551 - Part 8: Increase and make backlog value configurable. r=bwc
Bug 891551 - Part 9: ignore initial socket connect failures during gathering
Attachment #8615498 - Flags: review?(docfaraday)
Bug 891551 - Part 10:added user pref to turn of ICE TCP
Attachment #8615499 - Flags: review?(docfaraday)
Attachment #8615498 - Flags: review?(docfaraday)
Comment on attachment 8615498 [details]
MozReview Request: Bug 891551 - Part 9: ignore initial socket connect failures during gathering. r=jesup

https://reviewboard.mozilla.org/r/10269/#review9003

::: media/mtransport/test/ice_unittest.cpp:63
(Diff revision 1)
> -const std::string kDefaultStunServerAddress((char *)"23.21.150.121");
> +const std::string kDefaultStunServerAddress((char *)"107.23.150.92");
> +//const std::string kDefaultStunServerAddress((char *)"23.21.150.121");
>  const std::string kDefaultStunServerHostname(
> -    (char *)"stun.services.mozilla.com");
> +    (char *)"stun.stunprotocol.org");
> +//    (char *)"stun.services.mozilla.com");

This needs to be cleaned up.

::: media/mtransport/third_party/nICEr/src/net/nr_socket_multi_tcp.c:187
(Diff revision 1)
> +        r_log(LOG_ICE,LOG_DEBUG,"%s:%d function %s connect to STUN server(addr:%s) failed with error %d",__FILE__,__LINE__,__FUNCTION__,stun_server_addr.as_string,_status);

I'd make this a warning.

::: media/mtransport/third_party/nICEr/src/net/nr_socket_multi_tcp.c:186
(Diff revision 1)
> +      } else {
> +        r_log(LOG_ICE,LOG_DEBUG,"%s:%d function %s connect to STUN server(addr:%s) failed with error %d",__FILE__,__LINE__,__FUNCTION__,stun_server_addr.as_string,_status);
> +      }

So if we fail here, we have a tcp_socket_ctx in sock->sockets that has not been initialized with nr_tcp_socket_ctx_initialize; is this going to be ok?
Attachment #8615499 - Flags: review?(docfaraday)
Comment on attachment 8615499 [details]
MozReview Request: Bug 891551 - Part 10: added user pref to turn of ICE TCP. r=mt

https://reviewboard.mozilla.org/r/10271/#review9009

::: media/mtransport/nricectx.cpp:377
(Diff revision 1)
>  RefPtr<NrIceCtx> NrIceCtx::Create(const std::string& name,
>                                    bool offerer,
>                                    bool set_interface_priorities,
> -                                  bool allow_loopback) {
> +                                  bool allow_loopback,
> +                                  bool tcp_enabled) {

Heads-up, this will rot against the IPV6 patch. Just so you know.

::: media/mtransport/test/ice_unittest.cpp:63
(Diff revision 1)
> -const std::string kDefaultStunServerAddress((char *)"107.23.150.92");
> +//const std::string kDefaultStunServerAddress((char *)"52.1.30.94");
> +const std::string kDefaultStunServerAddress((char *)"85.214.41.172");
>  //const std::string kDefaultStunServerAddress((char *)"23.21.150.121");
>  const std::string kDefaultStunServerHostname(
> -    (char *)"stun.stunprotocol.org");
> +//    (char *)"stun.stunprotocol.org");
> +    (char *)"naboo.rfc3261.net");
>  //    (char *)"stun.services.mozilla.com");

Cleanup needed here.

::: media/mtransport/test/ice_unittest.cpp:256
(Diff revision 1)
> -      ice_ctx_(NrIceCtx::Create(name, offerer, set_priorities, allow_loopback)),
> +      ice_ctx_(NrIceCtx::Create(name, offerer, set_priorities, allow_loopback, enable_tcp)),

Wrap to 80

::: media/mtransport/test/ice_unittest.cpp:1206
(Diff revision 1)
> -      p1_ = new IceTestPeer("P1", true, set_priorities, allow_loopback);
> -      p2_ = new IceTestPeer("P2", false, set_priorities, allow_loopback);
> +      p1_ = new IceTestPeer("P1", true, set_priorities, allow_loopback, enable_tcp);
> +      p2_ = new IceTestPeer("P2", false, set_priorities, allow_loopback, enable_tcp);

Wrap to 80

::: media/mtransport/test/ice_unittest.cpp:1787
(Diff revision 1)
> +  ASSERT_TRUE_WAIT(p1_->ready_ct() == 0 && p2_->ready_ct() == 0,
> +                   kDefaultTimeout);

Doesn't ready_ct() start out at 0?

::: media/mtransport/third_party/nICEr/src/ice/ice_component.c:423
(Diff revision 1)
> +      if ((r=NR_reg_get_char(NR_ICE_REG_ICE_TCP_ENABLE, &tcp_enabled))) {
> +        if (r == R_NOT_FOUND)
> +          tcp_enabled = 1;
> +        else
> +          ABORT(r);
> +      }

I feel like this (and the code in NrIceCtx) would read better if the setting was NR_ICE_REG_ICE_TCP_DISABLE.
I went ahead and copied my comments from the previous review into reviewboard. This diff displayed much better than on splinter, thanks.
(In reply to Byron Campen [:bwc] from comment #200)
> Comment on attachment 8615498 [details]
> MozReview Request: Bug 891551 - Part 9: ignore initial socket connect
> failures during gathering
> 
> https://reviewboard.mozilla.org/r/10269/#review9003
> ::: media/mtransport/third_party/nICEr/src/net/nr_socket_multi_tcp.c:186
> (Diff revision 1)
> > +      } else {
> > +        r_log(LOG_ICE,LOG_DEBUG,"%s:%d function %s connect to STUN server(addr:%s) failed with error %d",__FILE__,__LINE__,__FUNCTION__,stun_server_addr.as_string,_status);
> > +      }
> 
> So if we fail here, we have a tcp_socket_ctx in sock->sockets that has not
> been initialized with nr_tcp_socket_ctx_initialize; is this going to be ok?

nr_tcp_socket_ctx_initialize() adds the remote addr and sets up the FD read callbacks. So we are left with an unconnected socket without any callbacks.
Should I rename nr_tcp_socket_ctx_initialize to something like nr_tcp_socket_ctx_set_callback()?
I could also remove the dead tcp_socket_ctx in this case. What do you think Byron?
Flags: needinfo?(docfaraday)
Removing the dead socket is appealing, but we'd have to be careful not to violate some invariant somewhere else, since the caller is going to think everything is ok. It almost feels like we should be returning an error code that the caller doesn't propagate up.
Flags: needinfo?(docfaraday)
Comment on attachment 8615494 [details]
MozReview Request: Bug 891551 - Part 5: Add support for TCP ICE candidates. r=bwc,jesup

Bug 891551 - Part 5: Add support for TCP ICE candidates. r=bwc
Comment on attachment 8615495 [details]
MozReview Request: Bug 891551 - Part 6: Add log dumper to multi_tcp_socket_unittest. r=bwc

Bug 891551 - Part 6: Add log dumper to multi_tcp_socket_unittest. r=bwc
Attachment #8615495 - Attachment description: MozReview Request: Bug 891551 - Part 6: Add log dumper to multi_tcp_socket_unittest → MozReview Request: Bug 891551 - Part 6: Add log dumper to multi_tcp_socket_unittest. r=bwc
Comment on attachment 8615496 [details]
MozReview Request: Bug 891551 - Part 7: Fix some intermittent failures in multi_tcp_socket_unittest. r=ekr

Bug 891551 - Part 7: Fix some intermittent failures in multi_tcp_socket_unittest. r=ekr
Comment on attachment 8615497 [details]
MozReview Request: Bug 891551 - Part 8: Increase and make backlog value configurable. r=bwc

Bug 891551 - Part 8: Increase and make backlog value configurable. r=bwc
Comment on attachment 8615498 [details]
MozReview Request: Bug 891551 - Part 9: ignore initial socket connect failures during gathering. r=jesup

Bug 891551 - Part 9: ignore initial socket connect failures during gathering
Comment on attachment 8615499 [details]
MozReview Request: Bug 891551 - Part 10: added user pref to turn of ICE TCP. r=mt

Bug 891551 - Part 10: added user pref to turn of ICE TCP
Attachment #8615499 - Attachment description: MozReview Request: Bug 891551 - Part 10:added user pref to turn of ICE TCP → MozReview Request: Bug 891551 - Part 10: added user pref to turn of ICE TCP
Bug 891551 - Part 11: fix a couple of issues from part 5
Attachment #8621463 - Flags: review?(docfaraday)
https://reviewboard.mozilla.org/r/10271/#review9583

> Doesn't ready_ct() start out at 0?

Yes you are right.
The point here is to verify that the tcp disabled option suppresses TCP gathering, but that makes it obviously more a gathering test. I'll move the test.
Comment on attachment 8621463 [details]
MozReview Request: Bug 891551 - Part 11: fix a couple of issues from part 5. r=bwc

https://reviewboard.mozilla.org/r/10981/#review9611

::: media/mtransport/nr_socket_prsock.cpp:556
(Diff revision 1)
> +        r_log(LOG_GENERIC, LOG_WARNING, "Couldn't set Nodelay socket option");

Let's log |status| here.

::: media/mtransport/nr_socket_prsock.cpp:876
(Diff revision 1)
> +    r_log(LOG_GENERIC, LOG_WARNING, "Failed to set Nodelay on accepted socket");

Let's log |status| here.

::: media/mtransport/nr_socket_prsock.cpp:900
(Diff revision 1)
>      delete sock;
> +    if (prfd) {
> +      PR_Close(prfd);
> +    }

If sock->fd_ == prfd, ~NrSocket() will call PR_close(prfd) already. I suggest that you ensure that there is no ABORT between where we create prfd, and where we assign it to sock->fd_ (you just need to move the |nr_praddr_to_transport_addr| call I think)

::: media/mtransport/nr_socket_prsock.cpp:879
(Diff revision 1)
> +  if ((r=nr_socket_create_int(static_cast<void *>(sock), sock->vtbl(), sockp)))
> +    ABORT(r);

Any particular reason you moved this? It seems that we don't do any teardown on |sockp| if we encounter an error later.

::: media/mtransport/test/ice_unittest.cpp:1103
(Diff revision 1)
>                                       uint16_t fake_port) {

Fixup indent.

::: media/mtransport/test/multi_tcp_socket_unittest.cpp:193
(Diff revision 1)
> -    printf("Send %s -> %s\n", addr_from.as_string, addr_to.as_string);
> +    printf("Sending %lu bytes %s -> %s\n", len, addr_from.as_string, addr_to.as_string);

size_t is not guaranteed to be the same size as an unsigned long. Suggest an explicit cast (unsigned long)len.

::: media/mtransport/test/multi_tcp_socket_unittest.cpp:217
(Diff revision 1)
> -    printf("Receive %s <- %s\n", addr_to.as_string, addr_from.as_string);
> +    printf("Receiving %lu bytes %s <- %s\n", expected_len, addr_to.as_string,

Similar printf pedantry here.

::: media/mtransport/test/multi_tcp_socket_unittest.cpp:299
(Diff revision 1)
> -TEST_F(MultiTcpSocketTest,  TestTwoActiveBidirectionalTransmit) {
> -  const char data[] = "TestTwoActiveBidirectionalTransmit";
> +  const char data1[] = "TestTwoActiveBidirectionalTransmit";
> +  const char data2[] = "ReplyToTheFirstSocket";
> +  const char data3[] = "TestMessageFromTheSecondSocket";
> +  const char data4[] = "ThisIsAReplyToTheSecondSocket";

Good idea.

::: media/mtransport/third_party/nICEr/src/ice/ice_component.c:334
(Diff revision 1)
> -        ABORT(r);
> +        ABORT(R_INTERNAL);

Is there a reason we're eating this error code?

::: media/mtransport/third_party/nICEr/src/net/nr_socket_multi_tcp.c:58
(Diff revision 1)
> +    nr_tcp_socket_ctx *sock;
> +
>      if (!objp || !*objp)
>        return(0);
>  
> -    nr_socket_destroy(&(*objp)->inner);
> -    RFREE(*objp);
> -    *objp=NULL;
> +    sock=*objp;
> +    *objp=0;
> +
> +    nr_socket_destroy(&sock->inner);
> +
> +    RFREE(sock);

I'm not sure why you changed this.

::: media/mtransport/third_party/nICEr/src/net/nr_socket_multi_tcp.c:166
(Diff revision 1)
> -    int r, _status;
> +    int r, _status = 0;

Also unsure why this change was necessary.

::: media/mtransport/third_party/nICEr/src/net/nr_socket_multi_tcp.c:457
(Diff revision 1)
> -    int r, _status;
> +    int r, _status = 0;

Pretty sure we don't need this change.

::: media/mtransport/third_party/nICEr/src/net/nr_socket_multi_tcp.c:518
(Diff revision 1)
> +
> +    // rearm
> +    NR_ASYNC_WAIT(s, NR_ASYNC_WAIT_READ, nr_tcp_socket_readable_cb, arg);
>  
>      if (sock->readable_cb)
>        sock->readable_cb(s, how, sock->readable_cb_arg);
> -
> -    NR_ASYNC_WAIT(s, NR_ASYNC_WAIT_READ, nr_tcp_socket_readable_cb, arg);

Why did this get switched around? Just consistency?

::: media/mtransport/third_party/nICEr/src/net/nr_socket_multi_tcp.c:554
(Diff revision 1)
> -    /* accept and add to socket_list */
> +    /* accept and attach to STS */

Watch out, this is a third-party library, technically speaking, and STS has no meaning here.

::: media/mtransport/third_party/nICEr/src/stun/nr_socket_buffered_stun.c:285
(Diff revision 1)
> -      int tmp_length;
> +    int tmp_length;

This is only used in the else clause.
Attachment #8621463 - Flags: review?(docfaraday)
https://reviewboard.mozilla.org/r/10981/#review9663

> Is there a reason we're eating this error code?

I believe I read it as this could abort with r=0, but I no longer see how that would be possible.

> I'm not sure why you changed this.

Mostly consistency with other destructors (I assume the pattern here is to first kill the reference before calling another destroy function).

> Also unsure why this change was necessary.

I got compiler warning about _status not being set for the r_log message in line 193.

> Pretty sure we don't need this change.

Fixed compiler warning.

> Why did this get switched around? Just consistency?

Yep consistency with the other existing read callbacks.

> Any particular reason you moved this? It seems that we don't do any teardown on |sockp| if we encounter an error later.

Either something broken is attached to STS, or we return a sockp when returning an error. Both alternatives are not appealing.

In NrSocket::create we similarly attach to STS as the very last step https://dxr.mozilla.org/mozilla-central/source/media/mtransport/nr_socket_prsock.cpp#574
and if anything aborts you are left with a half filled nr_transport_addr.

Also nr_socket_create_int() should only fail with OOM or vtable version mismatch. I'll put a comment about that in the code.
Comment on attachment 8621463 [details]
MozReview Request: Bug 891551 - Part 11: fix a couple of issues from part 5. r=bwc

Bug 891551 - Part 11: fix a couple of issues from part 5
Attachment #8621463 - Flags: review?(docfaraday)
https://reviewboard.mozilla.org/r/10981/#review9675

> Fixed compiler warning.

Similar bad r_log on line 481 is causing this one. Log r instead, and we should be good.

> I got compiler warning about _status not being set for the r_log message in line 193.

Whoa, hold on. That r_log before the abort: label should not be logging status_, it should be logging r since that's where the error is set.
Comment on attachment 8621463 [details]
MozReview Request: Bug 891551 - Part 11: fix a couple of issues from part 5. r=bwc

https://reviewboard.mozilla.org/r/10981/#review9677

::: media/mtransport/nr_socket_prsock.cpp:884
(Diff revisions 1 - 2)
> +  // This should fail with OOM

Should fail only with OOM
Attachment #8621463 - Flags: review?(docfaraday) → review+
Comment on attachment 8621463 [details]
MozReview Request: Bug 891551 - Part 11: fix a couple of issues from part 5. r=bwc

Bug 891551 - Part 11: fix a couple of issues from part 5. r=bwc
Attachment #8621463 - Attachment description: MozReview Request: Bug 891551 - Part 11: fix a couple of issues from part 5 → MozReview Request: Bug 891551 - Part 11: fix a couple of issues from part 5. r=bwc
Attachment #8621463 - Flags: review+
Attachment #8615499 - Flags: review?(martin.thomson)
Comment on attachment 8615499 [details]
MozReview Request: Bug 891551 - Part 10: added user pref to turn of ICE TCP. r=mt

Bug 891551 - Part 10: added user pref to turn of ICE TCP
Comment on attachment 8621463 [details]
MozReview Request: Bug 891551 - Part 11: fix a couple of issues from part 5. r=bwc

Bug 891551 - Part 11: fix a couple of issues from part 5. r=bwc
https://reviewboard.mozilla.org/r/10271/#review9695

> Yes you are right.
> The point here is to verify that the tcp disabled option suppresses TCP gathering, but that makes it obviously more a gathering test. I'll move the test.

Removed the connection test and added a gathering test instead.
Attachment #8615499 - Flags: review?(martin.thomson) → review+
Comment on attachment 8615499 [details]
MozReview Request: Bug 891551 - Part 10: added user pref to turn of ICE TCP. r=mt

https://reviewboard.mozilla.org/r/10271/#review9697

::: media/mtransport/nricectx.cpp:470
(Diff revision 3)
> +    NR_reg_set_char((char *)NR_ICE_REG_ICE_TCP_DISABLE, !tcp_enabled);

Does this configuration item have to be a negative like this?  Or do you need to have a particular default?

::: media/mtransport/test/ice_unittest.cpp:65
(Diff revision 3)
> -    (char *)"stun.stunprotocol.org");
> +    (char *)"global.stun.twilio.com");

Um...

::: media/mtransport/test/ice_unittest.cpp:1203
(Diff revision 3)
> -      p1_ = new IceTestPeer("P1", true, set_priorities, allow_loopback);
> -      p2_ = new IceTestPeer("P2", false, set_priorities, allow_loopback);
> +      p1_ = new IceTestPeer("P1", true, set_priorities, allow_loopback,
> +        enable_tcp);
> +      p2_ = new IceTestPeer("P2", false, set_priorities, allow_loopback,
> +        enable_tcp);

Fix indent.

::: media/mtransport/test/ice_unittest.cpp:252
(Diff revision 3)
> -      ice_ctx_(NrIceCtx::Create(name, offerer, set_priorities, allow_loopback)),
> +      ice_ctx_(NrIceCtx::Create(name, offerer, set_priorities, allow_loopback,
> +        enable_tcp)),

Fix indent.

::: media/mtransport/third_party/nICEr/src/ice/ice_component.c:423
(Diff revision 3)
> +      if ((r=NR_reg_get_char(NR_ICE_REG_ICE_TCP_DISABLE, &ice_tcp_disabled))) {
> +        if (r != R_NOT_FOUND)
> +          ABORT(r);
> +      }

Move this out of the loop please.
Comment on attachment 8615494 [details]
MozReview Request: Bug 891551 - Part 5: Add support for TCP ICE candidates. r=bwc,jesup

Bug 891551 - Part 5: Add support for TCP ICE candidates. r=bwc
Comment on attachment 8615495 [details]
MozReview Request: Bug 891551 - Part 6: Add log dumper to multi_tcp_socket_unittest. r=bwc

Bug 891551 - Part 6: Add log dumper to multi_tcp_socket_unittest. r=bwc
Comment on attachment 8615496 [details]
MozReview Request: Bug 891551 - Part 7: Fix some intermittent failures in multi_tcp_socket_unittest. r=ekr

Bug 891551 - Part 7: Fix some intermittent failures in multi_tcp_socket_unittest. r=ekr
Comment on attachment 8615497 [details]
MozReview Request: Bug 891551 - Part 8: Increase and make backlog value configurable. r=bwc

Bug 891551 - Part 8: Increase and make backlog value configurable. r=bwc
Comment on attachment 8615498 [details]
MozReview Request: Bug 891551 - Part 9: ignore initial socket connect failures during gathering. r=jesup

Bug 891551 - Part 9: ignore initial socket connect failures during gathering
Comment on attachment 8615499 [details]
MozReview Request: Bug 891551 - Part 10: added user pref to turn of ICE TCP. r=mt

Bug 891551 - Part 10: added user pref to turn of ICE TCP
Attachment #8615499 - Flags: review+
Comment on attachment 8621463 [details]
MozReview Request: Bug 891551 - Part 11: fix a couple of issues from part 5. r=bwc

Bug 891551 - Part 11: fix a couple of issues from part 5. r=bwc
Attachment #8615494 - Flags: review?(rjesup)
Comment on attachment 8615494 [details]
MozReview Request: Bug 891551 - Part 5: Add support for TCP ICE candidates. r=bwc,jesup

Bug 891551 - Part 5: Add support for TCP ICE candidates. r=bwc
Comment on attachment 8615495 [details]
MozReview Request: Bug 891551 - Part 6: Add log dumper to multi_tcp_socket_unittest. r=bwc

Bug 891551 - Part 6: Add log dumper to multi_tcp_socket_unittest. r=bwc
Comment on attachment 8615496 [details]
MozReview Request: Bug 891551 - Part 7: Fix some intermittent failures in multi_tcp_socket_unittest. r=ekr

Bug 891551 - Part 7: Fix some intermittent failures in multi_tcp_socket_unittest. r=ekr
Comment on attachment 8615497 [details]
MozReview Request: Bug 891551 - Part 8: Increase and make backlog value configurable. r=bwc

Bug 891551 - Part 8: Increase and make backlog value configurable. r=bwc
Attachment #8615498 - Flags: review?(rjesup)
Comment on attachment 8615498 [details]
MozReview Request: Bug 891551 - Part 9: ignore initial socket connect failures during gathering. r=jesup

Bug 891551 - Part 9: ignore initial socket connect failures during gathering
Comment on attachment 8615499 [details]
MozReview Request: Bug 891551 - Part 10: added user pref to turn of ICE TCP. r=mt

Bug 891551 - Part 10: added user pref to turn of ICE TCP
Attachment #8615499 - Flags: review?(rjesup)
Comment on attachment 8621463 [details]
MozReview Request: Bug 891551 - Part 11: fix a couple of issues from part 5. r=bwc

Bug 891551 - Part 11: fix a couple of issues from part 5. r=bwc
https://reviewboard.mozilla.org/r/10271/#review10029

> Um...

Agreed. We should replace our external dependencies at some point.
I put a comment with a reference to bug 860775 here.
Comment on attachment 8615499 [details]
MozReview Request: Bug 891551 - Part 10: added user pref to turn of ICE TCP. r=mt

Bug 891551 - Part 10: added user pref to turn of ICE TCP. r=mt
Attachment #8615499 - Attachment description: MozReview Request: Bug 891551 - Part 10: added user pref to turn of ICE TCP → MozReview Request: Bug 891551 - Part 10: added user pref to turn of ICE TCP. r=mt
Attachment #8615499 - Flags: review?(rjesup)
Comment on attachment 8621463 [details]
MozReview Request: Bug 891551 - Part 11: fix a couple of issues from part 5. r=bwc

Bug 891551 - Part 11: fix a couple of issues from part 5. r=bwc
To capture what has been agreed on in discussions outside of Bugzilla:
- Fuzzing efforts are started now and will continue independent of this bug
- The code has been reviewed several times by multiple people and therefore we think a special security review is not needed at this time.
Flags: sec-review?(cdiehl)
Target Milestone: mozilla34 → mozilla41
@Randel don't review the whole part 5. It has been reviewed by bwc and me. But the changes I had to make to unrot part 5 after the landing of the NAT simulator are that big that it would help if you could review the interdiff: https://reviewboard.mozilla.org/r/10261/diff/2-4/

Similarly part 9 has been reviewed by bwc already. So I'm mainly looking for someone to review the interdiff https://reviewboard.mozilla.org/r/10269/diff/2-4/ and give me a thumbs up for the fixes requested by bwc.
https://reviewboard.mozilla.org/r/10269/#review10059

::: media/mtransport/third_party/nICEr/src/net/nr_socket_multi_tcp.c:233
(Diff revisions 2 - 4)
> +              nr_tcp_socket_ctx_destroy(&tcp_socket_ctx);

Please move this to the abort clause in nr_socket_multi_tcp_create_stun_server_socket.

Otherwise that function leaks on failure.  I mean, technically, you could require callers to free the tcp_socket_ctx when they get an error, like this, but it's a crappy contract.
https://reviewboard.mozilla.org/r/10261/#review10061

::: media/mtransport/test/stunserver.cpp:194
(Diff revisions 2 - 4)
> +nsRefPtr<NrIceCtx> ice_ctx_ = NULL;

Please don't create globals.  They screw things up.  In this case, it looks like you are treating this as a member variable, so make it one.

::: media/mtransport/test/stunserver.cpp:517
(Diff revisions 2 - 4)
>       return R_ALREADY;

Note: this leaks.  See other comment.

::: media/mtransport/third_party/nICEr/src/ice/ice_component.c:343
(Diff revisions 2 - 4)
> -      r=nr_socket_multi_tcp_create(ctx->stun_servers,ctx->stun_server_ct,
> +      r=nr_socket_multi_tcp_create(ctx,&addr,tcp_type,so_sock_ct,1,NR_STUN_MAX_MESSAGE_SIZE,&nrsock);

Leaks.
Comment on attachment 8615494 [details]
MozReview Request: Bug 891551 - Part 5: Add support for TCP ICE candidates. r=bwc,jesup

Reviewboard!!!! (channeling Khan)

ok, marking review by hand for part 5 (2-4):

media/mtransport/nr_socket_prsock.cpp:1440
+ int nr_socket_local_create(void *obj, nr_transport_addr *addr, nr_socket **sockp) {

obj appears to be unused... is this correct? is it required by some aspect of the vtbl stuff?  Please comment what it is and why it's unused if this is intentional

media/mtransport/test/multi_tcp_socket_unittest.cpp:78
+      r = nr_transport_addr_get_addrstring(&stun_addr, &stun_host[0], 1000);

sizeof(stun_host) instead of 1000
Attachment #8615494 - Flags: review?(rjesup) → review+
Comment on attachment 8615498 [details]
MozReview Request: Bug 891551 - Part 9: ignore initial socket connect failures during gathering. r=jesup

nits:
media/mtransport/third_party/nICEr/src/net/nr_socket_multi_tcp.c:231
+            if (r != R_BAD_ARGS) {

nit: style for this file (unfortunately) seems to be no spaces around comparison operators

media/mtransport/third_party/nICEr/src/net/nr_socket_multi_tcp.c:245
+            if (r != R_BAD_ARGS) {
Ditto
Attachment #8615498 - Flags: review?(rjesup) → review+
https://reviewboard.mozilla.org/r/10269/#review10069

> Please move this to the abort clause in nr_socket_multi_tcp_create_stun_server_socket.
> 
> Otherwise that function leaks on failure.  I mean, technically, you could require callers to free the tcp_socket_ctx when they get an error, like this, but it's a crappy contract.

Apparently having bwc and you as reviewers results in lots of contradictions :-)

I'll move the error handling into nr_socket_multi_tcp_create_stun_server_socket as I'm not a fan of the error handling code outside.
(In reply to Randell Jesup [:jesup] from comment #244)
> Reviewboard!!!! (channeling Khan)

:-D
 
> media/mtransport/nr_socket_prsock.cpp:1440
> + int nr_socket_local_create(void *obj, nr_transport_addr *addr, nr_socket
> **sockp) {
> 
> obj appears to be unused... is this correct? is it required by some aspect
> of the vtbl stuff?  Please comment what it is and why it's unused if this is
> intentional

I only inherited the *obj stuff from bwc's NAT simulator landing. In this function it is unused, but the alternative implementation from the NAT simulator which gets called in the ice_unittest via the vtbl https://dxr.mozilla.org/mozilla-central/source/media/mtransport/test_nr_socket.cpp#99
actually uses it.
https://reviewboard.mozilla.org/r/10261/#review10071

> Please don't create globals.  They screw things up.  In this case, it looks like you are treating this as a member variable, so make it one.

Ups that was not intended. Thanks for catching it.

> Note: this leaks.  See other comment.

Are you concerned that nr_socket_multi_tcp_create() leaks when it fails?
I don't see that happening as nr_socket_multi_tcp_create calls nr_socket_multi_tcp_destroy in case of failure.
Attachment #8615494 - Attachment description: MozReview Request: Bug 891551 - Part 5: Add support for TCP ICE candidates. r=bwc → MozReview Request: Bug 891551 - Part 5: Add support for TCP ICE candidates. r=bwc,jesup
Attachment #8615494 - Flags: review+ → review?(rjesup)
Comment on attachment 8615494 [details]
MozReview Request: Bug 891551 - Part 5: Add support for TCP ICE candidates. r=bwc,jesup

Bug 891551 - Part 5: Add support for TCP ICE candidates. r=bwc,jesup
Comment on attachment 8615495 [details]
MozReview Request: Bug 891551 - Part 6: Add log dumper to multi_tcp_socket_unittest. r=bwc

Bug 891551 - Part 6: Add log dumper to multi_tcp_socket_unittest. r=bwc
Comment on attachment 8615496 [details]
MozReview Request: Bug 891551 - Part 7: Fix some intermittent failures in multi_tcp_socket_unittest. r=ekr

Bug 891551 - Part 7: Fix some intermittent failures in multi_tcp_socket_unittest. r=ekr
Comment on attachment 8615497 [details]
MozReview Request: Bug 891551 - Part 8: Increase and make backlog value configurable. r=bwc

Bug 891551 - Part 8: Increase and make backlog value configurable. r=bwc
Attachment #8615498 - Attachment description: MozReview Request: Bug 891551 - Part 9: ignore initial socket connect failures during gathering → MozReview Request: Bug 891551 - Part 9: ignore initial socket connect failures during gathering. r=jesup
Attachment #8615498 - Flags: review+ → review?(rjesup)
Comment on attachment 8615498 [details]
MozReview Request: Bug 891551 - Part 9: ignore initial socket connect failures during gathering. r=jesup

Bug 891551 - Part 9: ignore initial socket connect failures during gathering. r=jesup
Comment on attachment 8615499 [details]
MozReview Request: Bug 891551 - Part 10: added user pref to turn of ICE TCP. r=mt

Bug 891551 - Part 10: added user pref to turn of ICE TCP. r=mt
Comment on attachment 8621463 [details]
MozReview Request: Bug 891551 - Part 11: fix a couple of issues from part 5. r=bwc

Bug 891551 - Part 11: fix a couple of issues from part 5. r=bwc
https://reviewboard.mozilla.org/r/10261/#review10163

> Leaks.

Just for record: this was leaking in the code which was up for review. But as I changed the behavior of nr_socket_multi_tcp_create_stun_server_socket() to clean up in case of connection failure this should not leak any more in revision 5.
Blocks: 1176377
Blocks: 1176382
Blocks: 1176393
Blocks: 1176407
Thanks for merging, Ryan!

Sorry, forgot to remove the "leave-open" keyword.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago9 years ago
Keywords: leave-open
Resolution: --- → FIXED
Attachment #8615498 - Flags: review?(rjesup) → review+
Comment on attachment 8615498 [details]
MozReview Request: Bug 891551 - Part 9: ignore initial socket connect failures during gathering. r=jesup

https://reviewboard.mozilla.org/r/10269/#review10259

::: media/mtransport/third_party/nICEr/src/net/nr_socket_multi_tcp.c:164
(Diff revisions 4 - 5)
> -    nr_tcp_socket_ctx *tcp_socket_ctx;
> +    nr_tcp_socket_ctx *tcp_socket_ctx=0;

NULL please
(In reply to Randell Jesup [:jesup] from comment #260)
> Comment on attachment 8615498 [details]
> MozReview Request: Bug 891551 - Part 9: ignore initial socket connect
> failures during gathering. r=jesup
> 
> https://reviewboard.mozilla.org/r/10269/#review10259
> 
> ::: media/mtransport/third_party/nICEr/src/net/nr_socket_multi_tcp.c:164
> (Diff revisions 4 - 5)
> > -    nr_tcp_socket_ctx *tcp_socket_ctx;
> > +    nr_tcp_socket_ctx *tcp_socket_ctx=0;
> 
> NULL please

Negative. Convention in this code is 0.
Comment on attachment 8615494 [details]
MozReview Request: Bug 891551 - Part 5: Add support for TCP ICE candidates. r=bwc,jesup

Mozreview wouldn't Ship It....
Attachment #8615494 - Flags: review?(rjesup) → review+
Implementation of this needs to be documented.
Keywords: dev-doc-needed
Blocks: 1050930
I'm told this is preffed on in Firefox 54; I have noted it on Firefox 54 for developers (https://developer.mozilla.org/en-US/Firefox/Releases/54#AudioVideo).

I also added to Experimental features in Firefox (https://developer.mozilla.org/en-US/Firefox/Experimental_features) that this was first implemented in Firefox 41, preffed off.

When I get a link to the blog post I'm told :drno is writing, I'll link to it (unless someone else does first).

That aside, this is now doc complete.
You need to log in before you can comment on or make changes to this bug.