Closed Bug 837919 Opened 11 years ago Closed 11 years ago

WebRTC: RTCPeerConnection constructor: FQDN not yet implemented (only IP-#s).

Categories

(Core :: WebRTC, defect, P3)

21 Branch
x86
macOS
defect

Tracking

()

VERIFIED FIXED
mozilla22

People

(Reporter: tom, Assigned: jib)

References

Details

(Whiteboard: [WebRTC],[blocking-webrtc+])

Attachments

(2 files, 14 obsolete files)

29.41 KB, patch
jib
: review+
Details | Diff | Splinter Review
2.08 KB, text/html
Details
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/537.22 (KHTML, like Gecko) Chrome/25.0.1364.58 Safari/537.22

Steps to reproduce:

Create a new RTCPeerConnection with a FQDN such as stun.l.google.com:

new mozRTCPeerConnection({
    'iceServers': [
      {'url': 'stun:stun.l.google.com:19302' }
    ]
  });


Actual results:

Got the following error:

RTCPeerConnection constructor: FQDN not yet implemented (only IP-#s).


Expected results:

FQDN should be allowed.
Looks like a known bug: http://www.webrtc.org/interop, but I couldn't find a bug tracking it.
Component: Untriaged → WebRTC
Product: Firefox → Core
QA Contact: jsmith
This is likely a dup, but if it isn't it can be the bug for fixing it.
Assignee: nobody → jib
Status: UNCONFIRMED → NEW
Ever confirmed: true
Priority: -- → P3
Whiteboard: [WebRTC],[blocking-webrtc+]
Depends on: 843644
Working patch until ekr updates Bug 843644 with the changes we discussed today. Looking for early feedback.
Attachment #719259 - Flags: feedback?(ekr)
Attachment #719259 - Flags: feedback?(adam)
Something's not shutting down correctly in the browser with the patch I uploaded, though the unit-tests complete fine. I can't figure it out.
Thanks yeah that fixed it.
Attachment #719259 - Attachment is obsolete: true
Attachment #719259 - Flags: feedback?(ekr)
Attachment #719259 - Flags: feedback?(adam)
Comment on attachment 719259 [details] [diff] [review]
RTCPeerConnection constructor supports FQDN STUN servers.

Review of attachment 719259 [details] [diff] [review]:
-----------------------------------------------------------------

::: media/mtransport/nriceresolver.cpp
@@ +143,5 @@
> +abort:
> +  return _status;
> +}
> +
> +static nsIThread *get_current_thread() {

We discussed this on IRC, but just to get the notes in the bug log -- perhaps we should add a comment along the lines of "NS_GetCurrentThread() is only defined for MOZILLA_INTERNAL_API builds"

@@ +179,5 @@
> +    prAddr->ipv6.flowinfo = addr->inet6.flowinfo;
> +    memcpy(&prAddr->ipv6.ip, &addr->inet6.ip, sizeof(addr->inet6.ip.u8));
> +    prAddr->ipv6.scope_id = addr->inet6.scope_id;
> +  }
> +#if defined(XP_UNIX) || defined(XP_OS2)

AF_LOCAL can't happen in the unit tests. I'd remove this section altogether.

::: media/mtransport/test/ice_unittest.cpp
@@ +33,5 @@
>  #define GTEST_HAS_RTTI 0
>  #include "gtest/gtest.h"
>  #include "gtest_utils.h"
>  
>  using namespace mozilla;

You'll need to replace the DNS name in kDefaultStunServerHostname to be a valid entry, pointing to 23.21.150.121. Ideally, we could talk the infrastructure guys into adding something like "stun.mozilla.org" to the DNS zone.

In the meanwhile, you can temporarily use the address "stun.roach.at," which I've set up to point to that IP address.

::: media/webrtc/signaling/src/peerconnection/PeerConnectionMedia.h
@@ +36,5 @@
>  }
>  #endif
>  
>  #include "nricectx.h"
> +#include "nriceresolver.h"

Since we're just using a pointer in this file, you can trim dependencies down (i.e., speed build times) by declaring "class NrIceResolver" inside the Mozilla namespace here. See DataChannel, above.
Attachment #719259 - Attachment is obsolete: false
Attachment #719259 - Attachment is obsolete: true
Attachment #719683 - Attachment is patch: true
Comment on attachment 719683 [details] [diff] [review]
RTCPeerConnection constructor supports FQDN STUN servers.

Review of attachment 719683 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good to me, modulo the fact that I *think* we're leaking PendingResolution objects.

::: media/mtransport/nriceresolver.cpp
@@ +135,5 @@
> +#endif
> +}
> +
> +nsresult NrIceResolver::PendingResolution::sendOffTo(nsIDNSService *dns) {
> +  return dns->AsyncResolve(nsAutoCString (hostname_.c_str()), 0, this,

Until the nICEr library supports IPv6, I think you'll want to pass in the "RESOLVE_DISABLE_IPV6" flag here.

@@ +141,5 @@
> +}
> +
> +using namespace mozilla::net;
> +
> +static void myNetAddrToPRNetAddr(const NetAddr *addr, PRNetAddr *prAddr) {

I think I would eliminate this function in favor of a new function (added to nr_socket_prsocket.cpp) that goes directly from a NetAddr to a tranport addr. E.g. http://www.pastebin.mozilla.org/2184695

@@ +189,5 @@
> +      }
> +    }
> +  }
> +  cb_(cb_arg_, nullptr);
> +  return NS_OK;

Don't we need to deallocate the PendingResolution objects here? If not, where do they get cleaned up?

::: media/mtransport/test/ice_unittest.cpp
@@ +406,2 @@
>  
>  TEST_F(IceGatherTest, TestGatherStunServerIpAddress) {

I would change this test's name to contain "Fake" -- and do the same for the next two tests as well.
Attachment #719683 - Flags: feedback+
> Don't we need to deallocate the PendingResolution objects here? If not, where
> do they get cleaned up?

Being derived from nsIDNSListener, they get cleaned up once AsyncResolve is done with them, which I would guess is after the callback completes. They get cleaned up on the main thread.

It's probably cleaner to (also) hold onto them ourselves in NrIceResolver until nr_resolver_destroy() is called, to have explicit guarantees.
> It's probably cleaner to (also) hold onto them ourselves in NrIceResolver until
> nr_resolver_destroy() is called, to have explicit guarantees.

Strike that. Destroy is for cleaning up the resolver, not the pending objects (of which there will be several), I got confused there for a second. Pending objects should be cleaned up in the callback and close. Close cancels the async-request, so I think we're fine. Pretty nice.
> Since we're just using a pointer in this file, you can trim dependencies down
> (i.e., speed build times) by declaring "class NrIceResolver" inside the Mozilla
> namespace here. See DataChannel, above.

mDNSResolver is embedded, not a pointer.
Updated to work with ekr's latest patch.
- Incorporated feedback.
- Solved proper PendingResolution destruction using a handstand.
Attachment #719683 - Attachment is obsolete: true
Attachment #719859 - Flags: review?(adam)
Comment on attachment 719859 [details] [diff] [review]
RTCPeerConnection constructor supports FQDN STUN servers.

Review of attachment 719859 [details] [diff] [review]:
-----------------------------------------------------------------

::: media/mtransport/nriceresolver.cpp
@@ +139,5 @@
> +}
> +
> +nsresult NrIceResolver::PendingResolution::sendOffTo(nsIDNSService *dns) {
> +  // Add a reference to ourselves (don't rely on AsyncResolve alone) for cancel.
> +  handstand_ = this;

Why not just do AddRef() and then do Release() below.
Why use functions instead of an object? Why use cash when we could all be counting how much we owe each other? I find it harder to deduce/lock down what's going on in code with manual AddRef()/Release() in them, because of their weak guarantees (what's the scope of the extra handle? When is it traded? How many handles can be outstanding?)

Why not use the same stricter pattern as elsewhere which doesn't require implementation knowledge? Easier to see that the extra handle is private, never shared, and only one can ever be outstanding (and you can look at it in the debugger).
(In reply to Jan-Ivar Bruaroey [:jib] from comment #16)
> Why use functions instead of an object? Why use cash when we could all be
> counting how much we owe each other? I find it harder to deduce/lock down
> what's going on in code with manual AddRef()/Release() in them, because of
> their weak guarantees (what's the scope of the extra handle? When is it
> traded? How many handles can be outstanding?)
>
> Why not use the same stricter pattern as elsewhere which doesn't require
> implementation knowledge? Easier to see that the extra handle is private,
> never shared, and only one can ever be outstanding (and you can look at it
> in the debugger).

It's your code, but I don't think what you are saying here is actually correct.

1. This is an nsRefPtr, so you are actually sharing ownership.
2. AddRef and Release make absolutely clear what is going on. I.e., you are temporarily creating a new reference.
3. This depends *deeply* on the implementation, namely that there is no cycle garbage collection.
Fixes try failure and incorporates feedback: 
- Removed unused variable in opt build.
- Avoids releasing twice in a single-thread-race where cancel is called with an
  OnLookupComplete already posted on the event queue (not actually observed,
  but seems possible).
- Uses manual ref-counting with c API to avoid cyclic dependency.
Attachment #719859 - Attachment is obsolete: true
Attachment #719859 - Flags: review?(adam)
Attachment #720282 - Flags: review?(ekr)
Attachment #720282 - Flags: review?(adam)
Something going wrong on windows. Anyone know what's different there?

nriceresolver.cpp

e:\builds\moz2_slave\try-w32-0000000000000000000000\build\media\mtransport\third_party\nrappkit\src\util\libekr\r_types.h(184) : error C2371: 'INT8' : redefinition; different basic types

        c:\Program Files (x86)\Windows Kits\8.0\include\shared\basetsd.h(70) : see declaration of 'INT8'

e:\builds\moz2_slave\try-w32-0000000000000000000000\build\media\mtransport\third_party\nrappkit\src\util\libekr\r_types.h(204) : error C2371: 'UINT8' : redefinition; different basic types

        c:\Program Files (x86)\Windows Kits\8.0\include\shared\basetsd.h(74) : see declaration of 'UINT8'
Compiles on Windows.
Attachment #720282 - Attachment is obsolete: true
Attachment #720282 - Flags: review?(ekr)
Attachment #720282 - Flags: review?(adam)
Attachment #720763 - Flags: review?(ekr)
Updated.
- Compiles on fedora.
- Still has crashtest fail. Any help appreciated.
  nr_ice_ctx_destroy_cb (s=0x0, how=0, cb_arg=0x11e37c86c) at ice_ctx.c:382
  http://www.pastebin.mozilla.org/2195218
Attachment #720763 - Attachment is obsolete: true
Attachment #720763 - Flags: review?(ekr)
Comment on attachment 721006 [details] [diff] [review]
RTCPeerConnection constructor supports FQDN STUN servers.

Now that I think about it, running crashtests on Bug 843644 alone wont exercise   any new code, since except for the unit-tests, things don't get hooked up until this patch here.

So the crashing problem may still lie in the patch in Bug 843644. Submitting for review. Everything else should be gtg, so feel free to checkin for me if you find it.
Attachment #721006 - Flags: review?(ekr)
Comment on attachment 721006 [details] [diff] [review]
RTCPeerConnection constructor supports FQDN STUN servers.

Review of attachment 721006 [details] [diff] [review]:
-----------------------------------------------------------------

::: media/mtransport/nriceresolver.cpp
@@ +73,5 @@
> +  vtbl_->cancel = &NrIceResolver::cancel;
> +}
> +
> +NrIceResolver::~NrIceResolver() {
> +  delete vtbl_;

Here is why you are getting crashes. nricectx needs to hold a strong reference to the
nr_resolver * because the ice_ctx is freed asynchronously with respect to the
PCMedia (and indeed with respect to nricectx). What's happening is that you
are freeing the memory associated with the vtbl which causes the subsquent
nr_resolver_destroy() to do a FMR.

I suggest that what you do is have AllocateResolver do AddRef() and
::destroy() do Release(). That should keep this alive properly.

You'll note that the ASSERT() in ~NrIceResolverFake mostly endorses
the lifetime invariant, though perhaps not as strongly as one might like.
Debugger trace:
Breakpoint 1, mozilla::NrIceResolver::~NrIceResolver (this=0x10e90dc88) at nriceresolver.cpp:77
77	  delete vtbl_;
$7 = ('mozilla::NrIceResolver' *) 0x10e90dc88
352399360[11bd723b0]: cpr SIPCC-CC_API: 5/5, cc_int_onhook: UI -> GSM: ONHOOK              
353275904[10c5b4840]: cpr SIPCC-DCSM: dcsm_process_event: DCSM 11  :(DCSM_READY:ONHOOK )
353275904[10c5b4840]: cpr SIPCC-GSM: 5/5, sm_process_event: DEF   :(IDLE:ONHOOK )

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: 13 at address: 0x0000000000000000
[Switching to process 76181 thread 0x3507]
0x0000000104efb1cc in nr_resolver_destroy (resolverp=0x10e90eb54) at nr_resolver.c:66
66	  resolver->vtbl->destroy(&resolver->obj);
(gdb) p resolver->obj
$8 = (void *) 0x10e90dc88
Current language:  auto; currently minimal
(gdb) kill
Kill the program being debugged? (y or n) y
error while killing target (killing anyway): assertion failure on line 219 of "/SourceCache/gdb/gdb-1820/src/gdb/macosx/macosx-nat-inferior-util.c" in function "kern_return_t macosx_inferior_suspend_mach(macosx_inferior_status *)": macosx_task_valid (s->task)

warning: error on line 2184 of "/SourceCache/gdb/gdb-1820/src/gdb/macosx/macosx-nat-inferior.c" in function "void macosx_kill_inferior_safe()": (os/kern) invalid argument (0x4x)
(gdb)
Comment on attachment 721006 [details] [diff] [review]
RTCPeerConnection constructor supports FQDN STUN servers.

Review of attachment 721006 [details] [diff] [review]:
-----------------------------------------------------------------

OK, unforrtunately if you want this AddRef thing to work, you will need to hold a smart pointer, not
just have the object in PCMedia. that's not going to work at all.
Incorporates feedback.
- Properly refcounted. Fixes crashtest.
Attachment #721006 - Attachment is obsolete: true
Attachment #721006 - Flags: review?(ekr)
Attachment #721055 - Flags: review?(ekr)
Attachment #721055 - Flags: checkin?(ekr)
Wait, signaling_unittests assertion. Sorry.
###!!! ASSERTION: Using observer service off the main thread!: 'Error', file /Users/Jan/moz/mozilla-central/xpcom/ds/nsObserverService.cpp, line 103
Attachment #721055 - Flags: review?(ekr)
Attachment #721055 - Flags: checkin?(ekr)
Updated. Verified all tests pass locally.
- Unbroke signaling_unittests.
- Slightly less overhead added to ice_unittest.

https://tbpl.mozilla.org/?tree=Try&rev=e644162aa856
Attachment #721055 - Attachment is obsolete: true
Attachment #721088 - Flags: review?(ekr)
Comment on attachment 721088 [details] [diff] [review]
RTCPeerConnection constructor supports FQDN STUN servers.

Review of attachment 721088 [details] [diff] [review]:
-----------------------------------------------------------------

This looks basically sound, but I'm r-ing b/c there are a lot of minor comments. and I'd like to
see it again.

Two systematic comments:
1. Please add thread asserts throughout to ensure that all use of this is on the STS
thread.

2. Please add logging for each resolution and failed resolution. Use MOZ_MTLOG.

::: media/mtransport/nr_socket_prsock.cpp
@@ +251,5 @@
> +      case AF_INET:
> +        ip4.sin_family = PF_INET;
> +        ip4.sin_addr.s_addr = netaddr->inet.ip;
> +        ip4.sin_port = netaddr->inet.port;
> +        if ((r = nr_sockaddr_to_transport_addr((sockaddr *)&ip4,

Suggest using nr_ip4_port_to_transport_addr.

@@ +254,5 @@
> +        ip4.sin_port = netaddr->inet.port;
> +        if ((r = nr_sockaddr_to_transport_addr((sockaddr *)&ip4,
> +                                              sizeof(ip4),
> +                                              IPPROTO_UDP, 1,
> +              addr)))

Indent.

::: media/mtransport/nriceresolver.cpp
@@ +52,5 @@
> +#include "nr_resolver.h"
> +#include "transport_addr.h"
> +}
> +
> +#include "mozilla/net/DNS.h"

Mozilla includes before nICEr includes please.

@@ +66,5 @@
> +
> +namespace mozilla {
> +
> +NrIceResolver::NrIceResolver() :
> +    vtbl_(new nr_resolver_vtbl),

new nr_resolver_vtbl()

@@ +67,5 @@
> +namespace mozilla {
> +
> +NrIceResolver::NrIceResolver() :
> +    vtbl_(new nr_resolver_vtbl),
> +    allocated_resolvers_(0) {

Is there a reason to allow more than one instance of a resolver (the way that nriceresolverfake) does? Are you just
futureproofing?

@@ +79,5 @@
> +  delete vtbl_;
> +}
> +
> +nsresult NrIceResolver::Init() {
> +  nsresult rv;

Whitespace after declarations.

@@ +80,5 @@
> +}
> +
> +nsresult NrIceResolver::Init() {
> +  nsresult rv;
> +  dns_ = do_GetService(NS_DNSSERVICE_CONTRACTID, &rv);

Log failure to acquire the service.

@@ +85,5 @@
> +  return rv;
> +}
> +
> +nr_resolver *NrIceResolver::AllocateResolver() {
> +  nr_resolver *resolver;

Whitesapace after declarations.

@@ +88,5 @@
> +nr_resolver *NrIceResolver::AllocateResolver() {
> +  nr_resolver *resolver;
> +  int r = nr_resolver_create_int((void *)this, vtbl_, &resolver);
> +  MOZ_ASSERT(!r);
> +  if(r)

Log failure.

@@ +90,5 @@
> +  int r = nr_resolver_create_int((void *)this, vtbl_, &resolver);
> +  MOZ_ASSERT(!r);
> +  if(r)
> +    return nullptr;
> +  AddRef();

Please comment why this is happening.

@@ +96,5 @@
> +  return resolver;
> +}
> +
> +void NrIceResolver::DestroyResolver() {
> +  --allocated_resolvers_;

And comment this too.

@@ +118,5 @@
> +  nsCOMPtr<nsIThread> thread;
> +  nsresult rv = NS_GetCurrentThread(getter_AddRefs(thread));
> +  NS_ENSURE_SUCCESS(rv, nullptr);
> +  return thread.get();
> +#endif

Can we do this without a conditional? I.e., just use the second version all the time.

@@ +122,5 @@
> +#endif
> +}
> +
> +int NrIceResolver::resolve(void *obj,
> +                           nr_resolver_resource *url,

Call this something besides url. "resource" perhaps?

@@ +136,5 @@
> +                                                        cb, cb_arg));
> +  if (NS_FAILED(that->dns_->AsyncResolve(nsAutoCString (url->domain_name),
> +                                         nsIDNSService::RESOLVE_DISABLE_IPV6,
> +                                         pr, my_NS_GetCurrentThread(),
> +                                         getter_AddRefs(pr->request_)))) {

Log here. Also how can this happen? Is it an internal error? Or a cached response?

Please add a unit test that verifies that nICEr behaves properly in response to this.
Suggest extending NrIceResolverFake

@@ +141,5 @@
> +    ABORT(R_NOT_FOUND);
> +  }
> +  // We return a void * reference to PendingResolution to call cancel on.
> +  *handle = pr.get();
> +  pr.forget(); // AddRef

This doesn't AddRef, it just detaches the smart pointer.

Also, you can just do *handle = pr.forget().get();

@@ +142,5 @@
> +  }
> +  // We return a void * reference to PendingResolution to call cancel on.
> +  *handle = pr.get();
> +  pr.forget(); // AddRef
> +  _status=0;

Whitespace before _status to separate block.

@@ +148,5 @@
> +  return _status;
> +}
> +
> +nsresult
> +NrIceResolver::PendingResolution::OnLookupComplete(nsICancelable *request,

return type and function name on the same line. If you need to, put break after (

@@ +168,5 @@
> +  }
> +  return NS_OK;
> +}
> +
> +int NrIceResolver::cancel(void *, void *handle) {

Check arguments. Please name first argument something and mark it as unused.

@@ +173,5 @@
> +  return static_cast<PendingResolution *>(handle)->cancel();
> +}
> +
> +int NrIceResolver::PendingResolution::cancel() {
> +  request_->Cancel (NS_ERROR_ABORT);

Please add some commentary about how cancellation works. Having both a cancel function and a canceled member seems
confusing. Are we sure that this is thread-safe?

::: media/mtransport/nriceresolver.h
@@ +62,5 @@
> +class NrIceResolver
> +{
> + public:
> +  NrIceResolver();
> +  ~NrIceResolver();

Blank line to separate dtor from other methods.

@@ +66,5 @@
> +  ~NrIceResolver();
> +  nsresult Init();
> +  nr_resolver *AllocateResolver();
> +  void DestroyResolver();
> +  NS_INLINE_DECL_THREADSAFE_REFCOUNTING(NrIceResolver)

Blank line here please.

@@ +79,5 @@
> +
> +  class PendingResolution : public nsIDNSListener
> +  {
> +   public:
> +    NS_DECL_ISUPPORTS

Please put this last in the public block.

@@ +81,5 @@
> +  {
> +   public:
> +    NS_DECL_ISUPPORTS
> +
> +    PendingResolution(uint16_t port,

Aren't we going to regret not having transport (TCP/UDP) here?

@@ +91,5 @@
> +    virtual ~PendingResolution(){};
> +    NS_IMETHOD OnLookupComplete(nsICancelable *request, nsIDNSRecord *record,
> +                                nsresult status);
> +    int cancel();
> +    nsCOMPtr<nsICancelable> request_;

Whitespace here please.

@@ +101,5 @@
> +  };
> +
> +  nr_resolver_vtbl* vtbl_;
> +  nsCOMPtr<nsIDNSService> dns_;
> +  int allocated_resolvers_;

Unused, right?

::: media/mtransport/test/ice_unittest.cpp
@@ +65,3 @@
>        remote_(nullptr) {
>      ice_ctx_->SignalGatheringCompleted.connect(this,
> +                                               &IceTestPeer::GatheringComplete);

Whitespace-only change?

@@ +100,5 @@
>      ASSERT_TRUE(NS_SUCCEEDED(ice_ctx_->SetStunServers(stun_servers)));
>    }
>  
> +  void SetFakeResolver(const std::string hostname = kDefaultStunServerHostname,
> +                       const std::string address = kDefaultStunServerAddress) {

Prefer not to use default arguments. Propose just using the explicit arguments or having a "default" variant.

@@ +430,5 @@
> +
> +TEST_F(IceGatherTest, TestGatherDNSStunBogusHostname) {
> +  peer_->SetStunServer(kBogusStunServerHostname, kDefaultStunServerPort);
> +  peer_->SetDNSResolver();
> +  Gather();

These tests need to verify that the actual results make sense. Please file a bug.

Blocked on: 844994

::: media/webrtc/signaling/src/peerconnection/PeerConnectionMedia.cpp
@@ +85,5 @@
>    }
> +  rv = mDNSResolver->Init();
> +  NS_ENSURE_SUCCESS(rv, rv);
> +  rv = mIceCtx->SetResolver(mDNSResolver->AllocateResolver());
> +  if (NS_FAILED(rv)) {

Idiom here seems to be:
if (NS_FAILED(..)) rather than NS_ENSURE...

::: media/webrtc/signaling/src/peerconnection/PeerConnectionMedia.h
@@ +240,5 @@
>    PeerConnectionMedia(PeerConnectionImpl *parent)
>        : mParent(parent),
>        mLocalSourceStreamsLock("PeerConnectionMedia.mLocalSourceStreamsLock"),
> +      mIceCtx(NULL),
> +      mDNSResolver(new mozilla::NrIceResolver()) {}

Please restore whitespace here.

::: media/webrtc/signaling/test/signaling_unittests.cpp
@@ +1019,5 @@
> +    nsresult rv;
> +    ioservice_ = do_GetIOService(&rv);
> +    ASSERT_TRUE(NS_SUCCEEDED(rv));
> +    dns_ = do_GetService(NS_DNSSERVICE_CONTRACTID, &rv);
> +    ASSERT_TRUE(NS_SUCCEEDED(rv));

Why isn't this stuff needed in ice_unittest.cpp?

Suspect this is bit rot....
Attachment #721088 - Flags: review?(ekr) → review-
Updated with feedback. I'm still working on the unit-test. Will add that later.

> Is there a reason to allow more than one instance of a resolver (the way that
> nriceresolverfake) does? Are you just futureproofing?

Use-proofing. I think AllocateResolver() is a better name than GetNrResolver() now that it AddRef()s. Prevents people addref'ing by accident (say like in a macro argument). And it is cheap.

> Can we do this without a conditional? I.e., just use the second version all the
> time.

NS_GetCurrentThread() is faster. Seems unfair to hurt the browser case because of the unit-tests, when it is the unit-tests that are problematic in this regard.

> +  if (NS_FAILED(that->dns_->AsyncResolve(nsAutoCString (url->domain_name),
> +                                         nsIDNSService::RESOLVE_DISABLE_IPV6,
> +                                         pr, my_NS_GetCurrentThread(),
> +                                         getter_AddRefs(pr->request_)))) {
>
> Log here. Also how can this happen? Is it an internal error? Or a cached response?

I have observed that cached responses call the callback and do not fail. Other than that I don't think we can know or assume why it would fail (now or in the future), hence the success of the "exceptions for failure paths" pattern. For our purposes, I suggest logging that it happened and fail this call only, which I've now done.

> Please add some commentary about how cancellation works. Having both a cancel
> function and a canceled member seemsvconfusing. Are we sure that this is thread-
> safe?

Yes. This is all single-threaded on the STS thread. However, cancel cannot guarantee OnLookupComplete isn't already waiting on the event queue, hence the (private) canceled_ test in OnLookupComplete.

>>      ice_ctx_->SignalGatheringCompleted.connect(this,
>> +                                               &IceTestPeer::GatheringComplete);
>
> Whitespace-only change?

Yes, didn't line up, off by one.

> These tests need to verify that the actual results make sense. Please file a bug.

Filed as Bug 848094.

>> +    nsresult rv;
>> +    ioservice_ = do_GetIOService(&rv);
>> +    ASSERT_TRUE(NS_SUCCEEDED(rv));
>> +    dns_ = do_GetService(NS_DNSSERVICE_CONTRACTID, &rv);
>> +    ASSERT_TRUE(NS_SUCCEEDED(rv));
>
>Why isn't this stuff needed in ice_unittest.cpp?

do_GetIOService is needed in both to prevent the DNS service from crashing (see comment 28). Look in mtransport_test_utils.h where I added the same stuff, making the above lines redundant I now realize. I now see that mtransport_test_utils.h is included by both these tests. Testing again I also see that opening the dns service early wasn't the necessary part, just the ioservice.

BTW: I was surprised to learn that ice_unittest.cpp runs its tests on the main thread, while signaling_unittests.cpp runs its tests on STS (observed, don't ask me why).

All the other comments have been incorporated.
Attachment #721088 - Attachment is obsolete: true
Attachment #721483 - Flags: feedback?(ekr)
Comment on attachment 721483 [details] [diff] [review]
RTCPeerConnection constructor supports FQDN STUN servers (9)

>> +  if (NS_FAILED(that->dns_->AsyncResolve(nsAutoCString (url->domain_name),
>> +                                         nsIDNSService::RESOLVE_DISABLE_IPV6,
>> +                                         pr, my_NS_GetCurrentThread(),
>> +                                         getter_AddRefs(pr->request_)))) {
>
> [...] Please add a unit test that verifies that nICEr behaves properly in
> response to this. Suggest extending NrIceResolverFake

Postponed till bug 848094. Ready for review.

https://tbpl.mozilla.org/?tree=Try&rev=c5583be94f78
Attachment #721483 - Flags: feedback?(ekr) → review?(ekr)
Forgot to remove the landed patch in my try build https://tbpl.mozilla.org/?tree=Try&rev=7b454684c0e4
Sorry, now compiles in OPT again. Fixed silly conditional-compile in last version. Up for review?

https://tbpl.mozilla.org/?tree=Try&rev=f0df3008fd3d
Attachment #721483 - Attachment is obsolete: true
Attachment #721483 - Flags: review?(ekr)
Attachment #721813 - Flags: review?(ekr)
Attachment #721813 - Attachment is patch: true
Compiles on Windows again.

>::: media/mtransport/nriceresolver.cpp
>@@ +52,5 @@
>> +#include "nr_resolver.h"
>> +#include "transport_addr.h"
>> +}
>> +
>> +#include "mozilla/net/DNS.h"
>
>Mozilla includes before nICEr includes please.

No can do, then it doesn't compile on Windows. :-( See
https://tbpl.mozilla.org/?tree=Try&rev=f0df3008fd3d

Moving DNS.h back down where I had it and... happy: https://tbpl.mozilla.org/?tree=Try&rev=99e555fbb739

I propose we file a separate bug regarding libekr/rtypes.h for that.
Attachment #721909 - Flags: review?(ekr)
Fixes port mangling in new nr_ip4_port_to_transport_addr(). Thanks Eric for spotting it.
Attachment #721813 - Attachment is obsolete: true
Attachment #721909 - Attachment is obsolete: true
Attachment #721813 - Flags: review?(ekr)
Attachment #721909 - Flags: review?(ekr)
Attachment #722052 - Flags: review?(ekr)
Make that new nr_netaddr_to_transport_addr().
Comment on attachment 721909 [details] [diff] [review]
RTCPeerConnection constructor supports FQDN STUN servers (11)

Review of attachment 721909 [details] [diff] [review]:
-----------------------------------------------------------------

Here are my comments on -11, which are a pain to merge into -12.

::: media/mtransport/nr_socket_prsock.cpp
@@ +244,5 @@
> +  nr_transport_addr *addr)
> +  {
> +    int _status;
> +    int r;
> +    switch(netaddr->raw.family) {

Whitespace after declarations please.

@@ +247,5 @@
> +    int r;
> +    switch(netaddr->raw.family) {
> +      case AF_INET:
> +        if ((r = nr_ip4_port_to_transport_addr(ntohl(netaddr->inet.ip),
> +                                               netaddr->inet.port,

Is port really in host order here?

@@ +253,5 @@
> +          ABORT(r);
> +        break;
> +      case AF_INET6:
> +        ABORT(R_BAD_ARGS);
> +      default:

Please assert here.

::: media/mtransport/nr_socket_prsock.h
@@ +55,5 @@
>  #include "nsASocketHandler.h"
>  #include "nsISocketTransportService.h"
>  #include "nsXPCOM.h"
>  #include "nsIEventTarget.h"
> +#include "mozilla/net/DNS.h"

Please forward declare NetAddr * instead. You don't need to see the defn here.

::: media/mtransport/nriceresolver.cpp
@@ +50,5 @@
> +#include "nr_api.h"
> +#include "async_timer.h"
> +#include "nr_resolver.h"
> +#include "transport_addr.h"
> +}

Please move nICer includes downward.

@@ +132,5 @@
> +  return 0;
> +}
> +
> +// Needed for unit-tests
> +static nsIThread *my_NS_GetCurrentThread() {

Please add a comment about why this doesn't work in the unit tests.

@@ +158,5 @@
> +    MOZ_MTLOG(PR_LOG_ERROR, "Only UDP is supported.");
> +    ABORT(R_NOT_FOUND);
> +  }
> +  pr = new PendingResolution(resource->port? resource->port : 3478, cb, cb_arg);
> +  if (NS_FAILED(that->dns_->AsyncResolve(nsAutoCString (resource->domain_name),

Prefer rv=that->
if (NS_FAILED(...))

@@ +188,5 @@
> +    nr_transport_addr ta;
> +    // TODO(jib@mozilla.com): Revisit when we do TURN.
> +    if (NS_SUCCEEDED(status)) {
> +      net::NetAddr na;
> +      if (NS_SUCCEEDED(record->GetNextAddr(port_, &na))) {

Prefer rv=...; if (NS_

@@ +189,5 @@
> +    // TODO(jib@mozilla.com): Revisit when we do TURN.
> +    if (NS_SUCCEEDED(status)) {
> +      net::NetAddr na;
> +      if (NS_SUCCEEDED(record->GetNextAddr(port_, &na))) {
> +        MOZ_ALWAYS_FALSE (nr_netaddr_to_transport_addr(&na, &ta));

This seems confusing, since false usually implies failure in boolean contexts. What it is is always 0.

Suggest
r=...; MOZ_ALWAYS_TRUE(r==0)

@@ +200,5 @@
> +  return NS_OK;
> +}
> +
> +int NrIceResolver::cancel(void *obj, void *handle) {
> +  (void)obj;

Please MOZ_ASSERT arguments nonzero.

::: media/mtransport/test/ice_unittest.cpp
@@ +65,3 @@
>        remote_(nullptr) {
>      ice_ctx_->SignalGatheringCompleted.connect(this,
> +                                               &IceTestPeer::GatheringComplete);

Whitespace only cahnge?

@@ +415,5 @@
> +  Gather();
> +}
> +
> +TEST_F(IceGatherTest, TestGatherDNSStunServerIpAddress) {
> +  peer_->SetStunServer(kDefaultStunServerAddress, kDefaultStunServerPort);

Please add a TODO and a bug to verify the results make sense. I.e., that you get server reflexive candidates

::: media/webrtc/signaling/src/peerconnection/PeerConnectionMedia.cpp
@@ +79,5 @@
>      CSFLogError(logTag, "%s: Failed to create Ice Context", __FUNCTION__);
>      return NS_ERROR_FAILURE;
>    }
> +  nsresult rv;
> +  if (NS_FAILED(rv = mIceCtx->SetStunServers(stun_servers)) ||

I would prefer individual blocks with their own logging. That way we can detect resolver errors at this level without having transport logging on.

Also, I am surprised that this assignment used as truth value doesn't produce a warning.
Lucky thirteen. Fixed nits as discussed.
- Added STS thread assertions which I forgot from first review.
Attachment #722052 - Attachment is obsolete: true
Attachment #722052 - Flags: review?(ekr)
Attachment #722455 - Flags: review?(ekr)
Comment on attachment 722455 [details] [diff] [review]
RTCPeerConnection constructor supports FQDN STUN servers (13)

Review of attachment 722455 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm
Attachment #722455 - Flags: review?(ekr) → review+
Comment on attachment 722455 [details] [diff] [review]
RTCPeerConnection constructor supports FQDN STUN servers (13)

https://tbpl.mozilla.org/?tree=Try&rev=40ad7325355a
Attachment #722455 - Flags: checkin?(ekr)
Attachment #722455 - Flags: checkin?(ekr)
Fixes mochitest2 crash on Windows debug build.
- Removed unwise back-ptr in latest ASSERT_ON_THREAD(). Crashed because
  PendingResolution events can outlive NrIceResolver. Pass in thread ptr instead.

https://tbpl.mozilla.org/?tree=Try&rev=00079e35a7d8
Attachment #722455 - Attachment is obsolete: true
Attachment #722875 - Flags: review?(ekr)
> Backed out for frequent crashtest and mochitest orange.

Thanks Ryan, and sorry about that. I'm confident this new patch addresses the mochitest crash (though the try in Comment 45 will tell). But the other crashtest fails you link to I'm more stumped by (they're not in my try in Comment 42).

Not sure how STUN resolution can affect TransportFlow, though resolution-callbacks may drive this. Ekr, any thoughts on how this can be related?
https://tbpl.mozilla.org/php/getParsedLog.php?id=20461534&tree=Mozilla-Inbound#error7
I don't think this bug is the problem. I suspect it's something screwy with running API calls while still in PC callbacks.
Comment on attachment 722875 [details] [diff] [review]
RTCPeerConnection constructor supports FQDN STUN servers (14)

Review of attachment 722875 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm

::: media/mtransport/nriceresolver.h
@@ +92,5 @@
> +        thread_(thread),
> +        port_(port),
> +        cb_(cb), cb_arg_(cb_arg),
> +        canceled_ (false){
> +          (void)thread;

You shouldn't need the void since we always use this. Also, please add space before {
Attachment #722875 - Flags: review?(ekr) → review+
Updated with nit and removed unused var. The remaining crash is believed to be bug 841981 instead.
Attachment #722875 - Attachment is obsolete: true
Attachment #722901 - Flags: review+
Attachment #722901 - Flags: checkin?(ekr)
Comment on attachment 722901 [details] [diff] [review]
RTCPeerConnection constructor supports FQDN STUN servers (15) r=ekr

You can just set the checkin-needed keyword on the bug, for future reference.
Attachment #722901 - Flags: checkin?(ekr)
https://hg.mozilla.org/mozilla-central/rev/9e6232e86000
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla22
Keywords: verifyme
So I modified a local test case I've attached here to use the specified URL in comment 0 for verification. I'm seeing both peer connections getting created, but createOffer is failing here with INTERNAL_ERROR.

Jan-Ivar - Any ideas why? Is this a bug or an issue with the test case?
Flags: needinfo?(jib)
Compare to http://mozilla.github.io/webrtc-landing/pc_test.html

You haven't added any streams, which is why it fails. I'm not sure what the standard says about this, but your two PeerConnections essentially have nothing to talk about. When I add a stream to one of the PCs it works.

I don't know enough about data-channels, bu since the createOffer constraint "MozDontOfferDataChannel" is worded negatively (defaults to offer data channels), maybe creating a connection without any streams should be allowed? I don't know, I'll ask Randell how this should work.

But regardless, the error you get is pretty terrible. So I see at least two issues:

 1) This should at the very least be a readable non-internal pilot error, and

 2) The error object is opaque/blank unless it is logged DIRECTLY to console.log().
    immediately. Any indirection, like function(err){ failed(err); } gives blank.

Please file a new bug and assign to me. Thanks.
Flags: needinfo?(jib) → needinfo?(rjesup)
Everything but two situations return INTERNAL_ERROR right now. I am fixing this as soon as I am done working on tasks that will actually block keeping us prefer on in 22.
(In reply to Jan-Ivar Bruaroey [:jib] from comment #55)
> Compare to http://mozilla.github.io/webrtc-landing/pc_test.html
> 
> You haven't added any streams, which is why it fails. I'm not sure what the
> standard says about this, but your two PeerConnections essentially have
> nothing to talk about. When I add a stream to one of the PCs it works.

Ah, good catch. The test case now works for me after adding respective real streams.

> 
> I don't know enough about data-channels, bu since the createOffer constraint
> "MozDontOfferDataChannel" is worded negatively (defaults to offer data
> channels), maybe creating a connection without any streams should be
> allowed? I don't know, I'll ask Randell how this should work.

Good point. I'd actually argue that we should change the media constraint, not the default behavior here. This means I think we should do the following:

1. Keep the same behavior if you call createOffer without adding a stream or providing an appropriate mandatory media constraint by failing here.
2. Modify the name of the media constraint here to instead be "MozOfferDataChannel" that does the reverse here - we're expecting an offer of a data channel as defined in the constraint worded in the positive direction, rather than the opposite. This also aligns with other media constraints that are already implemented.
Status: RESOLVED → VERIFIED
Keywords: verifyme
As for the error object comment, let's just use the bug Adam is already working on to track the work to improve the error codes. Unless we want a specific bug for this specific error case.
Flags: needinfo?(rjesup)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: