PeerConnectionImpl needs better error handling and logging

RESOLVED FIXED in mozilla18

Status

()

Core
WebRTC: Signaling
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: ehugg, Assigned: ehugg)

Tracking

unspecified
mozilla18
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [WebRTC], [blocking-webrtc+] [qa-])

Attachments

(1 attachment, 4 obsolete attachments)

(Assignee)

Description

5 years ago
PeerConnectionImpl needs better error handling and reporting.  Specifically:

- Add error checking of all function calls - sometime we assert but continue, and sometimes we don't check at all.
- Add error checking of parameters at top of each method.
- Remove some unneeded code such as the body to NotifyQueuedTrackChanges 
- Make logging more consistent - CSFLogErrorS when error, DebugS when debug, use stream versions rather than printf-style versions.
(Assignee)

Updated

5 years ago
Assignee: nobody → ethanhugg
(Assignee)

Comment 1

5 years ago
Created attachment 663566 [details] [diff] [review]
Patch 1 - PeerConnectionImpl needs better error handling and logging
(Assignee)

Comment 2

5 years ago
Comment on attachment 663566 [details] [diff] [review]
Patch 1 - PeerConnectionImpl needs better error handling and logging


I decided to break this into a couple patches.  If you agree with what I've done so far, I can finish PeerConnectionImpl and PeerConnectionCtx the same way in a second patch.
Attachment #663566 - Flags: feedback?(ekr)

Comment 3

5 years ago
Comment on attachment 663566 [details] [diff] [review]
Patch 1 - PeerConnectionImpl needs better error handling and logging

Review of attachment 663566 [details] [diff] [review]:
-----------------------------------------------------------------

This looks like a good start. I think you may be being a little aggressive with the PR_ASSERTs. some of these things seem like they might legitimately fail for some unforeseen runtime conditions, and we don't want to crash. I know I had a lot of asserts, but I think we need to be more cautious when we actually plan to ship it.  I've commented the ones I think are most relevant.

::: media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.cpp
@@ +256,5 @@
>  PeerConnectionImpl* PeerConnectionImpl::CreatePeerConnection()
>  {
>    PeerConnectionImpl *pc = new PeerConnectionImpl();
>  
> +  if (!pc)

I believe that new already returns a valid object (or crashes), so you don't need to check this.

@@ +316,4 @@
>  
> +  CSFLogDebugS(logTag, "PeerConnection " << static_cast<void*>(this) 
> +    << ": Created media stream " << static_cast<void*>(stream.get()) 
> +    << " inner: " << static_cast<void*>(stream->GetStream()));

Do you need to do stream,get() as the input to the static cast?

@@ +325,3 @@
>  PeerConnectionImpl::MakeRemoteSource(nsDOMMediaStream* stream, RemoteSourceStreamInfo** info)
>  {
> +  PR_ASSERT(info);

I could live without non-assert checks on arguments if it's a programming error for them to be wrong.

@@ +362,3 @@
>      ), NS_DISPATCH_SYNC);
>    }
> +  PR_ASSERT(NS_SUCCEEDED(res));

We should back this up with a non-assert check in case someone compiles without assert.

@@ +378,2 @@
>    ), NS_DISPATCH_SYNC);
> +  PR_ASSERT(NS_SUCCEEDED(res));

Same as above.

@@ +418,5 @@
>    }
>  
>    mCall = pcctx->createCall();
> +
> +  PR_ASSERT(mCall.get());

I'm not sure you need to .get() on these because they declare operator T*

@@ +454,5 @@
> +  RefPtr<NrIceMediaStream> audioStream = mIceCtx->CreateStream("stream1", 2);
> +  RefPtr<NrIceMediaStream> videoStream = mIceCtx->CreateStream("stream2", 2);
> +  RefPtr<NrIceMediaStream> dcStream = mIceCtx->CreateStream("stream3", 2);
> +
> +  PR_ASSERT(audioStream.get());

I think maybe remove these asserts? It seems like there might be legit reasons why this could happen.

@@ +489,5 @@
>  
>    // Create the DTLS Identity
>    mIdentity = DtlsIdentity::Generate("self");
>  
> +  PR_ASSERT(mIdentity.get());

Same here.

@@ +554,3 @@
>      ), NS_DISPATCH_SYNC);
>    }
> +  PR_ASSERT(NS_SUCCEEDED(res));

This could totally fail, I suspect.
Attachment #663566 - Flags: feedback?(ekr) → feedback+
Comment on attachment 663566 [details] [diff] [review]
Patch 1 - PeerConnectionImpl needs better error handling and logging

Review of attachment 663566 [details] [diff] [review]:
-----------------------------------------------------------------

::: media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.cpp
@@ +256,5 @@
>  PeerConnectionImpl* PeerConnectionImpl::CreatePeerConnection()
>  {
>    PeerConnectionImpl *pc = new PeerConnectionImpl();
>  
> +  if (!pc)

Eric is right, infallible malloc means new can't fail

@@ +306,3 @@
>  PeerConnectionImpl::MakeMediaStream(PRUint32 hint, nsIDOMMediaStream** retval)
>  {
> +  PR_ASSERT(retval);

MOZ_ASSERT for debug-only asserts, NS_ASSERTION for production asserts, generally.

@@ +325,4 @@
>  PeerConnectionImpl::MakeRemoteSource(nsDOMMediaStream* stream, RemoteSourceStreamInfo** info)
>  {
> +  PR_ASSERT(info);
> +  if (!info) {

Agree with ekr; either use NS_ASSERTION and include the test in production code (and drop the "if (!...)", or decide it's only worth checking for while debugging and use MOZ_ASSERT (and drop the "if (!....)")
(Assignee)

Comment 5

5 years ago
Created attachment 664131 [details] [diff] [review]
Patch 1 - PeerConnectionImpl needs better error handling and logging

Comment 6

5 years ago
Comment on attachment 664131 [details] [diff] [review]
Patch 1 - PeerConnectionImpl needs better error handling and logging

Review of attachment 664131 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm with one trivial comment

::: media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.cpp
@@ +415,5 @@
> +  RefPtr<NrIceMediaStream> audioStream = mIceCtx->CreateStream("stream1", 2);
> +  RefPtr<NrIceMediaStream> videoStream = mIceCtx->CreateStream("stream2", 2);
> +  RefPtr<NrIceMediaStream> dcStream = mIceCtx->CreateStream("stream3", 2);
> +
> +  if (!audioStream) {

IMO we should return an error for this failure.
(Assignee)

Comment 7

5 years ago
Created attachment 664136 [details] [diff] [review]
Patch 1 - PeerConnectionImpl needs better error handling and logging
(Assignee)

Updated

5 years ago
Attachment #664131 - Attachment is obsolete: true
(Assignee)

Updated

5 years ago
Attachment #663566 - Attachment is obsolete: true
(Assignee)

Comment 8

5 years ago
Created attachment 664145 [details] [diff] [review]
Patch 1 - PeerConnectionImpl needs better error handling and logging
(Assignee)

Updated

5 years ago
Attachment #664136 - Attachment is obsolete: true
(Assignee)

Comment 9

5 years ago
Comment on attachment 664145 [details] [diff] [review]
Patch 1 - PeerConnectionImpl needs better error handling and logging


get() is not needed for nsRefPtrs but it is still required for the ones that derive from linked_ptr like CC_CallPtr.

I used MOZ_ASSERT for things that "can't happen" including null pointer params and runtime error checking/setting nsresult for things that could.

Removed checking of memory allocators.
Attachment #664145 - Flags: feedback?(rjesup)
Comment on attachment 664145 [details] [diff] [review]
Patch 1 - PeerConnectionImpl needs better error handling and logging

Review of attachment 664145 [details] [diff] [review]:
-----------------------------------------------------------------

r+=me modulo dealing with the nits.

::: media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.cpp
@@ +338,5 @@
>      ), NS_DISPATCH_SYNC);
>    }
>  
> +  if (NS_FAILED(res)) {
> +    return NS_ERROR_FAILURE;

Is there a reason for not using "return res" here? Do we need to clean things to NS_OK/NS_ERROR_FAILURE?

@@ +357,4 @@
>    ), NS_DISPATCH_SYNC);
> +
> +  if (NS_FAILED(res)) {
> +    return NS_ERROR_FAILURE;

Is there a reason for not using "return res" here?  Do we need to clean things to NS_OK/NS_ERROR_FAILURE?

@@ +358,5 @@
> +
> +  if (NS_FAILED(res)) {
> +    return NS_ERROR_FAILURE;
> +  }
> +  

very minor, trailing space

@@ +387,4 @@
>  
>    mCall = pcctx->createCall();
> +
> +  MOZ_ASSERT(mCall.get());

.get()?  or MOZ_ASSERT(mCall)?

@@ +446,5 @@
>    );
> +
> +  if (NS_FAILED(res)) {
> +    CSFLogErrorS(logTag, __FUNCTION__ << ": StartGathering failed: " << res);
> +    return NS_ERROR_FAILURE;

NS_ERROR_FAILURE, or "return res;"?  (Repeat for most other uses of this idiom in the patch)

@@ +1051,5 @@
>  PeerConnectionImpl::onCallEvent(ccapi_call_event_e callEvent,
>    CSF::CC_CallPtr call, CSF::CC_CallInfoPtr info)
>  {
> +  MOZ_ASSERT(call.get());
> +  MOZ_ASSERT(info.get());

.get() or MOZ_ASSERT(info)?
Attachment #664145 - Flags: feedback?(rjesup) → feedback+
(Assignee)

Comment 11

5 years ago
>@@ +1051,5 @@
>>  PeerConnectionImpl::onCallEvent(ccapi_call_event_e callEvent,
>>    CSF::CC_CallPtr call, CSF::CC_CallInfoPtr info)
>>  {
>> +  MOZ_ASSERT(call.get());
>> +  MOZ_ASSERT(info.get());
>
>.get() or MOZ_ASSERT(info)?

CC_CallPtr and CC_CallInfoPtr derive from linked_ptr which needs this .get() - won't compile without it.

Good point on re-mapping res to ERROR_FAILURE, I'll fix all of those.
(Assignee)

Comment 12

5 years ago
Created attachment 664216 [details] [diff] [review]
Patch 1 - PeerConnectionImpl needs better error handling and logging
(Assignee)

Updated

5 years ago
Attachment #664145 - Attachment is obsolete: true
(Assignee)

Comment 13

5 years ago
Comment on attachment 664216 [details] [diff] [review]
Patch 1 - PeerConnectionImpl needs better error handling and logging


Pushed to Alder - http://hg.mozilla.org/projects/alder/rev/2b62d30467e7

Updated

5 years ago
Whiteboard: [WebRTC], [blocking-webrtc+]
(Assignee)

Updated

5 years ago
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla18
Ethan - Is this testable through hitting the DOM API with PeerConnection? I'm wondering if it would be worth running some quick DOM tests here or not.
Whiteboard: [WebRTC], [blocking-webrtc+] → [WebRTC], [blocking-webrtc+], [qa?]

Updated

5 years ago
Flags: needinfo?(ethanhugg)
(Assignee)

Comment 15

5 years ago
(In reply to Jason Smith [:jsmith] from comment #14)
> Ethan - Is this testable through hitting the DOM API with PeerConnection?
> I'm wondering if it would be worth running some quick DOM tests here or not.

Yes, your fuzz testing is already testing this.  Essentially there were error conditions where did the wrong things - crashed instead of returning NS_ERROR_FAILURE, or didn't check return codes of subroutines so  we would continue and crash later, things like that.  We're still refining this as you've seen from 791270 and 791278 where I didn't realize the JS layer would pass through NULLs and we had MOZ_ASSERTs instead of runtime checking.
Flags: needinfo?(ethanhugg)

Updated

5 years ago
Keywords: verifyme
Whiteboard: [WebRTC], [blocking-webrtc+], [qa?] → [WebRTC], [blocking-webrtc+]
(In reply to Ethan Hugg [:ehugg] from comment #15)
> (In reply to Jason Smith [:jsmith] from comment #14)
> > Ethan - Is this testable through hitting the DOM API with PeerConnection?
> > I'm wondering if it would be worth running some quick DOM tests here or not.
> 
> Yes, your fuzz testing is already testing this.  Essentially there were
> error conditions where did the wrong things - crashed instead of returning
> NS_ERROR_FAILURE, or didn't check return codes of subroutines so  we would
> continue and crash later, things like that.  We're still refining this as
> you've seen from 791270 and 791278 where I didn't realize the JS layer would
> pass through NULLs and we had MOZ_ASSERTs instead of runtime checking.

Okay. The security assurance team has been managing this, so I'll keep this out of my scope testing wise (I'm focusing on the other areas outside of the fuzzing).
Keywords: verifyme
Whiteboard: [WebRTC], [blocking-webrtc+] → [WebRTC], [blocking-webrtc+] [qa-]
You need to log in before you can comment on or make changes to this bug.