Closed Bug 839647 Opened 12 years ago Closed 12 years ago

Intermittent ASSERTION: Unsupported Signaling State Transition: 'Not Reached', file media/webrtc/signaling/src/peerconnection/PeerConnectionCtx.cpp, line 224

Categories

(Core :: WebRTC: Signaling, defect)

All
macOS
defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla22

People

(Reporter: whimboo, Assigned: abr)

References

()

Details

(Keywords: assertion, intermittent-failure, Whiteboard: [WebRTC][blocking-webrtc+])

Attachments

(2 files, 1 obsolete file)

Clicking start in the test page causes the following assertion to throw with changeset 121190:eb8f60c782da

###!!! ASSERTION: Unsupported Signaling State Transition: 'Not Reached', file /Volumes/data/code/firefox/nightly/media/webrtc/signaling/src/peerconnection/PeerConnectionCtx.cpp, line 224
CSF::CallControlManagerImpl::onDeviceEvent(ccapi_device_event_e, linked_ptr<CSF::CC_Device>, linked_ptr<CSF::CC_DeviceInfo>)+0x0000005C [/Volumes/data/code/firefox/obj/debug/dist/NightlyDebug.app/Contents/MacOS/XUL +0x01F950DC]
CSF::CC_SIPCCService::notifyDeviceEventObservers(ccapi_device_event_e, linked_ptr<CSF::CC_Device>, linked_ptr<CSF::CC_DeviceInfo>)+0x0000009D [/Volumes/data/code/firefox/obj/debug/dist/NightlyDebug.app/Contents/MacOS/XUL +0x01FB8AFD]
CSF::CC_SIPCCService::onDeviceEvent(ccapi_device_event_e, unsigned int, cc_device_info_t_*)+0x00000794 [/Volumes/data/code/firefox/obj/debug/dist/NightlyDebug.app/Contents/MacOS/XUL +0x01FB4484]
ccsnap_gen_deviceEvent+0x000001EE [/Volumes/data/code/firefox/obj/debug/dist/NightlyDebug.app/Contents/MacOS/XUL +0x01FE58FE]
cc_media_update_native_video_txcap+0x0000004A [/Volumes/data/code/firefox/obj/debug/dist/NightlyDebug.app/Contents/MacOS/XUL +0x0201EB0A]
ccp_handler+0x000001B4 [/Volumes/data/code/firefox/obj/debug/dist/NightlyDebug.app/Contents/MacOS/XUL +0x01FE7084]
CCApp_task+0x0000015F [/Volumes/data/code/firefox/obj/debug/dist/NightlyDebug.app/Contents/MacOS/XUL +0x01FE6C8F]
_pthread_start+0x0000014F [/usr/lib/system/libsystem_c.dylib +0x0004E8BF]
Whiteboard: [WebRTC][blocking-webrtc?]
I think this may be a dup.
Assignee: nobody → ethanhugg
Whiteboard: [WebRTC][blocking-webrtc?] → [WebRTC][blocking-webrtc-]
This should've been fixed by Bug 806825, but evidently not.  Could you give a pointer to the exact test page you ran and did it happen all the time, often, or rarely?  Thanks.
Summary: ASSERTION: Unsupported Signaling State Transition: 'Not Reached', file media/webrtc/signaling/src/peerconnection/PeerConnectionCtx.cpp, line 224 → Intermittent ASSERTION: Unsupported Signaling State Transition: 'Not Reached', file media/webrtc/signaling/src/peerconnection/PeerConnectionCtx.cpp, line 224
Whiteboard: [WebRTC][blocking-webrtc-] → [WebRTC][blocking-webrtc+]
Blocks: 826044
I'm pretty sure that this is what happens if Bug 841150 occurs with debug turned on. In other words, if this assert doesn't take us down, I believe you'll end up with the "unexpected error callback with 0" issues.
jseup from IRC: "Try for -b d -p linux64 -u crashtest -t none gets failures 100% of the time with crashtests enabled using the patches in https://tbpl.mozilla.org/?tree=Try&rev=1722601b6cff"
Reassigning to abr since the current theory is that this bug is a dup of (or at least has the same root cause as) Bug 841150 (see comment 3).  He is investigating this and Bug 811450 and will deal with both once his investigation is done. (Adam -- please correct me if I got any of this wrong.  Thanks.)
Assignee: ethanhugg → adam
(Missing the keyword, won't appear on TBPL)
(In reply to Ethan Hugg [:ehugg] from comment #2)
> This should've been fixed by Bug 806825, but evidently not.  Could you give
> a pointer to the exact test page you ran and did it happen all the time,
> often, or rarely?  Thanks.

Please see the URL field of this bug, and yes it's happening all the time with the given changeset. I will refresh my build and retest.
It happens 100% of the time on Ubuntu Debug builds on Try in crashtest 791270.html
Still happens with 122218:524e7bc67431
Without the synchronization, it turns out that the first couple of device events (CCAPI_DEVICE_EV_VIDEO_CAP_ADMIN_CONFIG_CHANGED and CCAPI_DEVICE_EV_CAMERA_ADMIN_CONFIG_CHANGED) *tended* to happen before device event listeners were all registered; and the one that we absolutely must see (CCAPI_DEVICE_EV_STATE) *tended* to happen afterwards.

Because the exact timing of when the event listener was set up was a race, we ended up sometimes seeing one or both of the first two events (causing this assert), and I believe we also ended up sometimes missing the final event (which means certain initialization procedures would never be run, which I *think* would lead to the situation described in bug 841150).

The attached patch fixes the race by preventing the CCApp thread from processing any messages before the CallControlManagerImpl is done setting everything up, and deals with the PeerConnectionCtx seeing the first two event by checking the event types before attempting to take any action. Doing so demonstrably fixes *this* bug, and I expect it fixes Bug 841150 as well.
Attachment #716855 - Flags: review?(ethanhugg)
Here's a try to push with crash tests to verify that the issue with 791270.html has been resolved:

https://tbpl.mozilla.org/?tree=Try&rev=dafe6dc21576
Comment on attachment 716905 [details] [diff] [review]
Remove crash test workaround

Jesup -- I presume this is all I need to do to reverse the workaround?
Attachment #716905 - Flags: review?(rjesup)
Apparently, linking in externals for Windows involves some form of goat sacrifice in which I am not practiced. I think I'll need to tap some build expert's knowledge to get the Windows build to work:

https://tbpl.mozilla.org/php/getParsedLog.php?id=19973051&tree=Try
Comment on attachment 716855 [details] [diff] [review]
Synchronize CCApp thread start-up

Review of attachment 716855 [details] [diff] [review]:
-----------------------------------------------------------------

::: media/webrtc/signaling/src/callcontrol/CallControlManagerImpl.cpp
@@ +22,5 @@
>  {
>  #include "config_api.h"
>  }
>  
> +extern PRCondVar *ccAppReadyToStartCond;

I believe the C++ version of this can be found in mozilla/Monitor.h   Is there a reason you didn't use that one?

::: media/webrtc/signaling/src/peerconnection/PeerConnectionCtx.cpp
@@ +17,5 @@
>  #include "PeerConnectionImpl.h"
>  #include "PeerConnectionCtx.h"
>  #include "runnable_utils.h"
>  #include "cpr_socket.h"
> +#include "debug-psipcc-types.h"

It appears to me that you might be adding this include because it includes ccapi_types.h.  I didn't recompile to find out, but I would imagine you'd be better off just including ccapi_types.h which is where CCAPI_DEVICE_EV_STATE is defined.

::: media/webrtc/signaling/src/sipcc/core/ccapp/ccapp_task.c
@@ +14,5 @@
>  extern void CCAppInit();
>  static sll_lite_list_t sll_list;
>  
> +PRCondVar *ccAppReadyToStartCond = 0;
> +PRLock *ccAppReadyToStartLock = 0;

This code usually uses NULL for initializing pointers.  The Mozilla code style guide is a bit ambiguous on this, but I think it's worth looking at.
Attachment #716855 - Flags: review?(ethanhugg) → review+
Here's a re-try on Windows on the theory that the linking problem might be due to symbol mangling differences:

https://tbpl.mozilla.org/?tree=Try&rev=4ff63bd3cce3
Attachment #716905 - Flags: review?(rjesup) → review+
(In reply to Adam Roach [:abr] from comment #17)
> Here's a re-try on Windows on the theory that the linking problem might be
> due to symbol mangling differences:
> 
> https://tbpl.mozilla.org/?tree=Try&rev=4ff63bd3cce3

Referencing var defined in .c file -> use extern "C" {}

Looks right now, r+
More precisely, extern "C" {} means "use C linkage". You can do it in C++ files as well...
Attachment #716855 - Attachment is obsolete: true
Comment on attachment 717150 [details] [diff] [review]
Synchronize CCApp thread start-up

r=ehugg
Attachment #717150 - Flags: review+
Attachment #716905 - Flags: checkin+
Attachment #717150 - Flags: checkin+
(In reply to Ethan Hugg [:ehugg] from comment #16)
> Comment on attachment 716855 [details] [diff] [review]
> Synchronize CCApp thread start-up
> 
> Review of attachment 716855 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: media/webrtc/signaling/src/callcontrol/CallControlManagerImpl.cpp
> @@ +22,5 @@
> >  {
> >  #include "config_api.h"
> >  }
> >  
> > +extern PRCondVar *ccAppReadyToStartCond;
> 
> I believe the C++ version of this can be found in mozilla/Monitor.h   Is
> there a reason you didn't use that one?

The ccapp_task.c file is where this condition is waited for, so we need to use the C variations.

> ::: media/webrtc/signaling/src/peerconnection/PeerConnectionCtx.cpp
> @@ +17,5 @@
> >  #include "PeerConnectionImpl.h"
> >  #include "PeerConnectionCtx.h"
> >  #include "runnable_utils.h"
> >  #include "cpr_socket.h"
> > +#include "debug-psipcc-types.h"
> 
> It appears to me that you might be adding this include because it includes
> ccapi_types.h.  I didn't recompile to find out, but I would imagine you'd be
> better off just including ccapi_types.h which is where CCAPI_DEVICE_EV_STATE
> is defined.

The key reason for including this file is to get the declaration for device_event_getname()

> ::: media/webrtc/signaling/src/sipcc/core/ccapp/ccapp_task.c
> @@ +14,5 @@
> >  extern void CCAppInit();
> >  static sll_lite_list_t sll_list;
> >  
> > +PRCondVar *ccAppReadyToStartCond = 0;
> > +PRLock *ccAppReadyToStartLock = 0;
> 
> This code usually uses NULL for initializing pointers.  The Mozilla code
> style guide is a bit ambiguous on this, but I think it's worth looking at.

Good catch. Fixed.
https://hg.mozilla.org/mozilla-central/rev/4d6b042ba9fa
https://hg.mozilla.org/mozilla-central/rev/bee4a533e23e
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla22
Verified with crashtest on check in.
Status: RESOLVED → VERIFIED
Whiteboard: [WebRTC][blocking-webrtc+] → [WebRTC][blocking-webrtc+][qa-]
Whiteboard: [WebRTC][blocking-webrtc+][qa-] → [WebRTC][blocking-webrtc+]
Flags: in-testsuite+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: