Closed
Bug 1226200
Opened 9 years ago
Closed 9 years ago
Crash when Hello link-clicker join a room with gUM running in content process
Categories
(Core :: WebRTC, defect, P1)
Core
WebRTC
Tracking
()
RESOLVED
FIXED
mozilla46
People
(Reporter: mikedeboer, Assigned: jesup)
References
Details
Attachments
(3 files, 1 obsolete file)
130.43 KB,
text/plain
|
Details | |
1.35 KB,
patch
|
Details | Diff | Splinter Review | |
14.74 KB,
patch
|
jdm
:
review+
Sylvestre
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
(Summary prone to reformulation, I know ;) )
We're in the process of making about:loopconversation - the Hello conversation window - run in the content process to make sure that browser sharing will work when e10s is enabled by default.
However, when I apply patch https://bugzilla.mozilla.org/attachment.cgi?id=8689072 from bug 1154277, open a room and join it by myself in a different browser (instance) it crashes Fx consistently with attached crash signature when the link clicker seems almost ready to join (e.g. already past the gUM prompt).
Does the media team have the opportunity to look at this, perhaps? Please let me know if you need anything else, besides the patch mentioned above, to reproduce the crash.
Reporter | ||
Comment 1•9 years ago
|
||
(I'm running a debug build, so this is the crash signature as it appeared on the command line:
Assertion failure: aManagees.Count() == 1, at ../../dist/include/mozilla/ipc/ProtocolUtils.h:344
#01: mozilla::net::PTCPSocketParent::OnMessageReceived(IPC::Message const&)[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x73e8b9]
#02: mozilla::dom::PContentParent::OnMessageReceived(IPC::Message const&)[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x89a32b]
#03: mozilla::ipc::MessageChannel::DispatchAsyncMessage(IPC::Message const&)[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x4f68cb]
#04: mozilla::ipc::MessageChannel::DispatchMessage(IPC::Message const&)[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x4f5442]
#05: mozilla::ipc::MessageChannel::OnMaybeDequeueOne()[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x4ef9a4]
#06: MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&)[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x4be90c]
#07: MessageLoop::DoWork()[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x4bec4b]
#08: mozilla::ipc::DoWorkRunnable::Run()[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x4fa4f5]
#09: nsThread::ProcessNextEvent(bool, bool*)[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0xe93a7]
#10: NS_ProcessPendingEvents(nsIThread*, unsigned int)[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x12957f]
#11: nsBaseAppShell::NativeEventCallback()[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x2a492e4]
#12: nsAppShell::ProcessGeckoEvents(void*)[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x2ab02ce]
#13: __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__[/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation +0x80a01]
#14: __CFRunLoopDoSources0[/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation +0x72b8d]
#15: __CFRunLoopRun[/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation +0x721bf]
#16: CFRunLoopRunSpecific[/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation +0x71bd8]
#17: RunCurrentEventLoopInMode[/System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox +0x3256f]
#18: ReceiveNextEventCommon[/System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox +0x322ea]
#19: _BlockUntilNextEventMatchingListInModeWithFilter[/System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox +0x3212b]
#20: _DPSNextEvent[/System/Library/Frameworks/AppKit.framework/Versions/C/AppKit +0x918ab]
#21: -[NSApplication nextEventMatchingMask:untilDate:inMode:dequeue:][/System/Library/Frameworks/AppKit.framework/Versions/C/AppKit +0x90e58]
#22: -[GeckoNSApplication nextEventMatchingMask:untilDate:inMode:dequeue:][/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x2aaf5e6]
#23: -[NSApplication run][/System/Library/Frameworks/AppKit.framework/Versions/C/AppKit +0x86af3]
#24: nsAppShell::Run()[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x2ab09f5]
#25: nsAppStartup::Run()[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x3612599]
#26: XREMain::XRE_mainRun()[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x3686ae0]
#27: XREMain::XRE_main(int, char**, nsXREAppData const*)[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x3687368]
#28: XRE_main[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x36878dd]
#29: main[/Users/mikedeboer/Projects/fx-team/./obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/firefox-bin +0x214f]
[Child 4856] WARNING: didn't get an id (ipc/rcv) timed out 10004003
: file /Users/mikedeboer/Projects/fx-team/ipc/glue/SharedMemoryBasic_mach.mm, line 619
[Child 4856] ###!!! ABORT: corrupted actor state: file ./PCameras.cpp, line 34
[Child 4856] ###!!! ABORT: Aborting on channel error.: file /Users/mikedeboer/Projects/fx-team/ipc/glue/MessageChannel.cpp, line 1765
./run: line 2: 4851 Segmentation fault: 11 ./obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/firefox-bin -p dev -jsconsole
#01: mozilla::media::LambdaRunnable<mozilla::camera::CamerasChild::Shutdown()::$_8>::Run()[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x23ce30d]
#01: mozilla::ipc::ProcessLink::OnChannelError()[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x4f99a7]
#02: nsThread::ProcessNextEvent(bool, bool*)[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0xe93a7]
#02: IPC::Channel::ChannelImpl::OnFileCanReadWithoutBlocking(int)[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x4d0ea9]
#03: NS_ProcessNextEvent(nsIThread*, bool)[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x129673]
[Child 4856] WARNING: didn't get an id (ipc/rcv) timed out 10004003
: file /Users/mikedeboer/Projects/fx-team/ipc/glue/SharedMemoryBasic_mach.mm, line 619
[Child 4856] WARNING: sending port failed (ipc/send) invalid destination port 10000003
: file /Users/mikedeboer/Projects/fx-team/ipc/glue/SharedMemoryBasic_mach.mm, line 613
Assertion failure: mDestroyed, at /Users/mikedeboer/Projects/fx-team/gfx/layers/IPDLActor.h:42
#03: event_base_loop[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x4da689]
#04: mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*)[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x4fa9e3]
#01: mozilla::layers::CompositableClient::DestroyIPDLActor(mozilla::layers::PCompositableChild*)[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0xde4e8f]
#04: base::MessagePumpLibevent::Run(base::MessagePump::Delegate*)[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x4c0095]
#05: MessageLoop::Run()[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x4be0dc]
#02: mozilla::layers::PImageBridgeChild::DeallocSubtree()[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x5d9e9b]
#05: MessageLoop::Run()[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x4be0dc]
mikedeboer:~/Projects/fx-team$ #06: nsThread::ThreadFunc(void*)[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0xe7339]
#03: mozilla::layers::PImageBridgeChild::OnChannelError()[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x5da26b]
#06: base::Thread::ThreadMain()[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x4c873d]
#07: _pt_root[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/libnss3.dylib +0x33bee9]
#04: mozilla::ipc::MessageChannel::NotifyMaybeChannelError()[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x4f7664]
#07: ThreadFunc(void*)[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x4c884a]
#08: _pthread_body[/usr/lib/system/libsystem_pthread.dylib +0x405a]
#05: MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&)[/Users/mikedeboer/Projects/fx-team/obj-x86_64-apple-darwin14.5.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x4be90c]
#08: _pthread_body[/usr/lib/system/libsystem_pthread.dylib +0x405a]
#09: _pthread_body[/usr/lib/system/libsystem_pthread.dylib +0x3fd7]
[Child 4856] ###!!! ABORT: corrupted actor state: file ./PCameras.cpp, line 34
Hit MOZ_CRASH() at /Users/mikedeboer/Projects/fx-team/memory/mozalloc/mozalloc_abort.cpp:33
Reporter | ||
Comment 2•9 years ago
|
||
Maire, this crash is blocking Hello to run properly in e10s mode. Is it possible for someone to take a look at this issue relatively soon?
Thanks!
Flags: needinfo?(mreavy)
Assignee | ||
Comment 3•9 years ago
|
||
Moved crashdump to an attachment
Assignee | ||
Updated•9 years ago
|
Crash Signature: Process: firefox-bin [4851]
Path: /Users/USER/*/NightlyDebug.app/Contents/MacOS/firefox-bin
Identifier: org.mozilla.nightlydebug
Version: 45.0a1 (4515.11.19)
Code Type: X86-64 (Native…
Comment 4•9 years ago
|
||
Looking at it now. I'll find the right owner today.
Assignee | ||
Comment 5•9 years ago
|
||
Mike - there are two dumps here; one that was in the Crash Signature (moved to attachment), which shows a MainThread nullptr crash in TCPSocketParent::RecvOpenBind (i.e. TCP TURN in use or being tried), and one above in IPC (Assertion failure: aManagees.Count() == 1). These might be related, but it would help to know exactly where each comes from.
I'll spin a mac build with that change
Flags: needinfo?(mdeboer)
Reporter | ||
Comment 6•9 years ago
|
||
Ah yes, I should've explained! The one in the crash signature field came from the OSX crash reporter dialog (native one) and the one in comment 1 is what I got on the command line.
Flags: needinfo?(mdeboer)
Updated•9 years ago
|
Assignee: nobody → rjesup
Updated•9 years ago
|
Reporter | ||
Comment 7•9 years ago
|
||
Were you able to reproduce this issue, Randell, or do you need more info from me?
Status: NEW → ASSIGNED
Flags: needinfo?(rjesup)
Assignee | ||
Comment 8•9 years ago
|
||
(In reply to Mike de Boer [:mikedeboer] from comment #7)
> Were you able to reproduce this issue, Randell, or do you need more info
> from me?
I have been able to repro, and have it caught in a debugger.
Flags: needinfo?(rjesup)
Reporter | ||
Comment 9•9 years ago
|
||
\o/ ;-)
Assignee | ||
Comment 10•9 years ago
|
||
Attachment #8694358 -
Flags: review?(wmccloskey)
Josh, my only concern here is with the other places where we use LoneManagedOrNull. This regressed here:
https://hg.mozilla.org/mozilla-central/rev/19c5de45a704
I wonder if we should fix all the TCP socket callsites? It sounds like Hello is the only desktop user of TCPSocket, and it maybe only cares about this one callsite. But we might have others down the road.
Flags: needinfo?(josh)
Comment 12•9 years ago
|
||
Yes, we absolutely should. I imagine this can bite b2g as well in the right situations.
Flags: needinfo?(josh)
Comment on attachment 8694358 [details] [diff] [review]
Don't assume a TCPSocket has only one managee
Review of attachment 8694358 [details] [diff] [review]:
-----------------------------------------------------------------
Randell, can you fix the rest of these? Maybe just add a helper function somewhere that doesn't assert. It could even be in ProtocolUtils.h.
Attachment #8694358 -
Flags: review?(wmccloskey)
Reporter | ||
Comment 14•9 years ago
|
||
Randell, when I apply the patch(es) and start a conversation with browser sharing, the crash is indeed gone :-)
However, the screen share stream is also terminated, so effectively instead of crashing it now stops sharing when the remote party joins the call.
Flags: needinfo?(rjesup)
Comment 15•9 years ago
|
||
(In reply to Mike de Boer [:mikedeboer] from comment #14)
> However, the screen share stream is also terminated, so effectively instead
> of crashing it now stops sharing when the remote party joins the call.
Just a thought. There's been some bugs reported by QA along these sorts of lines - where screen share doesn't show up when someone else joins a room. Might be worth trying commenting out the auto-screen share and hooking it into a button manually for a quick verification.
Assignee | ||
Comment 16•9 years ago
|
||
Mike: With this patch to stop it killing sharing in e10s mode, this patch queue works for me (I can share tabs in e10s mode no problem). Without this patch, I see what I think you were seeing (sharing stops ev5Cen before the remote connects, and an error in the Browser Console. Note: this just hacks the code out of there; you'll need to do whatever else might be needed to cleanly remove the Block-sharing-in-e10s stuff
Assignee | ||
Comment 17•9 years ago
|
||
patchset that works for me is:
unbitrotted bug 1154277 patch (https://bugzilla.mozilla.org/attachment.cgi?id=8694250) -- note, I unbitrotted it, but you should re-do it or check my unbitrot carefully.
The patch from this bug (8694358)
The don't-block-sharing patch (8700807)
Flags: needinfo?(standard8)
Flags: needinfo?(rjesup)
Flags: needinfo?(mdeboer)
Assignee | ||
Comment 18•9 years ago
|
||
FYI, with these patches, on ending calls in debug builds I'm seeing
Assertion failure: outputTrack->GetEnd() == GraphTimeToStreamTimeWithBlocking(interval.mStart) (Samples missing), at /home/jesup/src/mozilla/inbound/dom/media/TrackUnionStream.cpp:268
That would be a separate issue in MSG I believe. Not sure why it would be happening in Loop...
Flags: needinfo?(roc)
Flags: needinfo?(padenot)
Assignee | ||
Comment 19•9 years ago
|
||
as requested, made SingleManagedOrNull() that doesn't assert, and renamed LongManagedOrNull to LoneManagedOrNullAsserts, and converted all the TCPSocket uses to SingleManagedOrNull()
Attachment #8700866 -
Flags: review?(wmccloskey)
Assignee | ||
Updated•9 years ago
|
Attachment #8694358 -
Attachment is obsolete: true
Comment 20•9 years ago
|
||
Comment on attachment 8700866 [details] [diff] [review]
Don't assume a TCPSocket has only one managee
Review of attachment 8700866 [details] [diff] [review]:
-----------------------------------------------------------------
::: dom/network/TCPSocketParent.cpp
@@ +224,5 @@
> bool inBrowser = false;
> const PContentParent *content = Manager()->Manager();
> + // We don't use LoneManagedOrNullAsserts(content->ManagedPBrowserParent())
> + // here, because with Hello-in-Content causes 2 managees
> + // (Bug 1226200 and bug 1154277)
I don't think this comment adds much value.
Attachment #8700866 -
Flags: review?(wmccloskey) → review+
Comment 21•9 years ago
|
||
Comment 22•9 years ago
|
||
bugherder |
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
status-firefox46:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla46
Comment 23•9 years ago
|
||
I believe Mike has tested it and he's currently happy with the fix.
Flags: needinfo?(standard8)
Flags: needinfo?(mdeboer)
Comment 24•9 years ago
|
||
(In reply to Randell Jesup [:jesup] from comment #18)
> FYI, with these patches, on ending calls in debug builds I'm seeing
>
> Assertion failure: outputTrack->GetEnd() ==
> GraphTimeToStreamTimeWithBlocking(interval.mStart) (Samples missing), at
> /home/jesup/src/mozilla/inbound/dom/media/TrackUnionStream.cpp:268
>
> That would be a separate issue in MSG I believe. Not sure why it would be
> happening in Loop...
It's better to file this one separately I think.
Flags: needinfo?(padenot)
Assignee | ||
Comment 25•9 years ago
|
||
Let's see what happens when Mike really finishes his patch
Reporter | ||
Comment 26•9 years ago
|
||
Randell, can you request uplift for this patch to Fx 45?
Flags: needinfo?(rjesup)
Assignee | ||
Comment 27•9 years ago
|
||
Comment on attachment 8700866 [details] [diff] [review]
Don't assume a TCPSocket has only one managee
Approval Request Comment
[Feature/regressing bug #]: N/A
[User impact if declined]: Inability to sandbox Hello in the content process; also inability to share camera/mic with other in-content applications.
[Describe test coverage new/current, TreeHerder]: TCP TURN isn't testable in automation.
[Risks and why]: Low risk. Per BillM, the assumption here that there was a single manager is only really true for B2G, so asserting it isn't good. The function (SingleManagedOrNull) is identical to LoneManagedOrNullAsserts, except for the assertion.
[String/UUID change made/needed]:
Flags: needinfo?(rjesup)
Attachment #8700866 -
Flags: approval-mozilla-aurora?
Comment 28•9 years ago
|
||
Comment on attachment 8700866 [details] [diff] [review]
Don't assume a TCPSocket has only one managee
Fix a crash, taking it.
Attachment #8700866 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 29•9 years ago
|
||
bugherder uplift |
Flags: needinfo?(roc)
You need to log in
before you can comment on or make changes to this bug.
Description
•