Closed Bug 1301091 Opened 8 years ago Closed 8 years ago

[FlyWeb] Tab crashes when publishServer-hosting-page receives an incoming connection.

Categories

(Core Graveyard :: DOM: Flyweb, defect)

defect
Not set
normal

Tracking

(firefox50 unaffected, firefox51 fixed)

RESOLVED FIXED
mozilla51
Tracking Status
firefox50 --- unaffected
firefox51 --- fixed

People

(Reporter: djvj, Assigned: djvj)

References

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file)

This was reported by user Endy in a comment in response to the hacks.mozilla.org blog post (https://hacks.mozilla.org/2016/09/flyweb-pure-web-cross-device-interaction/).  Making a bug to track it.

Relevant info from comment:
The crash happens after a client connects. The client receives an initial response (the pedals for the GP example or the photo wall ui for the other example appears in the client), then the host crashes. The host browser crashed on Win7 and Ubuntu 14.04 as well after connecting from either a separate device (Nightly on Android 5.1.1) or from the same device (using another profile and the –no-remote switch).
Hi!

I'm the one who wrote the comment about crashing. Since then, by stepping through the code in the debugger, I figured out the crash happens when the host/server calls event.accept() inside the onwebsocket listener of the FlyWeb server object. That is as far as I could get on my own.
Thanks balazs,

Yeah, I narrowed down the crash to that as well.  Thanks for helping with this!

It seems something changed recently in how the network IPC code works, and we're getting a nullptr when we try to access the network IPC singleton from the parent process.  Trying to figure out how to fix it now..
Hi Nick,

I need some urgent help with figuring out what's going wrong here.  The offending code is here:
https://dxr.mozilla.org/mozilla-central/source/dom/flyweb/FlyWebPublishedServer.cpp#546

  if (type.EqualsLiteral("websocket")) {
    RefPtr<InternalRequest> request =
      static_cast<FlyWebWebSocketEvent*>(aEvent)->Request()->GetInternalRequest();
    uint64_t id = mNextRequestId++;
    mPendingRequests.Put(id, request);

    RefPtr<TransportProviderParent> provider =
      static_cast<TransportProviderParent*>(
        mozilla::net::gNeckoParent->SendPTransportProviderConstructor());

    IPCInternalRequest ipcReq;
    request->ToIPC(&ipcReq);
    Unused << SendWebSocketRequest(ipcReq, id, provider);

    mPendingTransportProviders.Put(id, provider.forget());
    return NS_OK;
  }

Turns out mozilla::net::gNeckoParent is null in the above code.  This code is in a method of "FlyWebPublishedServerParent", which implements "PFlyWebPublishedServerParent", which is an ipdl parent object managed by PContent.

So my question is.. why is the PNeckoParent singleton pointer null?  And how do I get a proper handle to it (can I just instantiate it and use it?)
Flags: needinfo?(hurley)
Tried a fix suggested by mrbkap, and ran into this other crash.  Documenting it here so that it doesn't get lost:

Assertion failure: mRawPtr != 0 (You can't dereference a NULL nsCOMPtr with operator->().), at /Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/include/nsCOMPtr.h:747
#01: nsCOMPtr<nsILoadInfo>::operator->() const[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0x2ecbd4]
#02: mozilla::net::WebSocketChannelParent::RecvAsyncOpen(mozilla::ipc::OptionalURIParams const&, nsCString const&, unsigned long long const&, nsCString const&, bool const&, unsigned int const&, bool const&, unsigned int const&, bool const&, mozilla::net::Optio[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0x9676c3]
#03: mozilla::net::PWebSocketParent::OnMessageReceived(IPC::Message const&)[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0x10f2f93]
#04: mozilla::dom::PContentParent::OnMessageReceived(IPC::Message const&)[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0x1378945]
#05: mozilla::ipc::MessageChannel::DispatchAsyncMessage(IPC::Message const&)[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0xbabee9]
#06: mozilla::ipc::MessageChannel::DispatchMessage(IPC::Message&&)[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0xbaacd4]
#07: mozilla::ipc::MessageChannel::OnMaybeDequeueOne()[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0xba3c92]
#08: decltype(*(fp).*fp0(Get<>(fp1).PassAsParameter())) mozilla::detail::RunnableMethodArguments<>::applyImpl<mozilla::ipc::MessageChannel, bool (mozilla::ipc::MessageChannel::*)()>(mozilla::ipc::MessageChannel*, bool (mozilla::ipc::MessageChannel::*)(), mozil[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0xbd089a]
#09: _ZN7mozilla6detail23RunnableMethodArgumentsIJEE5applyINS_3ipc14MessageChannelEMS5_FbvEEEDTcl9applyImplfp_fp0_dtdefpT10mArgumentscvNS_13IndexSequenceIJEEE_EEEPT_T0_[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0xbd0809]
#10: mozilla::detail::RunnableMethodImpl<bool (mozilla::ipc::MessageChannel::*)(), false, true>::Run()[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0xbd05d2]
#11: mozilla::ipc::MessageChannel::RefCountedTask::Run()[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0xbcd662]
#12: mozilla::ipc::MessageChannel::DequeueTask::Run()[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0xbcd3f8]
#13: nsThread::ProcessNextEvent(bool, bool*)[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0x1c2c19]
#14: NS_ProcessPendingEvents(nsIThread*, unsigned int)[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0x24e5ec]
#15: nsBaseAppShell::NativeEventCallback()[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0x4ac09ae]
#16: nsAppShell::ProcessGeckoEvents(void*)[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0x4b58722]
#17: __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__[/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation +0xaa881]
#18: __CFRunLoopDoSources0[/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation +0x89fbc]
#19: __CFRunLoopRun[/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation +0x894df]
#20: CFRunLoopRunSpecific[/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation +0x88ed8]
#21: RunCurrentEventLoopInMode[/System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox +0x30935]
#22: ReceiveNextEventCommon[/System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox +0x3076f]
#23: _BlockUntilNextEventMatchingListInModeWithFilter[/System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox +0x305af]
#24: _DPSNextEvent[/System/Library/Frameworks/AppKit.framework/Versions/C/AppKit +0x48df6]
#25: -[NSApplication _nextEventMatchingEventMask:untilDate:inMode:dequeue:][/System/Library/Frameworks/AppKit.framework/Versions/C/AppKit +0x48226]
#26: -[GeckoNSApplication nextEventMatchingMask:untilDate:inMode:dequeue:][/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0x4b572b4]
#27: -[NSApplication run][/System/Library/Frameworks/AppKit.framework/Versions/C/AppKit +0x3cd80]
#28: nsAppShell::Run()[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0x4b590cc]
#29: nsAppStartup::Run()[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0x5f5172b]
#30: XREMain::XRE_mainRun()[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0x605a06d]
#31: XREMain::XRE_main(int, char**, nsXREAppData const*)[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0x605ab01]
#32: XRE_main[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0x605af62]
#33: do_main(int, char**, char**, nsIFile*)[/Users/kannanvijayan/checkouts/mozilla-inbound/./BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/firefox +0x1dea]
#34: main[/Users/kannanvijayan/checkouts/mozilla-inbound/./BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/firefox +0x12f2]
Process 14144 stopped
* thread #1: tid = 0x2b5223, 0x0000000103121be4 XUL`nsCOMPtr<nsILoadInfo>::operator->(this=0x00007fff5fbf9908) const + 100 at nsCOMPtr.h:746, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x0000000103121be4 XUL`nsCOMPtr<nsILoadInfo>::operator->(this=0x00007fff5fbf9908) const + 100 at nsCOMPtr.h:746
   743
   744 	  T* operator->() const MOZ_NO_ADDREF_RELEASE_ON_RETURN
   745 	  {
-> 746 	    MOZ_ASSERT(mRawPtr != 0,
   747 	               "You can't dereference a NULL nsCOMPtr with operator->().");
   748 	    return get();
   749 	  }
--DOMWINDOW == 2 (0x126f2f800) [pid = 14149] [serial = 2] [outer = 0x0] [url = about:blank]
Attachment #8789422 - Flags: review?(tihuang)
Comment on attachment 8789422 [details] [diff] [review]
fix-websocket-crashes.patch

Review of attachment 8789422 [details] [diff] [review]:
-----------------------------------------------------------------

Tim: Can you look over these changes to WebSocketChannelParent::RecvAsyncOpen ?  The last patch you pushed for that method broke some FlyWeb code.  Namely, when the WebSocketChannelParent is representing a server-side websocket (something that's only used in FlyWeb currently), then the incoming |loadInfoArgs| is void_t(), and produces a null |nsILoadInfo|, which crashes when you try to get origin attributes from it.  This fixes the case where the |nsILoadInfo| is null by using the old |GetAppId()| call in that circumstance.

Bill: Can I assume that calling PContentParent::ManagedPNeckoParent(nsTArray<PNeckoParent*>&) will always yield a single PNeckoParent pointer in the array?  Is there any reasonable circumstance where I'll get no PNeckoParents back?  Or more than one?
Attachment #8789422 - Flags: review?(wmccloskey)
Comment on attachment 8789422 [details] [diff] [review]
fix-websocket-crashes.patch

I can't review this since I'm not a Necko peer. Forwarding the review request.
Attachment #8789422 - Flags: review?(tihuang) → review?(honzab.moz)
Just adding a note for tracking purposes that this regressions seems to have stemmed from Bug 1291652.
See Also: → 1291652
Comment on attachment 8789422 [details] [diff] [review]
fix-websocket-crashes.patch

Review of attachment 8789422 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/websocket/WebSocketChannelParent.cpp
@@ +92,5 @@
> +    }
> +
> +    appId = attrs.mAppId;
> +  } else {
> +    appId = GetAppId();

If we don't have loadInfo from the child, is it really possible that we get something else than NECKO_UNKNOWN_APP_ID from GetAppId()? I.e. shouldn't we simply assign NECKO_UNKNOWN_APP_ID to appId?
Attachment #8789422 - Flags: feedback+
I think that is probably reasonable, but I'm not sure what having that APP_ID would imply in the handling of the connection.  The loadInfo is empty in this case because this AsyncOpen is NOT a load initiated by the browser, but rather an incoming connection targeting the WebSocket server hosted by the browser.

So the APP id shouldn't matter in this case, as there's no actual cross-origin stuff to worry about, as no resources are being accessed by the page in question.

Anyway, I know of no circumstance where we would not get NECKO_UNKNOWN_APP_ID.  Thanks for the feedback.
Comment on attachment 8789422 [details] [diff] [review]
fix-websocket-crashes.patch

Review of attachment 8789422 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/flyweb/FlyWebPublishedServer.cpp
@@ +520,5 @@
>      mPendingRequests.Put(id, request);
>  
> +    nsTArray<PNeckoParent*> neckoParents;
> +    Manager()->ManagedPNeckoParent(neckoParents);
> +    if (neckoParents.Length() != 1) {

> Bill: Can I assume that calling PContentParent::ManagedPNeckoParent(nsTArray<PNeckoParent*>&)
> will always yield a single PNeckoParent pointer in the array?  Is there any reasonable
> circumstance where I'll get no PNeckoParents back?  Or more than one?

We will never have more than one. I guess it's possible that there can be zero if no one has ever used Necko. It seems to be lazily initialized as far as I can tell. So maybe this could happen really early in startup of the content process.

I guess I'm okay with doing this and seeing if we get crash reports back.

We do have this thing that you should use instead though:
http://searchfox.org/mozilla-central/rev/124c120ce9d7e8407c9ad330a9d6b1c4ad432637/ipc/glue/ProtocolUtils.h#592
Attachment #8789422 - Flags: review?(wmccloskey) → review+
Comment on attachment 8789422 [details] [diff] [review]
fix-websocket-crashes.patch

Review of attachment 8789422 [details] [diff] [review]:
-----------------------------------------------------------------

r+ for WebSocketChannelParent part if you add the comment

::: netwerk/protocol/websocket/WebSocketChannelParent.cpp
@@ +92,5 @@
> +    }
> +
> +    appId = attrs.mAppId;
> +  } else {
> +    appId = GetAppId();

Please add a comment explaining when we hit this code path.
Attachment #8789422 - Flags: review?(honzab.moz) → review+
Pushed by kvijayan@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/6f62ac027526
Make FlyWeb not use gNeckoParent.  Fix WebSocketChannel handling of incoming requests to server-websocket. r=billm r=michal
Flags: needinfo?(hurley)
Blocks: 1291652
See Also: 1291652
Crash Signature: [@ mozilla::net::WebSocketChannelParent::RecvAsyncOpen]
https://hg.mozilla.org/mozilla-central/rev/6f62ac027526
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla51
Product: Toolkit → Core
Assignee: nobody → kvijayan
Version: unspecified → Trunk
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: