The default bug view has changed. See this FAQ.

[FlyWeb] Tab crashes when publishServer-hosting-page receives an incoming connection.

RESOLVED FIXED in Firefox 51

Status

()

Core
DOM: Flyweb
RESOLVED FIXED
7 months ago
3 months ago

People

(Reporter: djvj, Assigned: djvj)

Tracking

(Blocks: 1 bug, {crash, regression})

Trunk
mozilla51
crash, regression
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox50 unaffected, firefox51 fixed)

Details

(crash signature)

Attachments

(1 attachment)

(Assignee)

Description

7 months ago
This was reported by user Endy in a comment in response to the hacks.mozilla.org blog post (https://hacks.mozilla.org/2016/09/flyweb-pure-web-cross-device-interaction/).  Making a bug to track it.

Relevant info from comment:
The crash happens after a client connects. The client receives an initial response (the pedals for the GP example or the photo wall ui for the other example appears in the client), then the host crashes. The host browser crashed on Win7 and Ubuntu 14.04 as well after connecting from either a separate device (Nightly on Android 5.1.1) or from the same device (using another profile and the –no-remote switch).

Comment 1

7 months ago
Hi!

I'm the one who wrote the comment about crashing. Since then, by stepping through the code in the debugger, I figured out the crash happens when the host/server calls event.accept() inside the onwebsocket listener of the FlyWeb server object. That is as far as I could get on my own.
(Assignee)

Comment 2

7 months ago
Thanks balazs,

Yeah, I narrowed down the crash to that as well.  Thanks for helping with this!

It seems something changed recently in how the network IPC code works, and we're getting a nullptr when we try to access the network IPC singleton from the parent process.  Trying to figure out how to fix it now..
(Assignee)

Comment 3

7 months ago
Hi Nick,

I need some urgent help with figuring out what's going wrong here.  The offending code is here:
https://dxr.mozilla.org/mozilla-central/source/dom/flyweb/FlyWebPublishedServer.cpp#546

  if (type.EqualsLiteral("websocket")) {
    RefPtr<InternalRequest> request =
      static_cast<FlyWebWebSocketEvent*>(aEvent)->Request()->GetInternalRequest();
    uint64_t id = mNextRequestId++;
    mPendingRequests.Put(id, request);

    RefPtr<TransportProviderParent> provider =
      static_cast<TransportProviderParent*>(
        mozilla::net::gNeckoParent->SendPTransportProviderConstructor());

    IPCInternalRequest ipcReq;
    request->ToIPC(&ipcReq);
    Unused << SendWebSocketRequest(ipcReq, id, provider);

    mPendingTransportProviders.Put(id, provider.forget());
    return NS_OK;
  }

Turns out mozilla::net::gNeckoParent is null in the above code.  This code is in a method of "FlyWebPublishedServerParent", which implements "PFlyWebPublishedServerParent", which is an ipdl parent object managed by PContent.

So my question is.. why is the PNeckoParent singleton pointer null?  And how do I get a proper handle to it (can I just instantiate it and use it?)
Flags: needinfo?(hurley)
(Assignee)

Comment 4

7 months ago
Tried a fix suggested by mrbkap, and ran into this other crash.  Documenting it here so that it doesn't get lost:

Assertion failure: mRawPtr != 0 (You can't dereference a NULL nsCOMPtr with operator->().), at /Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/include/nsCOMPtr.h:747
#01: nsCOMPtr<nsILoadInfo>::operator->() const[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0x2ecbd4]
#02: mozilla::net::WebSocketChannelParent::RecvAsyncOpen(mozilla::ipc::OptionalURIParams const&, nsCString const&, unsigned long long const&, nsCString const&, bool const&, unsigned int const&, bool const&, unsigned int const&, bool const&, mozilla::net::Optio[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0x9676c3]
#03: mozilla::net::PWebSocketParent::OnMessageReceived(IPC::Message const&)[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0x10f2f93]
#04: mozilla::dom::PContentParent::OnMessageReceived(IPC::Message const&)[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0x1378945]
#05: mozilla::ipc::MessageChannel::DispatchAsyncMessage(IPC::Message const&)[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0xbabee9]
#06: mozilla::ipc::MessageChannel::DispatchMessage(IPC::Message&&)[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0xbaacd4]
#07: mozilla::ipc::MessageChannel::OnMaybeDequeueOne()[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0xba3c92]
#08: decltype(*(fp).*fp0(Get<>(fp1).PassAsParameter())) mozilla::detail::RunnableMethodArguments<>::applyImpl<mozilla::ipc::MessageChannel, bool (mozilla::ipc::MessageChannel::*)()>(mozilla::ipc::MessageChannel*, bool (mozilla::ipc::MessageChannel::*)(), mozil[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0xbd089a]
#09: _ZN7mozilla6detail23RunnableMethodArgumentsIJEE5applyINS_3ipc14MessageChannelEMS5_FbvEEEDTcl9applyImplfp_fp0_dtdefpT10mArgumentscvNS_13IndexSequenceIJEEE_EEEPT_T0_[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0xbd0809]
#10: mozilla::detail::RunnableMethodImpl<bool (mozilla::ipc::MessageChannel::*)(), false, true>::Run()[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0xbd05d2]
#11: mozilla::ipc::MessageChannel::RefCountedTask::Run()[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0xbcd662]
#12: mozilla::ipc::MessageChannel::DequeueTask::Run()[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0xbcd3f8]
#13: nsThread::ProcessNextEvent(bool, bool*)[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0x1c2c19]
#14: NS_ProcessPendingEvents(nsIThread*, unsigned int)[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0x24e5ec]
#15: nsBaseAppShell::NativeEventCallback()[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0x4ac09ae]
#16: nsAppShell::ProcessGeckoEvents(void*)[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0x4b58722]
#17: __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__[/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation +0xaa881]
#18: __CFRunLoopDoSources0[/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation +0x89fbc]
#19: __CFRunLoopRun[/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation +0x894df]
#20: CFRunLoopRunSpecific[/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation +0x88ed8]
#21: RunCurrentEventLoopInMode[/System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox +0x30935]
#22: ReceiveNextEventCommon[/System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox +0x3076f]
#23: _BlockUntilNextEventMatchingListInModeWithFilter[/System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox +0x305af]
#24: _DPSNextEvent[/System/Library/Frameworks/AppKit.framework/Versions/C/AppKit +0x48df6]
#25: -[NSApplication _nextEventMatchingEventMask:untilDate:inMode:dequeue:][/System/Library/Frameworks/AppKit.framework/Versions/C/AppKit +0x48226]
#26: -[GeckoNSApplication nextEventMatchingMask:untilDate:inMode:dequeue:][/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0x4b572b4]
#27: -[NSApplication run][/System/Library/Frameworks/AppKit.framework/Versions/C/AppKit +0x3cd80]
#28: nsAppShell::Run()[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0x4b590cc]
#29: nsAppStartup::Run()[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0x5f5172b]
#30: XREMain::XRE_mainRun()[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0x605a06d]
#31: XREMain::XRE_main(int, char**, nsXREAppData const*)[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0x605ab01]
#32: XRE_main[/Users/kannanvijayan/checkouts/mozilla-inbound/BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/XUL +0x605af62]
#33: do_main(int, char**, char**, nsIFile*)[/Users/kannanvijayan/checkouts/mozilla-inbound/./BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/firefox +0x1dea]
#34: main[/Users/kannanvijayan/checkouts/mozilla-inbound/./BUILD-DEBUG64/dist/NightlyDebug.app/Contents/MacOS/firefox +0x12f2]
Process 14144 stopped
* thread #1: tid = 0x2b5223, 0x0000000103121be4 XUL`nsCOMPtr<nsILoadInfo>::operator->(this=0x00007fff5fbf9908) const + 100 at nsCOMPtr.h:746, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x0000000103121be4 XUL`nsCOMPtr<nsILoadInfo>::operator->(this=0x00007fff5fbf9908) const + 100 at nsCOMPtr.h:746
   743
   744 	  T* operator->() const MOZ_NO_ADDREF_RELEASE_ON_RETURN
   745 	  {
-> 746 	    MOZ_ASSERT(mRawPtr != 0,
   747 	               "You can't dereference a NULL nsCOMPtr with operator->().");
   748 	    return get();
   749 	  }
--DOMWINDOW == 2 (0x126f2f800) [pid = 14149] [serial = 2] [outer = 0x0] [url = about:blank]
(Assignee)

Comment 5

7 months ago
Created attachment 8789422 [details] [diff] [review]
fix-websocket-crashes.patch
Attachment #8789422 - Flags: review?(tihuang)
(Assignee)

Comment 6

7 months ago
Comment on attachment 8789422 [details] [diff] [review]
fix-websocket-crashes.patch

Review of attachment 8789422 [details] [diff] [review]:
-----------------------------------------------------------------

Tim: Can you look over these changes to WebSocketChannelParent::RecvAsyncOpen ?  The last patch you pushed for that method broke some FlyWeb code.  Namely, when the WebSocketChannelParent is representing a server-side websocket (something that's only used in FlyWeb currently), then the incoming |loadInfoArgs| is void_t(), and produces a null |nsILoadInfo|, which crashes when you try to get origin attributes from it.  This fixes the case where the |nsILoadInfo| is null by using the old |GetAppId()| call in that circumstance.

Bill: Can I assume that calling PContentParent::ManagedPNeckoParent(nsTArray<PNeckoParent*>&) will always yield a single PNeckoParent pointer in the array?  Is there any reasonable circumstance where I'll get no PNeckoParents back?  Or more than one?
Attachment #8789422 - Flags: review?(wmccloskey)

Comment 7

7 months ago
Comment on attachment 8789422 [details] [diff] [review]
fix-websocket-crashes.patch

I can't review this since I'm not a Necko peer. Forwarding the review request.
Attachment #8789422 - Flags: review?(tihuang) → review?(honzab.moz)
Just adding a note for tracking purposes that this regressions seems to have stemmed from Bug 1291652.
See Also: → bug 1291652
Comment on attachment 8789422 [details] [diff] [review]
fix-websocket-crashes.patch

Review of attachment 8789422 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/websocket/WebSocketChannelParent.cpp
@@ +92,5 @@
> +    }
> +
> +    appId = attrs.mAppId;
> +  } else {
> +    appId = GetAppId();

If we don't have loadInfo from the child, is it really possible that we get something else than NECKO_UNKNOWN_APP_ID from GetAppId()? I.e. shouldn't we simply assign NECKO_UNKNOWN_APP_ID to appId?
Attachment #8789422 - Flags: feedback+
(Assignee)

Comment 10

7 months ago
I think that is probably reasonable, but I'm not sure what having that APP_ID would imply in the handling of the connection.  The loadInfo is empty in this case because this AsyncOpen is NOT a load initiated by the browser, but rather an incoming connection targeting the WebSocket server hosted by the browser.

So the APP id shouldn't matter in this case, as there's no actual cross-origin stuff to worry about, as no resources are being accessed by the page in question.

Anyway, I know of no circumstance where we would not get NECKO_UNKNOWN_APP_ID.  Thanks for the feedback.
Comment on attachment 8789422 [details] [diff] [review]
fix-websocket-crashes.patch

Review of attachment 8789422 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/flyweb/FlyWebPublishedServer.cpp
@@ +520,5 @@
>      mPendingRequests.Put(id, request);
>  
> +    nsTArray<PNeckoParent*> neckoParents;
> +    Manager()->ManagedPNeckoParent(neckoParents);
> +    if (neckoParents.Length() != 1) {

> Bill: Can I assume that calling PContentParent::ManagedPNeckoParent(nsTArray<PNeckoParent*>&)
> will always yield a single PNeckoParent pointer in the array?  Is there any reasonable
> circumstance where I'll get no PNeckoParents back?  Or more than one?

We will never have more than one. I guess it's possible that there can be zero if no one has ever used Necko. It seems to be lazily initialized as far as I can tell. So maybe this could happen really early in startup of the content process.

I guess I'm okay with doing this and seeing if we get crash reports back.

We do have this thing that you should use instead though:
http://searchfox.org/mozilla-central/rev/124c120ce9d7e8407c9ad330a9d6b1c4ad432637/ipc/glue/ProtocolUtils.h#592
Attachment #8789422 - Flags: review?(wmccloskey) → review+
Comment on attachment 8789422 [details] [diff] [review]
fix-websocket-crashes.patch

Review of attachment 8789422 [details] [diff] [review]:
-----------------------------------------------------------------

r+ for WebSocketChannelParent part if you add the comment

::: netwerk/protocol/websocket/WebSocketChannelParent.cpp
@@ +92,5 @@
> +    }
> +
> +    appId = attrs.mAppId;
> +  } else {
> +    appId = GetAppId();

Please add a comment explaining when we hit this code path.
Attachment #8789422 - Flags: review?(honzab.moz) → review+

Comment 13

7 months ago
Pushed by kvijayan@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/6f62ac027526
Make FlyWeb not use gNeckoParent.  Fix WebSocketChannel handling of incoming requests to server-websocket. r=billm r=michal
Flags: needinfo?(hurley)
Blocks: 1291652
See Also: bug 1291652
Duplicate of this bug: 1301757
Crash Signature: [@ mozilla::net::WebSocketChannelParent::RecvAsyncOpen]
status-firefox50: --- → unaffected
status-firefox51: --- → affected
Keywords: crash, regression

Comment 15

7 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/6f62ac027526
Status: NEW → RESOLVED
Last Resolved: 7 months ago
status-firefox51: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla51
Product: Toolkit → Core
Assignee: nobody → kvijayan
Version: unspecified → Trunk
You need to log in before you can comment on or make changes to this bug.