Closed Bug 1163735 Opened 10 years ago Closed 10 years ago

crash in mozalloc_abort(char const* const) | NS_DebugBreak | mozilla::net::PNeckoChild::Write(mozilla::dom::PBrowserChild*, IPC::Message*, bool)

Categories

(Core :: DOM: Core & HTML, defect)

37 Branch
Unspecified
Windows NT
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla43
Tracking Status
e10s ? ---
firefox40 - wontfix
firefox41 --- affected
firefox42 + fixed
firefox43 + fixed

People

(Reporter: lizzard, Assigned: billm)

Details

(Keywords: crash, topcrash-win)

Crash Data

Attachments

(2 files)

This bug was filed from the Socorro interface and is report bp-b9d1044e-e0a1-4120-ab33-a3c9c2150511.
=============================================================

This is currently the #7 topcrash in Firefox 40 and is showing up on the explosiveness report for 40. There are a few crashes in 39 but at a much lower volume.

Crashing thread:

0 mozglue.dll mozalloc_abort(char const* const) memory/mozalloc/mozalloc_abort.cpp
1 xul.dll NS_DebugBreak xpcom/base/nsDebugImpl.cpp
2 xul.dll mozilla::net::PNeckoChild::Write(mozilla::dom::PBrowserChild*, IPC::Message*, bool) obj-firefox/ipc/ipdl/PNeckoChild.cpp
3 xul.dll mozilla::net::PNeckoChild::Write(mozilla::dom::PBrowserOrId const&, IPC::Message*) obj-firefox/ipc/ipdl/PNeckoChild.cpp
4 xul.dll mozilla::net::PNeckoChild::SendPHttpChannelConstructor(mozilla::net::PHttpChannelChild*, mozilla::dom::PBrowserOrId const&, IPC::SerializedLoadContext const&, mozilla::net::HttpChannelCreationArgs const&) obj-firefox/ipc/ipdl/PNeckoChild.cpp
5 xul.dll mozilla::net::HttpChannelChild::ContinueAsyncOpen() netwerk/protocol/http/HttpChannelChild.cpp
6 xul.dll mozilla::net::HttpChannelChild::AsyncOpen(nsIStreamListener*, nsISupports*) netwerk/protocol/http/HttpChannelChild.cpp
7 xul.dll nsXMLHttpRequest::Send(nsIVariant*, mozilla::dom::Nullable<nsXMLHttpRequest::RequestBody> const&) dom/base/nsXMLHttpRequest.cpp
8 xul.dll nsXMLHttpRequest::Send(nsIVariant*) dom/base/nsXMLHttpRequest.cpp
9 xul.dll `anonymous namespace'::SendRunnable::MainThreadRun() dom/workers/XMLHttpRequest.cpp
10 xul.dll `anonymous namespace'::WorkerThreadProxySyncRunnable::Run() dom/workers/XMLHttpRequest.cpp
11 xul.dll nsThread::ProcessNextEvent(bool, bool*) xpcom/threads/nsThread.cpp
12 xul.dll mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) ipc/glue/MessagePump.cpp
13 xul.dll mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate*) ipc/glue/MessagePump.cpp
14 xul.dll MessageLoop::RunHandler() ipc/chromium/src/base/message_loop.cc
Adding a tracking flag for FF40 as KaiRo mentioned this as a top issue on dev edition/Aurora.
Jason, can you help on this (or assign this to someone)? This is a top crash.
Flags: needinfo?(jduell.mcbugs)
This is deep in the guts of IPDL serialization. That doesn't mean it's not somehow a necko bug, but looking over recent changes (since the bug started showing up, in 6/8/15 builds) I don't see any necko changes to PNecko or related IPDL files in /netwerk. So I'm punting this over to :billm to see if he knows of any IPC changes that might be the cause here.
Component: Networking → IPC
Flags: needinfo?(jduell.mcbugs) → needinfo?(wmccloskey)
Liz, does this have a nightly regression range?

The two asserts in PNeckoChild::Write are "NULL actor value passed to non-nullable param" and "actor has been |delete|d". So I'd bet that HttpChannelChild::ContinueAsyncOpen is passing a null or deleted PBrowserChild here:
http://hg.mozilla.org/mozilla-central/annotate/cef11c3e86c3/netwerk/protocol/http/HttpChannelChild.cpp#l1690

I don't think this is an IPC issue. There may be an ordering issue which exposed this bug more recently. Bumping back to necko.

Liz, is there a clear nightly regression range?
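For readers following along, the two failure modes named above can be sketched roughly as follows. This is a simplified illustration, not the generated IPDL code: FakeActor, WriteCheck and CheckActorForWrite are hypothetical names, and the real PNeckoChild::Write aborts the process rather than returning a status.

```cpp
#include <cassert>

// Hypothetical stand-in for an IPDL actor such as PBrowserChild.
// The real actor tracks its state differently; a flag is assumed here.
struct FakeActor {
  int id;
  bool deleted;  // set once the actor has been torn down
};

// Outcome of the pre-serialization check (the real code aborts instead).
enum class WriteCheck {
  Ok,
  NullNonNullable,  // "NULL actor value passed to non-nullable param"
  Deleted           // "actor has been |delete|d"
};

// Sketch of the validation done before an actor pointer is written
// into a message.
WriteCheck CheckActorForWrite(const FakeActor* aActor, bool aNullable) {
  if (!aActor) {
    // A null actor is only acceptable when the parameter is nullable.
    return aNullable ? WriteCheck::Ok : WriteCheck::NullNonNullable;
  }
  if (aActor->deleted) {
    // A non-null pointer to a torn-down actor is never acceptable.
    return WriteCheck::Deleted;
  }
  return WriteCheck::Ok;
}
```

Note that in this sketch a nullable parameter only excuses a null pointer; a pointer to an already-deleted actor still fails the check.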
Component: IPC → Networking
Flags: needinfo?(wmccloskey) → needinfo?(lhenry)
Florin, can your team look for a regression range? I'm out sick today and am not sure how long I'll be out.
Flags: needinfo?(lhenry) → needinfo?(florin.mezei)
I don't think we can provide a clear regression range without reliable steps to reproduce. Socorro indicates this may have started somewhere on March 31st/April 1st (https://hg.mozilla.org/mozilla-central/pushloghtml?startdate=2015-03-30&enddate=2015-04-02), but I'm not very sure of this and there doesn't seem to be a clear point at which this spiked significantly.
Flags: needinfo?(florin.mezei)
With the exception of 1 report, in the last month I only see reports on Aurora and Nightly. Can this be e10s related or related to some other config that is enabled only on these branches? Given that this doesn't seem to affect Beta, I am marking this as wontfix for 40 and dropping tracking.
[Tracking Requested - why for this release]: we should track this for 42 as long as it remains an e10s rollout candidate.
Thanks Jim. It is no longer a top crash but let's track it for now.
(In reply to Liz Henry (:lizzard) from comment #9)
> Thanks Jim. It is no longer a top crash but let's track it for now.

It's baaaack! :) #1 top crasher today in 43.
tracking-e10s: --- → ?
Flags: needinfo?(wmccloskey)
Based on this IRC conversation with billm, I'm reclassifying this as a DOM bug. The issue does not appear to be with getting a null PBrowser--the "union PBrowserOrId" type explicitly handles nullable PBrowsers. Rather, billm thinks that Manager->GetBrowserOrId() is returning a pointer to a deleted PBrowser:
https://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/HttpChannelChild.cpp?from=HttpChannelChild.cpp&offset=0#1711

Neither of us knows how that could be happening, but we're hoping smaug may have ideas.
Flags: needinfo?(bugs)
Hmm, I think smaug is on PTO. Maybe khuey?
Flags: needinfo?(khuey)
Component: Networking → DOM
I think the bug is in WorkerPrivate::InterfaceRequestor. It maintains a list of weak pointers to TabChilds, but the TabChild object still being alive is a distinct state from the TabChild actor still being alive, and it doesn't check the latter. GetAnyLiveTabChild needs to query IsDestroyed before handing out a TabChild. I'll review a patch.
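A minimal sketch of the distinction described above, using standard C++ smart pointers in place of Gecko's weak references. FakeTabChild, FakeInterfaceRequestor and DestroyActor are hypothetical stand-ins, and this is not the attached patch; it only illustrates checking actor liveness in addition to object liveness.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for TabChild: the C++ object can stay alive
// after its IPC actor has been destroyed.
class FakeTabChild {
public:
  void DestroyActor() { mDestroyed = true; }  // models actor teardown
  bool IsDestroyed() const { return mDestroyed; }

private:
  bool mDestroyed = false;
};

// Hypothetical stand-in for WorkerPrivate::InterfaceRequestor, which
// holds weak pointers to tab children.
class FakeInterfaceRequestor {
public:
  void AddTabChild(const std::shared_ptr<FakeTabChild>& aTab) {
    mTabChildList.push_back(aTab);
  }

  // Returns the most recently added tab child whose object AND actor
  // are both still alive, pruning stale entries along the way.
  std::shared_ptr<FakeTabChild> GetAnyLiveTabChild() {
    while (!mTabChildList.empty()) {
      std::shared_ptr<FakeTabChild> tab = mTabChildList.back().lock();
      // Checking only `tab` would reproduce the bug: the object being
      // alive does not mean the actor is.
      if (tab && !tab->IsDestroyed()) {
        return tab;
      }
      mTabChildList.pop_back();
    }
    return nullptr;
  }

private:
  std::vector<std::weak_ptr<FakeTabChild>> mTabChildList;
};
```

With two tab children registered, destroying the newer one's actor makes the lookup fall back to the older one, and it yields null once no live entry remains.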
Flags: needinfo?(khuey)
Attached patch: patch
Thanks Kyle. Sorry about the ugly cast. It just didn't feel right to add this method to nsITabChild. https://treeherder.mozilla.org/#/jobs?repo=try&revision=bf0ac582f4b6
Flags: needinfo?(wmccloskey)
Attachment #8649596 - Flags: review?(khuey)
Comment on attachment 8649596
patch

Review of attachment 8649596:
-----------------------------------------------------------------

r=me with a better commit message
Attachment #8649596 - Flags: review?(khuey) → review+
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla43
hey bill, can we uplift this to aurora?
Assignee: nobody → wmccloskey
Flags: needinfo?(wmccloskey)
Comment on attachment 8649596
patch

Approval Request Comment
[Feature/regressing bug #]: unknown
[User impact if declined]: topcrash with e10s
[Describe test coverage new/current, TreeHerder]: on m-c for a few days
[Risks and why]: Low, but there is a chance of a regression.
[String/UUID change made/needed]: none
Flags: needinfo?(wmccloskey)
Attachment #8649596 - Flags: approval-mozilla-aurora?
Comment on attachment 8649596
patch

Looks good to me, let's uplift to aurora.
Attachment #8649596 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Component: DOM → DOM: Core & HTML