Closed Bug 1325918 Opened 8 years ago Closed 5 years ago

Crash in content processes in BackgroundChildImpl::ProcessingError with message "MsgDropped: Channel error: cannot send/recv"

Tracking

Status:

RESOLVED FIXED

Milestone:

mozilla80

Tracking Flags:

                 Tracking   Status
firefox-esr45    ---        wontfix
firefox51        ---        wontfix
firefox52        ---        wontfix
firefox-esr68    ---        wontfix
firefox-esr78    ---        wontfix
firefox53        ---        wontfix
firefox54        ---        wontfix
firefox78        ---        wontfix
firefox79        ---        wontfix
firefox80        ---        fixed

People

(Reporter: ting, Assigned: jld)

References

Details

(Keywords: crash, Whiteboard: [geckoview][fenix:p1])

Crash Data

Attachments

(1 file)

Bug 1325918 - Ignore MsgDropped errors in BackgroundChildImpl. 5 years ago Jed Davis [:jld] ⟨⏰|UTC-8⟩ ⟦he/him⟧ 47 bytes, text/x-phabricator-request

Ting-Yu Chou [:ting] (away)

Reporter

Description

•

8 years ago

This bug was filed from the Socorro interface and is report bp-3f1dcb0e-0558-4e15-9f3a-7b5992161226.

=============================================================

Top #7 of Nightly 20161225030206 on Windows, 5 crashes from 2 installations.

The error message is: "Channel error: cannot send/recv."

The crash stack:

xul.dll!mozilla::ipc::BackgroundChildImpl::ProcessingError(mozilla::ipc::HasResultCodes::Result aCode, const char * aReason) Line 141 C++
xul.dll!mozilla::ipc::MessageChannel::ReportConnectionError(const char * aChannelName, IPC::Message * aMsg) Line 2085 C++
xul.dll!mozilla::ipc::MessageChannel::Send(IPC::Message * aMsg) Line 789 C++
xul.dll!mozilla::ipc::PBackgroundChild::SendPServiceWorkerManagerConstructor(mozilla::dom::PServiceWorkerManagerChild * actor) Line 680 C++
xul.dll!mozilla::dom::workers::ServiceWorkerManager::ActorCreated(mozilla::ipc::PBackgroundChild * aActor) Line 1738 C++
xul.dll!`anonymous namespace'::ChildImpl::OpenChildProcessActorRunnable::Run() Line 1883 C++
xul.dll!nsThread::ProcessNextEvent(bool aMayWait, bool * aResult) Line 1219 C++
xul.dll!mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate * aDelegate) Line 96 C++
xul.dll!mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate * aDelegate) Line 301 C++
xul.dll!MessageLoop::RunHandler() Line 232 C++
xul.dll!MessageLoop::Run() Line 212 C++
xul.dll!nsBaseAppShell::Run() Line 158 C++
xul.dll!nsAppShell::Run() Line 262 C++
xul.dll!XRE_RunAppShell() Line 924 C++
xul.dll!mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate * aDelegate) Line 278 C++
xul.dll!MessageLoop::RunHandler() Line 232 C++
xul.dll!MessageLoop::Run() Line 212 C++
xul.dll!XRE_InitChildProcess(int aArgc, char * * aArgv, const XREChildData * aChildData) Line 760 C++

Hsin-Yi Tsai (she/her) [:hsinyi]

Comment 1

•

8 years ago

Hello Ben, this is a significant crash. Could you please help with it?

Flags: needinfo?(bhsu)

Priority: -- → P1

Ben Hsu [:HoPang]

Comment 2

•

8 years ago

Sure!

Assignee: nobody → bhsu

Flags: needinfo?(bhsu)

Ben Hsu [:HoPang]

Comment 3

•

8 years ago

After discussing with some colleagues, we think bug 1293284 is the root cause of this crash, since the crash takes place while the content process is initializing, which explains the very short uptime (~2 sec) here. However, I failed to reproduce this manually, and I am now trying to reproduce it automatically.

Updated

•

8 years ago

Depends on: 1293284

BugBot [:suhaib / :marco/ :calixte]

Comment 4

•

8 years ago

Crash volume for signature 'mozilla::ipc::BackgroundChildImpl::ProcessingError':
- nightly (version 54): 28 crashes from 2017-01-23.
- aurora (version 53): 773 crashes from 2017-01-23.
- beta (version 52): 38 crashes from 2017-01-23.
- release (version 51): 88 crashes from 2017-01-16.
- esr (version 45): 1 crash from 2016-08-10.

Crash volume on the last weeks (Week N is from 02-06 to 02-12):
W. N-1 W. N-2 W. N-3 W. N-4 W. N-5 W. N-6 W. N-7
- nightly 1 24
- aurora 390 2
- beta 31 3
- release 50 14 0
- esr 0 0 0 0 0 0 0

Affected platforms: Windows, Mac OS X, Linux

Crash rank on the last 7 days:
Browser Content Plugin
- nightly #55
- aurora #3
- beta #7311 #191
- release #340
- esr

status-firefox51: --- → affected

status-firefox52: --- → affected

status-firefox54: --- → affected

status-firefox-esr45: --- → affected

Nicholas Nethercote [inactive]

Comment 5

•

8 years ago

#3 Windows topcrash in Aurora 20170217004020.

Hsin-Yi Tsai (she/her) [:hsinyi]

Comment 6

•

8 years ago

Update: I talked with Kanru, who hopes to get to bug 1293284 shortly. That bug is believed to be the root cause of this crash pattern.

Nicholas Nethercote [inactive]

Comment 7

•

8 years ago

#3 Windows topcrash in Aurora 20170324004022, still.

Ben Kelly [:bkelly, not reviewing]

Comment 8

•

8 years ago

(In reply to Nicholas Nethercote [:njn] from comment #7)
> #3 Windows topcrash in Aurora 20170324004022, still.

If the theory is that these are due to bug 1293284, is it possible this is caused by enabling multi-e10s on aurora? Since SWM will always start when the content process is spawned, it is very likely to be on the stack if we do this early-shutdown-crash thing. Or is there something in the tree launching and killing content processes very quickly?

Nicholas Nethercote [inactive]

Comment 9

•

8 years ago

#3 Windows topcrash in Aurora 20170331004006, still!

Julien Cristau [:jcristau]

Comment 10

•

8 years ago

Mass wontfix for bugs affecting firefox 52.

status-firefox52: affected → wontfix

[:philipp]

Comment 11

•

8 years ago

Many of the crash comments link the issue to running Firefox within the external Sandboxie tool.

Hsin-Yi Tsai (she/her) [:hsinyi]

Updated

•

8 years ago

Priority: P1 → P2

Hsin-Yi Tsai (she/her) [:hsinyi]

Updated

•

7 years ago

Assignee: bhsu → nobody

Hsin-Yi Tsai (she/her) [:hsinyi]

Comment 12

•

7 years ago

Bug 1293284 was fixed. Let's see how the crash volume was affected.

Sylvestre Ledru [:Sylvestre]

Comment 13

•

6 years ago

Moving to p3 because no activity for at least 1 year(s). See https://github.com/mozilla/bug-handling/blob/master/policy/triage-bugzilla.md#how-do-you-triage for more information

Priority: P2 → P3

Emily Toop (:fluffyemily)

Comment 14

•

5 years ago

Hi Hsin-Yi. This issue has seen a recent uptick in crashes on 77 in Fenix Beta (1340 crashes in the last 7 days). Can someone please take a look?

Flags: needinfo?(htsai)

Emily Toop (:fluffyemily)

Updated

•

5 years ago

Whiteboard: [geckoview][fenix:p1]

Hsin-Yi Tsai (she/her) [:hsinyi]

Comment 15

•

5 years ago

(In reply to Emily Toop (:fluffyemily) from comment #14)

> Hi Hsin-Yi. This issue has seen a recent uptick in crashes on 77 in Fenix Beta (1340 crashes in the last 7 days). Can someone please take a look?

Deferring to Jens :)

Flags: needinfo?(htsai) → needinfo?(jstutte)

Andrew Sutherland [:asuth] (he/him)

Comment 16

•

5 years ago

The crash signature is not ServiceWorker specific now (if it ever was).

On Fenix, many of these crashes are coming from the "Socket Thread". There are some Firefox crashes that happen across more threads. They are all happening in processes that identify themselves as content processes and accordingly have no "IPDL Background" thread. The main loops of the content processes don't seem to indicate that they think they are in shutdown.

I presume something weird is happening with either intentional shutdown of the parent where it didn't wait for the child, or unintentional termination of the parent that leaves the child able to leave a crash report.

In any event, it seems like the BackgroundChild instances don't need to cause crashes now that we generally accept that messages can and will be dropped?

Severity: critical → --

Component: DOM: Service Workers → IPC

Flags: needinfo?(jld)

Priority: P3 → --

Summary: Crash in mozilla::ipc::BackgroundChildImpl::ProcessingError from PBackgroundChild::SendPServiceWorkerManagerConstructor → Crash in content processes in BackgroundChildImpl::ProcessingError with message "MsgDropped: Channel error: cannot send/recv"

Andrew Sutherland [:asuth] (he/him)

Updated

•

5 years ago

Flags: needinfo?(jstutte)

Emily Toop (:fluffyemily)

Comment 17

•

5 years ago

Any chance someone could take a look at this crash please? It's a fairly big Fenix crasher.

Jed Davis [:jld] ⟨⏰|UTC-8⟩ ⟦he/him⟧

Assignee

Comment 18

•

5 years ago

So, most toplevels either ignore MsgDropped or, in some cases like GMPChild, immediately exit the process. I assume the reasoning there is that a channel being unexpectedly closed or otherwise breaking means that the other process (in that case, the parent) has probably exited, so exiting is the most useful thing the child can do.

The weird thing about these crashes is that they're from our crash reporter, which runs in the parent process and therefore requires it to still exist, but the most obvious reason to get into this state is the parent process itself having crashed (or being killed, especially on Android because of the LMK). We've had bugs before where Android's own crash reporter was reporting content process crashes (deliberate) after the parent process had crashed (due to some other bug) and that at least made sense.

The crash message just tells us that the channel was in ChannelError state, which has no associated info about why. It could be from an actual I/O error (as in the case where the other process crashes)… or a channel being closed while it's still in the ChannelOpening state. PBackground channels are closed when the child thread that owns them exits, so I wonder if this is happening during shutdown, although from comment #16 that may not make sense.
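As a rough illustration of the two paths into the error state described above, here is a toy state machine. It is only a sketch: `ToyChannel`, `ToyChannelState`, and every method name are hypothetical and are not Gecko's actual MessageChannel API. It also assumes, for simplicity, that any send on a channel that is not connected counts as dropped.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: names are hypothetical and do not match Gecko's MessageChannel.
enum class ToyChannelState { Opening, Connected, Error, Closed };

class ToyChannel {
 public:
  ToyChannelState state() const { return mState; }
  std::size_t dropped() const { return mDropped; }

  // The handshake with the other process finished.
  void OnConnected() {
    if (mState == ToyChannelState::Opening) mState = ToyChannelState::Connected;
  }

  // An I/O failure, for example because the other process went away.
  void OnIOError() {
    if (mState != ToyChannelState::Closed) mState = ToyChannelState::Error;
  }

  // Closing before the channel finished opening is modelled as an error,
  // following the comment above; any other close is a normal close.
  void Close() {
    mState = (mState == ToyChannelState::Opening) ? ToyChannelState::Error
                                                  : ToyChannelState::Closed;
  }

  // Returns false and counts a dropped message unless the channel is connected.
  bool Send(const std::string& aMsg) {
    if (mState != ToyChannelState::Connected) {
      ++mDropped;
      return false;
    }
    mSent.push_back(aMsg);
    return true;
  }

 private:
  ToyChannelState mState = ToyChannelState::Opening;
  std::size_t mDropped = 0;
  std::vector<std::string> mSent;
};
```

In this model both an early `Close()` and an `OnIOError()` leave the channel in the same `Error` state, which is why the error message alone cannot distinguish the two causes.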

In any case I think we can just ignore the error, and if there's some user-visible problem underlying this we'll hopefully find out in some other, more actionable way.
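A minimal sketch of the policy choice discussed above, not the actual patch: `ToyResult`, `ToyAction`, and `DecideOnProcessingError` are hypothetical stand-ins for illustration. The sketch also assumes that every code other than a dropped message stays fatal, which this thread does not confirm.

```cpp
#include <cassert>

// Hypothetical stand-ins; the real result codes and handler live in Gecko's IPC code.
enum class ToyResult { MsgDropped, MsgNotKnown, MsgPayloadError, MsgValueError };
enum class ToyAction { Ignore, ExitProcess, Crash };

// Decide what a toplevel actor does when told about a processing error.
// aExitOnDrop models toplevels that quit when the channel breaks (the
// GMPChild-style behaviour mentioned above); otherwise drops are ignored.
ToyAction DecideOnProcessingError(ToyResult aCode, bool aExitOnDrop) {
  if (aCode == ToyResult::MsgDropped) {
    return aExitOnDrop ? ToyAction::ExitProcess : ToyAction::Ignore;
  }
  // Assumption in this sketch: all other error codes remain fatal.
  return ToyAction::Crash;
}
```

With `aExitOnDrop` set to false, a dropped message no longer produces a crash report, which is the behaviour proposed here for the background child.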

Assignee: nobody → jld

Flags: needinfo?(jld)

OS: Windows 8 → Unspecified

Hardware: x86 → Unspecified

Jed Davis [:jld] ⟨⏰|UTC-8⟩ ⟦he/him⟧

Assignee

Comment 19

•

5 years ago

Attached file: Bug 1325918 - Ignore MsgDropped errors in BackgroundChildImpl.

Pulsebot

Comment 20

•

5 years ago

Pushed by jedavis@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/9465a3d25cf9 Ignore MsgDropped errors in BackgroundChildImpl. r=nika

Dorel Luca [:dluca]

Comment 21

•

5 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/9465a3d25cf9

Status: NEW → RESOLVED

Closed: 5 years ago

status-firefox80: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla80

Ryan VanderMeulen [:RyanVM]

Updated

•

5 years ago

status-firefox51: affected → wontfix

status-firefox53: affected → wontfix

status-firefox54: affected → wontfix

status-firefox78: --- → wontfix

status-firefox79: --- → wontfix

status-firefox-esr45: affected → wontfix

status-firefox-esr68: --- → wontfix

status-firefox-esr78: --- → wontfix
