Closed Bug 1354200 Opened 7 years ago Closed 5 years ago

crash in mozilla::ipc::MessageChannel::OnChannelErrorFromLink() on asan builds

Categories

(Core :: IPC, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla66
Tracking Status
geckoview64 --- fixed
firefox55 --- wontfix
firefox64 --- wontfix
firefox65 --- fixed
firefox66 --- fixed

People

(Reporter: hanno, Assigned: snorp)

References

Details

(Whiteboard: [geckoview:p1])

Crash Data

Attachments

(5 files)

I'm getting occasional crashes (with crash dumps) with the asan builds of mozilla. (asan build "target.tar.bz2" downloaded on 2017-04-04, sha256 69926deabcd04a4a2e6030001dae365bca5985ab83979b42bf1675b969e2377f)

The crashes usually happen if I want to close the browser during heavy activity (I was experimenting with HTML files including lots of iframes). However I could also reproduce them most reliably by just starting firefox and killing it shortly after, e.g.:
./firefox & sleep 3; killall firefox

(This doesn't depend on "killall", it also sometimes works with clicking the close symbol, but needs the right timing)

I got different stack traces, but the by far most common one indicates a crash in  mozilla::ipc::MessageChannel::OnChannelErrorFromLink().

I got a different crash in mozilla::ipc::BackgroundChildImpl::ProcessingError(), however I was not able to reproduce that and only got it once.

A third crash is probably irrelevant: It happens in RunWatchdog(), looking at the code it is just some kind of timeout where it kills the application if it supposedly hangs. I just mention it for completeness.

I'll attach all three stack traces.
These are all various intentional crashes when we detect an error. The OnChannelError ones are here:
  MOZ_CRASH("Aborting on channel error.");
The nsTerminator one is here:
  MOZ_CRASH("Shutdown too long, probably frozen, causing a crash.");
Group: core-security
Priority: -- → P2
Depends on: 1397419
Blocks: 1397419
No longer depends on: 1397419
Crash Signature: mozilla::ipc::MessageChannel::OnChannelErrorFromLink
Crash Signature: mozilla::ipc::MessageChannel::OnChannelErrorFromLink → [@ mozilla::ipc::MessageChannel::OnChannelErrorFromLink ]
QA Contact: jmathies
QA Contact: jmathies
jld - tsmith mentioned that they hit this in ASan builds at the all hands, this is one of a number of similar bugs that I picked.

It looks like this [1] was originally put in to try and clean up orphaned plugin processes (and I guess subsequently all children).
I wonder if MOZ_CRASH is really what we want to do here or whether something like ProcessChild::QuickExit would be more appropriate.

[1] https://searchfox.org/mozilla-central/rev/fd62b95c187a40b328d9e7fd9d848833a6942b57/ipc/glue/MessageChannel.cpp#2527
Flags: needinfo?(jld)
It does look like that was more or less the intent in bug 778866.  Assuming nobody's trying to get information out of the crashes, I agree this could just be a QuickExit, but I think we should log something in case the exit happens when it shouldn't (cf. bug 1368036).
Flags: needinfo?(jld)
Android mercilessly kills the parent in low memory situations, and we
don't want that to trigger a crash when the child is abruptly
disconnected.
Whiteboard: [geckoview]
Assignee: nobody → snorp
Pushed by jwillcox@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/36802e2a3490
Exit instead of MOZ_CRASH on channel error in child process r=jld

Comment on attachment 9035725 [details]
Bug 1354200 - Exit instead of MOZ_CRASH on channel error in child process r=jld

[Beta/Release Uplift Approval Request]

Feature/Bug causing the regression: N/A

User impact if declined: Frequent crashes when Firefox Focus is in the background.

Is this code covered by automated tests?: No

Has the fix been verified in Nightly?: No

Needs manual test from QE?: No

If yes, steps to reproduce:

List of other uplifts needed: None

Risk to taking this patch: Low

Why is the change risky/not risky? (and alternatives if risky): Very small and simply avoids an intentional crash.

String changes made/needed: None

Attachment #9035725 - Flags: approval-mozilla-release?
Attachment #9035725 - Flags: approval-mozilla-beta?
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla66

(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #10)

Attachment #9035725 [details] - Flags: approval-mozilla-release?

James, are you requesting uplift to just the GECKOVIEW_64_RELBRANCH branch? Or also Fennec 64 and Firefox 64 desktop on the Release channel?

IIUC, this fix won't affect Fennec (because it doesn't use e10s), but it could affect/benefit Firefox desktop.

Flags: needinfo?(snorp)
Whiteboard: [geckoview] → [geckoview:p1]

Comment on attachment 9035725 [details]
Bug 1354200 - Exit instead of MOZ_CRASH on channel error in child process r=jld

Moving uplift request to gv64, I don't think we should be taking this on the default branch.

Attachment #9035725 - Flags: approval-mozilla-release?
Attachment #9035725 - Flags: approval-mozilla-release-
Attachment #9035725 - Flags: approval-mozilla-geckoview64?

Thanks. Julien can we poke someone to approve and land on geckoview64? That's the only current vehicle we have to confirm the fix, and there is very low risk.

Flags: needinfo?(snorp) → needinfo?(jcristau)

Comment on attachment 9035725 [details]
Bug 1354200 - Exit instead of MOZ_CRASH on channel error in child process r=jld

You can poke me :)

Approved for beta65 and gv64, will land on the latter in a bit.

Flags: needinfo?(jcristau)
Attachment #9035725 - Flags: approval-mozilla-geckoview64?
Attachment #9035725 - Flags: approval-mozilla-geckoview64+
Attachment #9035725 - Flags: approval-mozilla-beta?
Attachment #9035725 - Flags: approval-mozilla-beta+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: