Closed Bug 1441598 Opened 7 years ago Closed 7 years ago

Crash in IPCError-content | PPrintProgressDialog::Msg_CancelledCurrentJob Route error: message sent to unknown actor ID

Categories

(Core :: Printing: Output, defect)

59 Branch
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla60
Tracking Status
relnote-firefox --- 59+
firefox-esr52 --- unaffected
firefox58 --- unaffected
firefox59 + verified
firefox60 + verified

People

(Reporter: philipp, Assigned: bobowen)

References

Details

(Keywords: crash, regression, topcrash)

Crash Data

Attachments

(2 files, 1 obsolete file)

This bug was filed from the Socorro interface and is report bp-41302054-6f8f-443c-afe3-eca7f0180224. ============================================================= Top 10 frames of crashing thread: 0 xul.dll mozilla::dom::ContentChild::ProcessingError dom/ipc/ContentChild.cpp:2373 1 xul.dll mozilla::ipc::MessageChannel::MaybeHandleError ipc/glue/MessageChannel.cpp:2513 2 xul.dll mozilla::ipc::MessageChannel::DispatchAsyncMessage ipc/glue/MessageChannel.cpp:2112 3 xul.dll mozilla::ipc::MessageChannel::DispatchMessageW ipc/glue/MessageChannel.cpp:2040 4 xul.dll mozilla::ipc::MessageChannel::RunMessage ipc/glue/MessageChannel.cpp:1886 5 xul.dll mozilla::ipc::MessageChannel::MessageTask::Run ipc/glue/MessageChannel.cpp:1919 6 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1040 7 xul.dll NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:517 8 xul.dll mozilla::ipc::MessagePump::Run ipc/glue/MessagePump.cpp:125 9 xul.dll mozilla::ipc::MessagePumpForChildProcess::Run ipc/glue/MessagePump.cpp:301 ============================================================= content crashes with this signature are regressing in firefox 59 on windows and linux. reports come with a generic MOZ_CRASH(Content child abort due to IPC error). we don't have much crash data available for 59.0a1 because crash reports that were submitted before december 26 got purged, but nightly build 20171130220131 is the earliest recorded crashing build. there weren't many printing related patches landing in the first weeks of the 59.0a1 cycle & because it looks related, i'd be suspicious of bug 1409971 as the source of the regression & i'm tentatively setting this bug as blocking it.
This is content crash #4 (crash #8 overall). Transferring over tracking flags from bug 1432409.
Keywords: topcrash
One of the comments says something like "Crash cancelling printing of Skarabee showcase poster" ("Plantage en voulant annuler une impression d'affiche vitrine sur Skarabee") and has the URL: https://www.skarabee.net/property/basic/basic.aspx
The problem here appears to be that we are relying on the Send__delete__ in ~PrintProgressDialogChild to prevent any more messages, but obviously we can still attempt to send them before the Recv__delete__ gets processed by the parent.
This is fairly high volume for beta so I expect it will show up more heavily on 59 release. I could still take a patch for the 59 RC build if you are able to come up with a fix, but I won't consider it a 59 blocker.
I can reproduce this fairly consistently when cancelling a one page print, just as it completes. (There is also a problem where if you cancel very early you break printing for that content process, but that is a different bug.) I have what I believe to be a safe change which seems to fix the issue for me. It may not fix all instances of this race, but I think a more comprehensive fix would carry quite a lot of risk on its own.
Assignee: nobody → bobowencode
Status: NEW → ASSIGNED
Comment on attachment 8955603 [details] [diff] [review] Don't try and send messages to PrintProgressDialogChild when printing is complete Review of attachment 8955603 [details] [diff] [review]: ----------------------------------------------------------------- ::: toolkit/components/printingui/nsPrintProgress.cpp @@ +181,5 @@ > NS_IMETHODIMP nsPrintProgress::OnStateChange(nsIWebProgress *aWebProgress, nsIRequest *aRequest, uint32_t aStateFlags, nsresult aStatus) > { > + if (XRE_IsE10sParentProcess() && > + aStateFlags & nsIWebProgressListener::STATE_STOP) { > + MOZ_ASSERT(m_observer); The assert seems a bit redundant since we'll deref and crash on the very next line. It would be helpful to have a comment here noting that we're notifying our PrintProgressDialogParent to avoid a race. @@ +202,5 @@ > > NS_IMETHODIMP nsPrintProgress::OnProgressChange(nsIWebProgress *aWebProgress, nsIRequest *aRequest, int32_t aCurSelfProgress, int32_t aMaxSelfProgress, int32_t aCurTotalProgress, int32_t aMaxTotalProgress) > { > + if (XRE_IsE10sParentProcess() && aCurSelfProgress >= aMaxSelfProgress) { > + MOZ_ASSERT(m_observer); Ditto.
Attachment #8955603 - Flags: review?(jwatt) → review+
(In reply to Bob Owen (:bobowen) from comment #5) ... > (There is also a problem where if you cancel very early you break printing > for that content process, but that is a different bug.) Another try push, this time with a fix for this bug as well, because I just realised what must be going on. This bug is in current release and probably has been for long time. https://treeherder.mozilla.org/#/jobs?repo=try&revision=e6c7ce7721b4ac3d3c7a7c0b6ef957c234c86b8c
Apparently there's a JS implemented nsIObserver for the progress dialog as well: https://treeherder.mozilla.org/#/jobs?repo=try&revision=dd0c5830fc816fc5399ec982720304c5d4f3925e
Attachment #8955603 - Attachment is obsolete: true
Attachment #8955930 - Flags: review?(jwatt) → review+
Attachment #8955931 - Flags: review?(jwatt) → review+
Pushed by bobowencode@gmail.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/b47e93718912 Don't try and send messages to PrintProgressDialogChild when printing is complete. r=jwatt https://hg.mozilla.org/integration/mozilla-inbound/rev/0036938db4fe Part 2: When cancelling a print make sure the DialogOpened message has been sent. r=jwatt
Pushed by bobowencode@gmail.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/bb6db24a20dd Backed out changeset 0036938db4fe - for accidental printf. https://hg.mozilla.org/integration/mozilla-inbound/rev/2219e0ec9d4f Part 2: When cancelling a print make sure the DialogOpened message has been sent. r=jwatt
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
Comment on attachment 8955930 [details] [diff] [review] Don't try and send messages to PrintProgressDialogChild when printing is complete Approval Request Comment [Feature/Bug causing the regression]: Bug 1409971 [User impact if declined]: User's trying to cancel a print around the time it completes will continue to get intermittent crashes. [Is this code covered by automated tests?]: Some of the code paths must be covered, because of failures on try pushes. These are probably around print preview, I don't think there are any currently running for actual printing. [Has the fix been verified in Nightly?]: I can no longer reproduce in latest Nightly. [Needs manual test from QE? If yes, steps to reproduce]: Cancelling a one page print as it completes, for example of http://example.com, reliably reproduces for me. Some general printing and print preview manual regression tests should be run. [List of other uplifts needed for the feature/fix]: Other patch from bug, fixes a related issue. [Is the change risky?]: Somewhat. [Why is the change risky/not risky?]: The change is relatively simple, but the printing code is fairly fragile. [String changes made/needed]: None.
Attachment #8955930 - Flags: approval-mozilla-beta?
Comment on attachment 8955931 [details] [diff] [review] Part 2: When cancelling a print make sure the DialogOpened message has been sent See comment 17.
Attachment #8955931 - Flags: approval-mozilla-beta?
Comment on attachment 8955930 [details] [diff] [review] Don't try and send messages to PrintProgressDialogChild when printing is complete This would need to land on m-r at this point and would end up in our RC2 build if taken. Which seems awful late :(. If nothing else, maybe we can keep it on the radar for dot-release consideration.
Attachment #8955930 - Flags: approval-mozilla-beta? → approval-mozilla-release?
Attachment #8955931 - Flags: approval-mozilla-beta? → approval-mozilla-release?
Comment on attachment 8955930 [details] [diff] [review] Don't try and send messages to PrintProgressDialogChild when printing is complete This is just a little too late for 59 for me. Since we have already built RC1, this could make it to RC2 but we don't have time for RC3 in case of needing backout or seeing regressions. I'll keep this open for 59 and keep it tracked for a potential dot release where we would have a little more breathing room.
Attachment #8955930 - Flags: approval-mozilla-release? → approval-mozilla-release-
Attachment #8955931 - Flags: approval-mozilla-release? → approval-mozilla-release-
Flags: qe-verify+
I reproduced the crash on Windows 10 x64 and Windows 7 x64 using Nightly 60.0a1(2018-03-01). Verified fixed on Windows 10 x64 and Windows 7 x64 using the latest Nightly (2018-03-11). I didn't manage to reproduce the crash on Ubuntu 16.04 x64 using Nightly 60.0a1(2018-03-01).
Status: RESOLVED → VERIFIED
Comment on attachment 8955930 [details] [diff] [review] Don't try and send messages to PrintProgressDialogChild when printing is complete This is the #10 content process crash on 59 so far. It's been landed for a couple weeks now with no known regressions and has been verified as fixing the crash. I think we should consider this as a dot release ride-along.
Attachment #8955930 - Flags: approval-mozilla-release- → approval-mozilla-release?
Attachment #8955931 - Flags: approval-mozilla-release- → approval-mozilla-release?
Comment on attachment 8955930 [details] [diff] [review] Don't try and send messages to PrintProgressDialogChild when printing is complete High-volume crash fix, verified on nightly, A+
Attachment #8955930 - Flags: approval-mozilla-release? → approval-mozilla-release+
Attachment #8955931 - Flags: approval-mozilla-release? → approval-mozilla-release+
I reproduced the crash on Windows 10 x64 and Windows 7 x64 using Firefox 59.0.1 Verified fixed on Windows 10 x64 and Windows 7 x64 using Firefox 59.0.2. I didn't manage to reproduce the crash on Ubuntu 16.04 x64 on Firefox 59.0.1.
Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: