Closed Bug 1432409 Opened 7 years ago Closed 7 years ago

Crash in IPCError-content | PRemotePrintJob::Msg_StatusChange Route error: message sent to unknown actor ID

Categories

(Core :: Printing: Output, defect, P3)

59 Branch
All
Windows
defect

Tracking

()

VERIFIED FIXED
mozilla60
Tracking Status
firefox-esr52 --- unaffected
firefox58 --- unaffected
firefox59 + verified
firefox60 + verified

People

(Reporter: philipp, Assigned: jwatt)

References

Details

(Keywords: crash, regression, topcrash, Whiteboard: sb-)

Crash Data

Attachments

(2 files)

This bug was filed from the Socorro interface and is report bp-3e0cdfc6-6add-4435-8bb1-0af9d0180123. ============================================================= Top 10 frames of crashing thread: 0 xul.dll mozilla::dom::ContentChild::ProcessingError dom/ipc/ContentChild.cpp:2372 1 xul.dll mozilla::ipc::MessageChannel::MaybeHandleError ipc/glue/MessageChannel.cpp:2513 2 xul.dll mozilla::ipc::MessageChannel::DispatchMessageW ipc/glue/MessageChannel.cpp:2040 3 xul.dll mozilla::ipc::MessageChannel::RunMessage ipc/glue/MessageChannel.cpp:1886 4 xul.dll mozilla::ipc::MessageChannel::MessageTask::Run ipc/glue/MessageChannel.cpp:1919 5 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1040 6 xul.dll NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:517 7 xul.dll mozilla::ipc::MessagePump::Run ipc/glue/MessagePump.cpp:125 8 xul.dll mozilla::ipc::MessagePumpForChildProcess::Run ipc/glue/MessagePump.cpp:301 9 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:319 ============================================================= these printing related content crashes are newly showing up in firefox 59.
Whiteboard: sb?
We don't think this is related to sandboxing changes for printing.
Whiteboard: sb? → sb-
Crash Signature: [@ IPCError-content | PPrintProgressDialog::Msg_CancelledCurrentJob Route error: message sent to unknown actor ID] → [@ IPCError-content | PPrintProgressDialog::Msg_CancelledCurrentJob Route error: message sent to unknown actor ID] [@ IPCError-browser | PRemotePrintJob::Msg_StatusChange Route error: message sent to unknown actor ID]
Ah, I can produce this bug with extension using by saveAsPDF(). STR is similar to bug 1429707. 1) Install the addon which uses saveAsPDF() function. 2) Load content 3) Call saveAsPDF() function before loading content is finished. 4) Click print button on the native print dialog which is displayed after loading content is finished.
is it possible to get to a regression range with these steps? (i'm not able to reproduce it this way but maybe i'm missing something)
Flags: needinfo?(mantaroh)
I pinged mantaroh on IRC for more information, and I can reproduce with the Print to PDF add-on loading https://www.bbc.co.uk/ if I throttle my bandwidth to make the page load more slowly.
There seem to be at least four bugs contributing to this issue: 1. When a user tries to save-as-PDF during page load, we prompt them for a file name and location, generate the PDF and save it, all before the page has even loaded. 2. After the page loads we open a second print dialog asking the user if they want to print to PDF (if the user goes ahead, this is what we crash under). We should not be prompting the user with this particular dialog window. 3. The save-as-PDF print settings object from the second dialog has an empty file name. 4. When handling an error initializing the print due to the file name being empty, our e10s printing code destroys the RemotePrintJobChild then tries to send a notification to it (which is what is resulting in the crash).
Some more information on issue #4 from comment 5: We try to initialize a print job under: RemotePrintJobChild::InitializePrint nsDeviceContextSpecProxy::BeginDocument nsDeviceContext::BeginDocument nsPrintJob::SetupToPrintContent nsPrintJob::DocumentReadyForPrinting nsPrintJob::AfterNetworkPrint nsPrintJob::OnStateChange nsDocLoader::DoFireOnStateChange nsDocLoader::FireOnStateChange nsDocLoader::doStopURLLoad nsDocLoader::OnStopRequest mozilla::net::nsLoadGroup::RemoveRequest imgRequestProxy::RemoveFromLoadGroup() imgRequestProxy::OnLoadComplete mozilla::image::SyncNotifyInternal::__l30::<lambda> mozilla::image::ImageObserverNotifier::operator() mozilla::image::SyncNotifyInternal mozilla::image::ProgressTracker::SyncNotifyProgress::__l2::<lambda> mozilla::image::CopyOnWrite<mozilla::image::ObserverTable>::Read mozilla::image::ProgressTracker::SyncNotifyProgress mozilla::image::VectorImage::OnSVGDocumentLoaded() mozilla::image::SVGLoadEventListener::HandleEvent mozilla::EventListenerManager::HandleEventSubType mozilla::EventListenerManager::HandleEventInternal mozilla::EventListenerManager::HandleEvent mozilla::EventTargetChainItem::HandleEvent mozilla::EventTargetChainItem::HandleEventTargetChain mozilla::EventDispatcher::Dispatch mozilla::EventDispatcher::DispatchDOMEvent nsINode::DispatchEvent mozilla::AsyncEventDispatcher::Run() nsThread::ProcessNextEvent In the parent we fail to make a PrintTarget because we return NS_ERROR_FAILURE from: nsLocalFile::InitWithPath nsDeviceContextSpecWin::MakePrintTarget nsDeviceContext::InitForPrinting mozilla::layout::RemotePrintJobParent::InitializePrintDevice mozilla::layout::RemotePrintJobParent::RecvInitializePrint This happens because the passed path is empty, which is because when MakePrintTarget called mPrintSettings->GetToFileName(filename) the returned file name was empty. When the error is returned up to RemotePrintJobParent::RecvInitializePrint, that function does two things: 1. sends the error code to the child 2. then calls Send__delete__ to tell the child process to delete its RemotePrintJobChild A failure result is sent back to the child process resulting in mInitializationResult having the value NS_ERROR_FAILURE in the RemotePrintJobChild::InitializePrint call above, which returns that error up the stack. We return up the stack to nsPrintJob::AfterNetworkPrint which handles the failure by calling CleanupOnFailure, under which we dispatch an nsPrintCompletionEvent: nsPrintCompletionEvent::nsPrintCompletionEvent nsPrintJob::FirePrintCompletionEvent nsPrintJob::CleanupOnFailure nsPrintJob::AfterNetworkPrint After returning to the event loop we handle the second message from the parent process telling the child to delete the RemotePrintJobChild: mozilla::layout::RemotePrintJobChild::ActorDestroy mozilla::layout::PRemotePrintJobChild::DestroySubtree mozilla::layout::PRemotePrintJobChild::OnMessageReceived mozilla::dom::PContentChild::OnMessageReceived mozilla::dom::ContentChild::OnMessageReceived mozilla::ipc::MessageChannel::DispatchAsyncMessage mozilla::ipc::MessageChannel::DispatchMessageW mozilla::ipc::MessageChannel::RunMessage mozilla::ipc::MessageChannel::MessageTask::Run nsThread::ProcessNextEvent After returning to the event loop again we handle the nsPrintCompletionEvent and try to use the destroyed RemotePrintJobChild in nsDeviceContextSpecProxy::EndDocument, causing us to crash: MOZ_CrashOOL mozilla::ipc::LogicError mozilla::layout::PRemotePrintJob::Transition mozilla::layout::PRemotePrintJobChild::SendFinalizePrint() nsDeviceContextSpecProxy::EndDocument() nsDeviceContext::EndDocument() nsPrintData::~nsPrintData() [External Code] Annotated Frame nsPrintData::Release() mozilla::RefPtrTraits<nsPrintData>::Release RefPtr<nsPrintData>::ConstRemovingRefPtrTraits<nsPrintData>::Release RefPtr<nsPrintData>::assign_assuming_AddRef RefPtr<nsPrintData>::operator= nsPrintJob::Destroy() nsDocumentViewer::OnDonePrinting() nsPrintCompletionEvent::Run() mozilla::SchedulerGroup::Runnable::Run() nsThread::ProcessNextEvent needinfo'ing Bob since he may have some thoughts on this.
Flags: needinfo?(bobowencode)
(In reply to Jonathan Watt [:jwatt] (needinfo? me) from comment #6) > mozilla::layout::PRemotePrintJobChild::SendFinalizePrint() Although that seems to be the signature for bug 1429707. mantaroh, did you really get crashes with the signatures covered by this bug using the steps to reproduce you gave? Or do the steps to reproduce belong in bug 1429707?
Currently on beta this is the #4 content process crash and the #8 crash across all processes.
Keywords: topcrash
I'm going to put up a patch to fix the problem described in comment 6 by making nsDeviceContextSpecProxy's BeginDocument, EndDocument and AbortDocument no-ops. However, there's yet another issue. AfterNetworkPrint not only dispatches an nsPrintCompletionEvent, but we also end up creating an event under it to post a status change event to the parent: MessageLoop::PostTask_Helper MessageLoop::PostTask mozilla::ipc::ProcessLink::SendMessageW mozilla::ipc::MessageChannel::SendMessageToLink mozilla::ipc::MessageChannel::Send mozilla::layout::PRemotePrintJobChild::SendStatusChange mozilla::layout::RemotePrintJobChild::OnStatusChange nsPrintData::DoOnStatusChange nsPrintJob::FirePrintingErrorEvent nsPrintJob::CleanupOnFailure nsPrintJob::AfterNetworkPrint At the time RemotePrintJobChild::OnStatusChange is called mDestroyed is false so we don't detect the problem at that point. Once we try to process it we crash under the following stack though since there's nothing to dispatch to: mozilla::ipc::MessageChannel::MaybeHandleError mozilla::ipc::MessageChannel::DispatchAsyncMessage mozilla::ipc::MessageChannel::DispatchMessageW mozilla::ipc::MessageChannel::RunMessage mozilla::ipc::MessageChannel::MessageTask::Run nsThread::ProcessNextEvent Although we can't use mDestroyed to detect this scenario, we can check that mInitializedResult is a success value. I'll do that in a separate patch though.
And to correlate the patches with the signatures, part 1 should fix: [@ IPCError-content | PPrintProgressDialog::Msg_CancelledCurrentJob Route error: message sent to unknown actor ID] although for me I couldn't get a crash for cancelling, only for "OK"ing. Making both EndDocument and AbortDocument no-ops should prevent crashes in both cases though. Part 2 should fix: [@ IPCError-browser | PRemotePrintJob::Msg_StatusChange Route error: message sent to unknown actor ID]
Assignee: nobody → jwatt
Requesting tracking given the high crash rate.
Comment on attachment 8952264 [details] Bug 1432409 part 1 - Prevent nsDeviceContextSpecProxy using RemotePrintJobChild if initialization fails. https://reviewboard.mozilla.org/r/221504/#review227318 ::: commit-message-679f0:5 (Diff revision 1) > +waits for a reply. If the parent fails to initialize printing it will > +send back an error message followed immediately by a second message > +telling the child process to delete its RemotePrintJobParent. The error I think you mean "delete its RemotePrintJobChild" (not Parent), right? (based on comment 6) ::: commit-message-679f0:12 (Diff revision 1) > +telling the child process to delete its RemotePrintJobParent. The error > +message causes the nested event loop to terminate and blocks > +RemotePrintJobChild::InitializePrint. We then do various async things > +to clean up, some of which can try to post messages to the parent > +process's RemotePrintJobParent. This is a problem since the delete > +message is pending in the content process's event loop resulting in a s/content/child/ for consistency. (This is the only mention of "content process" here -- the rest of this message says "child process" & "parent process") ::: widget/nsDeviceContextSpecProxy.cpp:167 (Diff revision 1) > + if (mRemotePrintJob) { > - Unused << mRemotePrintJob->SendAbortPrint(NS_OK); > + Unused << mRemotePrintJob->SendAbortPrint(NS_OK); > + } > return NS_OK; > } > > NS_IMETHODIMP > nsDeviceContextSpecProxy::BeginPage() > { > mRecorder->OpenFD(mRemotePrintJob->GetNextPageFD()); Looks like you're guarding all of the usages of mRemotePrintJob, *except* for this one in nsDeviceContextSpecProxy::BeginPage() & the one in ::EndPage. Could you mention in the commit message why these ones are safe? (Maybe they're called later in the process after we're sure that the parent didn't prompt us to delete our mRemotePrintJob?)
Attachment #8952264 - Flags: review?(dholbert) → review+
Comment on attachment 8952265 [details] Bug 1432409 part 2 - Make RemotePrintJobChild::OnStatusChange no-op if initialization failed. https://reviewboard.mozilla.org/r/221506/#review227322 Seems reasonable. r=me
Attachment #8952265 - Flags: review?(dholbert) → review+
Thanks, Daniel. I'll fix up the comments and push.
Pushed by jwatt@jwatt.org: https://hg.mozilla.org/integration/mozilla-inbound/rev/bd244079d11a part 1 - Prevent nsDeviceContextSpecProxy using RemotePrintJobChild if initialization fails. r=dholbert https://hg.mozilla.org/integration/mozilla-inbound/rev/9cd1539c04f9 part 2 - Make RemotePrintJobChild::OnStatusChange no-op if initialization failed. r=dholbert
Blocks: 499025
Comment on attachment 8952264 [details] Bug 1432409 part 1 - Prevent nsDeviceContextSpecProxy using RemotePrintJobChild if initialization fails. Approval Request Comment [Feature/Bug causing the regression]: unknown [User impact if declined]: high beta crash rate will uplift to release [Is this code covered by automated tests?]: there are some printing tests [Has the fix been verified in Nightly?]: it's stuck on m-i (requesting approval early to try to make b12) [Needs manual test from QE? If yes, steps to reproduce]: probably useful - STR in comment 2 using https://www.bbc.co.uk/ [List of other uplifts needed for the feature/fix]: none [Is the change risky?]: low/medium [Why is the change risky/not risky?]: stops us trying to access a destroyed object and crashing. As best I can tell the rest of the code flow after that point should be safe. [String changes made/needed]: none
Attachment #8952264 - Flags: approval-mozilla-beta?
Attachment #8952265 - Flags: approval-mozilla-beta?
FWIW the changes made here may fix some of our other print crashes (the stacks for some other bugs seem related).
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
[ Triage 2017/02/20: P3 ]
Priority: -- → P3
Comment on attachment 8952264 [details] Bug 1432409 part 1 - Prevent nsDeviceContextSpecProxy using RemotePrintJobChild if initialization fails. Fix for a top crash in beta, let's take this for beta 12.
Attachment #8952264 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #8952265 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Crash Signature: [@ IPCError-content | PPrintProgressDialog::Msg_CancelledCurrentJob Route error: message sent to unknown actor ID] [@ IPCError-browser | PRemotePrintJob::Msg_StatusChange Route error: message sent to unknown actor ID] → [@ IPCError-content | PPrintProgressDialog::Msg_CancelledCurrentJob Route error: message sent to unknown actor ID] [@ IPCError-browser | PRemotePrintJob::Msg_StatusChange Route error: message sent to unknown actor ID] [@ mozilla::ipc::LogicError | mozil…
Flags: needinfo?(jwatt)
Flags: needinfo?(mantaroh)
Flags: qe-verify+
From the 59b12 reports so far, it looks like the patches here fixed the PRemotePrintJob::Msg_StatusChange crash but not the PPrintProgressDialog::Msg_CancelledCurrentJob crash. :-/
Hello, I've tried reproducing this issue but to no avail. The report here (https://goo.gl/TWrxeT) shows us that most of the crashes were reported on Windows 7 using a Beta 59.0b11 build and the STRs say that I should use an "addon which uses saveAsPDF() function". I've installed "Print to PDF" and tried saving the www.bbc.com page as a pdf while it was loading multiple times but couldn't reproduce the crash. Also tried on Windows 10 x64 on a an older 58.0a1 build. Mantaroh, can you please help to verify this issue?
Flags: needinfo?(mantaroh)
(In reply to Jonathan Watt [:jwatt] (needinfo? me) from comment #26) > From the 59b12 reports so far, it looks like the patches here fixed the > PRemotePrintJob::Msg_StatusChange crash but not the > PPrintProgressDialog::Msg_CancelledCurrentJob crash. :-/ i've filed bug 1441598 as a followup for the remaining signature.
Thanks!
Crash Signature: [@ IPCError-content | PPrintProgressDialog::Msg_CancelledCurrentJob Route error: message sent to unknown actor ID] [@ IPCError-browser | PRemotePrintJob::Msg_StatusChange Route error: message sent to unknown actor ID] [@ mozilla::ipc::LogicError | mozil… → [@ IPCError-browser | PRemotePrintJob::Msg_StatusChange Route error: message sent to unknown actor ID] [@ mozilla::ipc::LogicError | mozilla::layout::PRemotePrintJobChild::SendFinalizePrint]
Flags: needinfo?(jwatt)
Summary: Crash in IPCError-content | PPrintProgressDialog::Msg_CancelledCurrentJob Route error: message sent to unknown actor ID → Crash in IPCError-content | PRemotePrintJob::Msg_StatusChange Route error: message sent to unknown actor ID
(In reply to Alexandru Simonca, QA (:asimonca) from comment #27) > Hello, > I've tried reproducing this issue but to no avail. The report here > (https://goo.gl/TWrxeT) shows us that most of the crashes were reported on > Windows 7 using a Beta 59.0b11 build and the STRs say that I should use an > "addon which uses saveAsPDF() function". I've installed "Print to PDF" and > tried saving the www.bbc.com page as a pdf while it was loading multiple > times but couldn't reproduce the crash. Also tried on Windows 10 x64 on a an > older 58.0a1 build. > > Mantaroh, can you please help to verify this issue? Hi Alexandru, I can reproduce on my environment(win10 + Beta 59.0b11). STR: 1) Install the 'Print to PDF'.(https://addons.mozilla.org/en-US/firefox/addon/print-to-pdf-document/) 2) Go to the 'bbc.com' 3) Push 'Print to PDF' while loading the document and click the 'save' button of the dialog. 4) Push 'OK' button on the print dialog. (This dialog is shown after step 3)) Capture is as follow: https://drive.google.com/file/d/1L4VRhwOZlU-aUw6ftY8h_aYdlSXsPUyl/view?usp=sharing After updating the beta 59.0b13, I confirmed that this bug is fixed.(i.e. gecko doesn't crash after step 4)
Flags: needinfo?(mantaroh)
Hello, Based on Comment #30 and my own assessment of the issue and the 60.0a1 build I can say that this is VERIFIED FIXED in: - 59.0b13 - 60.0a1 (id: 20180227220155)
Status: RESOLVED → VERIFIED
Flags: qe-verify+
Flags: needinfo?(bobowencode)
See Also: → 1449120
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: