Closed Bug 1435495 Opened 6 years ago Closed 3 years ago

Crash in nsPrintJob::DonePrintingPages

Categories

(Core :: Printing: Output, defect, P3)

59 Branch
x86
Windows
defect

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox-esr52 --- unaffected
firefox58 --- unaffected
firefox59 --- wontfix
firefox60 --- wontfix
firefox61 --- wontfix
firefox62 --- fix-optional
firefox63 --- wontfix
firefox64 --- fix-optional

People

(Reporter: philipp, Unassigned)

References

(Depends on 1 open bug)

Details

(Keywords: crash, regression, Whiteboard: qa-not-actionable)

Crash Data

This bug was filed from the Socorro interface and is
report bp-f79c86c0-5dd4-40ea-9e4c-f53040180203.
=============================================================

Top 10 frames of crashing thread:

0 xul.dll nsPrintJob::DonePrintingPages layout/printing/nsPrintJob.cpp:3089
1 xul.dll nsPrintJob::DocumentReadyForPrinting layout/printing/nsPrintJob.cpp:1579
2 xul.dll nsPrintJob::FinishPrintPreview layout/printing/nsPrintJob.cpp:3438
3 xul.dll nsPrintJob::AfterNetworkPrint layout/printing/nsPrintJob.cpp:2042
4 xul.dll nsPrintJob::OnStateChange layout/printing/nsPrintJob.cpp:2077
5 xul.dll nsDocLoader::DoFireOnStateChange uriloader/base/nsDocLoader.cpp:1319
6 xul.dll nsDocLoader::FireOnStateChange uriloader/base/nsDocLoader.cpp:1282
7 xul.dll nsDocLoader::doStopURLLoad uriloader/base/nsDocLoader.cpp:819
8 xul.dll nsDocLoader::OnStopRequest uriloader/base/nsDocLoader.cpp:615
9 xul.dll mozilla::net::nsLoadGroup::RemoveRequest netwerk/base/nsLoadGroup.cpp:629

=============================================================

this crash signature is newly appearing on 59.0b builds - perhaps related to bug 1399787.
it's a content crash on win32 builds with rather low crash volume.
(In reply to [:philipp] from comment #0)
> this crash signature is newly appearing on 59.0b builds - perhaps related to
> bug 1399787.

I should hope not. PDFium isn't even built on non-Nightly. Are you just speculating, or is there some reason you suspect that bug? There have been a good number of other printing related changes and I'd think one of those (or something tangentially related) is more likely.
Current crash counts:

59.0b4  2   4.7%
59.0b5  26  60.5%
59.0b6  14  32.6%
59.0b7  1   2.3%

So something that landed in b4/b5?
i was just looking at the top frame in the crash stack: https://hg.mozilla.org/releases/mozilla-beta/annotate/64737c752ac4/layout/printing/nsPrintJob.cpp#l3089 (it appeared to have been last touched by bug 1399787 - no further consideration going into that statement)

bugs landing between b4 and b5 would have been: https://mzl.la/2E80pdt
(In reply to Jonathan Watt [:jwatt] (needinfo? me) from comment #2)
> Current crash counts:
> 
> 59.0b4  2   4.7%
> 59.0b5  26  60.5%
> 59.0b6  14  32.6%
> 59.0b7  1   2.3%
> 
> So something that landed in b4/b5?

That's possibly more just the earlier Betas being dev-edition only (I think).
(In reply to Jonathan Watt [:jwatt] (needinfo? me) from comment #2)
> Current crash counts:
> 
> 59.0b4  2   4.7%
> 59.0b5  26  60.5%
> 59.0b6  14  32.6%
> 59.0b7  1   2.3%

These numbers are wrong. It seems that when you search for the crashes over the last three months, the table that you're presented with showing the crashes per beta is actually only 7 days worth of data. :/

A better search is:

https://crash-stats.mozilla.com/search/?signature=~nsPrintJob%3A%3ADonePrintingPages&date=%3E%3D2018-01-01T11%3A18%3A00.000Z&date=%3C2018-02-08T11%3A18%3A25.000Z&_sort=-date&_facets=signature&_facets=version&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-version

which currently shows:

  59.0b2  1
  59.0b4  20
  59.0b5  36
  59.0b6  14
  59.0b7  3
(In reply to Bob Owen (:bobowen) from comment #4)
> That's possibly more just the earlier Betas being dev-edition only (I think).

jcristau	 tells me that b1 and b2 were dev edition only, that b3 had only a fraction of beta users, and that b4 was the first beta for the entire beta user population. So, yes, it looks like this doesn't point to any particular uplift window unfortunately.
The b2 crash has buildid 20180117222144 FWIW.
None of the crashes have a useful 'url' field (all about:printpreview, about:newtab or blank), none of them have a useful user comment, and none of them were sent with an email address to contact the user. :/
I've been looking at this on IRC with Bob, and it looks like bug 1436519 and this bug are the same thing, nsDeviceContext::UnregisterPageDoneCallback being the signature that shows up on x86 and nsPrintJob::DonePrintingPages being the signature that shows up on x64 (due to link time optimization inlining the UnregisterPageDoneCallback call there). He did the actual debugging and will comment on the details shortly but I'll dup bug 1436519 over in the meantime.
Crash Signature: [@ nsPrintJob::DonePrintingPages] → [@ nsPrintJob::DonePrintingPages] [@ nsDeviceContext::UnregisterPageDoneCallback]
(In reply to [:philipp] from comment #3)
> i was just looking at the top frame in the crash stack:
> https://hg.mozilla.org/releases/mozilla-beta/annotate/64737c752ac4/layout/
> printing/nsPrintJob.cpp#l3089 (it appeared to have been last touched by bug
> 1399787 - no further consideration going into that statement)

Oh, and I meant to agree, this does look suspicious. (And thanks for clarifying.)
This is caused by nsPrintJob::OnStateChange [1] getting called during the nested event loop that is spun while we're doing IPC with the parent to get the print settings via ShowPrintDialog [2].

nsPrintJob::OnStateChange gets down to a call to nsPrintJob::SetupToPrintContent, which must be failing presumably because the nsPrintJob is not in the correct state yet.
So, nsPrintJob::DonePrintingPages gets called and we hit [4].

However mPrintDC is null because it is due to be created just after the ShowPrintDialog [2] in the code running above the nested event loop.

In the x64 case (bug 1436519), we crash trying to access nsDeviceContext::mPrintTarget (with an offset of 0x48) at [5].
In the x86 case (this bug) the function has been in-lined, so we get the nsPrintJob::DonePrintingPages signature. This time with an offset of 0x30:

  printData->mPrintDC->UnregisterPageDoneCallback();
...
  mov         eax,dword ptr [esi+8]  
* mov         ecx,dword ptr [eax+30h]  *
  test        ecx,ecx  
  je          nsPrintJob::DonePrintingPages+0E5h (67C7757Fh)  
  add         ecx,28h  
  call        std::function<bool __cdecl(void)>::~function<bool __cdecl(void)> (665B2CD0h)  


The eventual real fix is probably to get rid of the nested event loop (bug 1432806), which while it meant we could get e10s landed, has caused us a number of issues.

Aside from that I'm not entirely sure why the nsPrintJob has been installed as an nsIWebProgressListener for nsPrintJob::OnStateChange to be called during [2], because that shouldn't happen until [6].
I think maybe the nsPrintJob gets re-used for Print Preview, so it could be that it has not been unhooked from a previous call.
Maybe we could ensure it is unhooked at the start of DoCommonPrint or at least before we spin any nested event loops.

[1] https://hg.mozilla.org/releases/mozilla-beta/annotate/1a24837f9ed2/layout/printing/nsPrintJob.cpp#l2058
[2] https://hg.mozilla.org/releases/mozilla-beta/annotate/1a24837f9ed2/layout/printing/nsPrintJob.cpp#l863
[3] https://hg.mozilla.org/releases/mozilla-beta/annotate/1a24837f9ed2/layout/printing/nsPrintJob.cpp#l1575
[4] https://hg.mozilla.org/releases/mozilla-beta/annotate/0c5a115449a3/layout/printing/nsPrintJob.cpp#l3090
[5] https://hg.mozilla.org/releases/mozilla-beta/annotate/1a24837f9ed2/gfx/src/nsDeviceContext.cpp#l748
[6] https://hg.mozilla.org/releases/mozilla-beta/annotate/1a24837f9ed2/layout/printing/nsPrintJob.cpp#l995
Depends on: 1432806
Flags: needinfo?(jwatt)
[ Triage 2017/02/20: P3 ]
Priority: -- → P3
A few crashes per week are showing up still in beta 62.
Too late to fix in 63, but we could still take a patch in 65 and potentially, 64 beta.
Whiteboard: qa-not-actionable

nsPrintJob::DonePrintingPages no longer exists.

Status: NEW → RESOLVED
Closed: 3 years ago
Flags: needinfo?(jwatt)
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.