Closed Bug 1258609 Opened 5 years ago Closed 5 years ago

[e10s] Crash while printing via parent with pdf.js, about:memory shows large heap unclassified

Categories

(Core :: Printing: Output, defect)

defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla49
Tracking Status
e10s - ---
firefox49 --- fixed

People

(Reporter: mayankleoboy1, Assigned: bobowen)

References

Details

(Keywords: crash, hang, memory-footprint, Whiteboard: sbwc1)

Attachments

(6 files)

Attached file aboutsupport.txt
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:48.0) Gecko/20100101 Firefox/48.0
Build ID: 20160321030217

Steps to reproduce:

1. Create a new Nightly profile
2. Go to https://drive.google.com/folderview?id=0B3YMBdhaJ7qjQzFUTkVBQmZPMjQ&usp=drive_web&tid=0B3YMBdhaJ7qjUDNPNS1TX3JVVlU#list

3. Click on the first document "button". The doc should load as a pdf with inbuilt pdf.js
4. When the doc loads, click on the print button on top. The doc will open in a new tab, and a print dialogue will open.
5. Select "print to pdf", "print to XPS", or "print to foxit pdf"
6. Start a print


Actual results:

memory increases exponentially for both the content process and the firefox.exe process.
Eventually get an OOM/crash


Expected results:

not so.
Not sure if it's related to pdf.js, graphics or printing.
https://crash-stats.mozilla.com/report/index/47bcc7f8-f673-4136-90b5-57c6e2160322
https://crash-stats.mozilla.com/report/index/147b0f51-05b6-420f-8dc1-1e6672160322
https://crash-stats.mozilla.com/report/index/58b6ff50-4c55-4798-9bcf-8cffb2160322

Happens only with e10s.
Summary: Crash while "printing to pdf" a Google doc with pdf.js → [e10s] Crash while "printing to pdf" a Google doc with pdf.js
Attached file memory-report.json.gz
Summary: [e10s] Crash while "printing to pdf" a Google doc with pdf.js → [e10s] Crash while "printing to pdf" a Google doc with pdf.js, about:memory shows large heap unclassified
Reproduced. Huge memory consumption with e10s.
Status: UNCONFIRMED → NEW
tracking-e10s: --- → ?
Ever confirmed: true
Keywords: footprint
Nightly 48.0a1 hangs when printing with Microsoft XPS Document Writer.
Stack (using crashfirefox.exe): bp-9abb8b61-6c28-4d56-9a83-9edec2160322
Severity: normal → critical
Keywords: crash, hang
Blocks: 1156742
No longer blocks: 899758
Thanks for reporting.

I tried this with a fairly large PDF print of part of the HTML spec and saw similarly soaring memory.

It looks like printing from a PDF eats a lot of memory even in release, compared to printing the same thing directly from HTML.

The changes in bug 1156742 that record the print in the child and replay in the parent are probably just exacerbating this to the point that it is causing an out of memory crash.

On release most of the memory seems to get released after the print, but a certain part isn't.

I'll try and work out what we're holding on to during and after the print.
Blocks: e10s-perf
Priority: -- → P2
See Also: → 1259032
The problem here is that the pages are getting printed as single (8MB) images, which aren't released (in parent and child) until the end of the print.
So, if you print something big enough you get an OOM crash.

During the printing of each page a snapshot is taken using a DrawTarget that isn't our recording one and hasn't been created by our recording one (see attachment for stack).

This means that we can't record the snapshot and replay that in the parent, so instead when this surface is passed to the FillRect of our recording DrawTarget we extract the data (see attachment):

width=1190, height=1684, bytes per pixel=4 - giving our 8015840 byte surface
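
As a rough check of that figure, and of how quickly it adds up, here is a minimal sketch. The dimensions are the ones quoted above; the page counts and the "one copy in the child plus one in the parent" model are illustrative assumptions, not measurements from this bug.

```python
# Dimensions quoted in the comment above.
WIDTH_PX = 1190
HEIGHT_PX = 1684
BYTES_PER_PIXEL = 4


def surface_bytes(width: int, height: int, bytes_per_pixel: int) -> int:
    """Size in bytes of one uncompressed raster surface."""
    return width * height * bytes_per_pixel


def retained_bytes(pages: int, copies_per_page: int = 2) -> int:
    """Bytes held if every page surface lives until the print ends.

    copies_per_page=2 is an assumption: one copy in the child process
    and one in the parent, both kept until the job finishes.
    """
    return pages * copies_per_page * surface_bytes(
        WIDTH_PX, HEIGHT_PX, BYTES_PER_PIXEL
    )


if __name__ == "__main__":
    per_page = surface_bytes(WIDTH_PX, HEIGHT_PX, BYTES_PER_PIXEL)
    print(f"one page surface: {per_page} bytes")
    # Page counts below are made up for illustration.
    for pages in (10, 100, 500):
        mib = retained_bytes(pages) / 2**20
        print(f"{pages} pages: about {mib:.0f} MiB across both processes")
```

Because nothing is freed per page, the total grows linearly with the page count, which fits a large enough document ending in an OOM.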

Even though they are only used once they aren't released in the child until the end of the print when they are cycle collected (see attachment).
In the parent we have to store them because they haven't been destroyed in the child, and we release them when the RemotePrintJobParent and PrintTranslator are destroyed.

Interestingly, we don't seem to get any leaked memory like you do when not printing via the parent (bug 1259032).

Even if we could release these surfaces per page, we'd still have a problem because the resulting print is just a series of images not a proper document.

I think that this doesn't happen when printing in process, because the snapshot is backed by some sort of cairo recording surface that has recorded the canvas draw instructions and when this gets passed to DrawTargetCairo::FillRect it does the right thing.

dholbert - not sure if you can shed any light on this, we need the DrawTarget used for the canvas rendering and snapshot to be our recording DrawTarget for the print (or one created from it).
Either that or change it so we don't print to some sort of canvas first.
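
To make the distinction concrete, here is a toy model of the two paths. It is not Gecko code: `RecordingTarget`, `PlainTarget`, `SimilarTarget` and `Snapshot` are hypothetical names that only mimic the behaviour described above, where a snapshot from an unrelated target has to be copied into the recording as raw pixel data, while one from a target created by the recorder can be referred to by id.

```python
from dataclasses import dataclass, field
from itertools import count

_ids = count(1)


@dataclass
class Snapshot:
    """Toy snapshot; `owner` is the recorder that produced it, if any."""
    pixels: bytes
    owner: object = None
    ref_id: int = field(default_factory=lambda: next(_ids))


class PlainTarget:
    """A draw target the recorder knows nothing about."""

    def __init__(self, width, height, bpp=4):
        self.size = width * height * bpp

    def snapshot(self):
        return Snapshot(pixels=bytes(self.size))


class SimilarTarget(PlainTarget):
    """A target created by the recorder, so its drawing is already recorded."""

    def __init__(self, recorder, width, height, bpp=4):
        super().__init__(width, height, bpp)
        self.recorder = recorder

    def snapshot(self):
        # Simplification: no pixel payload, the recorder can replay the drawing.
        return Snapshot(pixels=b"", owner=self.recorder)


class RecordingTarget:
    """Records draw calls as events to be replayed elsewhere."""

    def __init__(self):
        self.events = []

    def create_similar(self, width, height, bpp=4):
        return SimilarTarget(self, width, height, bpp)

    def fill_rect(self, snapshot):
        if snapshot.owner is self:
            # Known surface: record a small reference.
            self.events.append(("fill_rect_ref", snapshot.ref_id))
        else:
            # Foreign surface: the raw pixels must go into the recording.
            self.events.append(("fill_rect_data", snapshot.pixels))

    def recorded_bytes(self):
        return sum(len(arg) for _, arg in self.events if isinstance(arg, bytes))
```

In this sketch, filling from `PlainTarget(1190, 1684).snapshot()` adds the full pixel buffer to the event list, whereas filling from `recorder.create_similar(1190, 1684).snapshot()` adds only a reference.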

(My terminology here is probably a bit awry as I don't know too much about some of this. :-) )
Flags: needinfo?(dholbert)
Summary: [e10s] Crash while "printing to pdf" a Google doc with pdf.js, about:memory shows large heap unclassified → [e10s] Crash while printing via parent with pdf.js, about:memory shows large heap unclassified
Whiteboard: [sb?]
Whiteboard: [sb?] → sbwc1
No longer blocks: e10s-perf
Priority: P2 → --
(In reply to Bob Owen (:bobowen) (PTO back 1st Apr) from comment #11)
> Even though they are only used once they aren't released in the
> child until the end of the print when they are cycle collected (see attachment).

This may not help per your latter point about "Even if we could release these surfaces per page" -- but throwing it out there just in case: if there's some point in the code where we know we're done with these objects (I'm not sure) -- i.e. around when we drop our last reference beyond the cycle -- maybe we could *actively* call something that would break the cycle & allow everything to be freed immediately?  e.g. we could add some "Disconnect" method to whatever object we have a reference to, and we'd call that method just before we drop our reference. (Or even "ChildDisconnect" / "ParentDisconnect" if we want to wait until both processes are done with it.)

Anyway, maybe this is a moot point.
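
For illustration only, the "Disconnect" idea above can be sketched outside Gecko. This uses CPython's reference counting and cyclic garbage collector as a stand-in for the cycle collector; `PrintJob`, `PageSurface` and `disconnect` are hypothetical names, and the demo turns the cyclic collector off so the difference is visible.

```python
import gc
import weakref


class PageSurface:
    """Hypothetical stand-in for a large per-page surface."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.job = None


class PrintJob:
    """Hypothetical owner that forms a reference cycle with its surface."""

    def __init__(self, size):
        self.surface = PageSurface(size)
        self.surface.job = self  # job -> surface -> job

    def disconnect(self):
        """Break the cycle so plain reference counting can free the surface."""
        if self.surface is not None:
            self.surface.job = None
            self.surface = None


def surface_alive_after_drop(call_disconnect):
    """Return True if the surface outlives our last reference to the job."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the cyclic collector from running during the demo
    try:
        job = PrintJob(1024)
        probe = weakref.ref(job.surface)  # observes the surface without owning it
        if call_disconnect:
            job.disconnect()
        del job  # drop our last reference from outside the cycle
        return probe() is not None
    finally:
        if was_enabled:
            gc.enable()
        gc.collect()  # clean up whatever the cycle left behind
```

Without the explicit `disconnect()` call the surface stays alive until a collector pass; with it, the surface is freed as soon as the call returns.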

> dholbert - not sure if you can shed any light on this, we need the
> DrawTarget used for the canvas rendering and snapshot to be our recording
> DrawTarget for the print (or one created from it).

I don't know that I can shed any light on this, offhand -- sorry -- I'm fuzzy on what happens at this level of the printing code (with DrawTarget/surface handling / cairo).  Your suggestion sounds reasonable, though, if you're correct about this being the way we avoid ballooning in single-process mode.
Flags: needinfo?(dholbert)
Got back to this and added logging for all DrawTarget creation and snapshots and found the culprit at [1].

Fortunately this is called from the printing code, so I think the solution is a lot easier than I had feared.
The sensible thing seems to be creating our own similar DrawTarget and initializing with that instead of a similar surface.

This will make it easier if we want to move to a different backend for printing and means we can get rid of gfxContext::CurrentSurface.

Here's a try push.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1c380ea041dcc24b7bdde49f70510cf2bd50c7a9

[1] https://dxr.mozilla.org/mozilla-central/rev/ae7413abfa4d3954a6a4ce7c1613a7100f367f9a/dom/canvas/CanvasRenderingContext2D.cpp#1631
Assignee: nobody → bobowen.code
Status: NEW → ASSIGNED
MozReview-Commit-ID: JNQ9GWvDUSq
Attachment #8743735 - Flags: review?(jmuizelaar)
Attachment #8743735 - Flags: review?(jmuizelaar) → review+
https://hg.mozilla.org/mozilla-central/rev/c992422247b7
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla49
Looks like a regression from 47. This might be worth uplifting to 48 at least. Bob, what do you think?
Flags: needinfo?(bobowen.code)
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #17)
> Looks like a regression from 47. This might be worth uplifting to 48 at
> least. Bob, what do you think?

Printing via the parent is only enabled on Nightly at the moment as it is required for sandboxing.
So, no uplift required.
Flags: needinfo?(bobowen.code)
Depends on: 1308259
Depends on: 1310165
No longer depends on: 1310165
Depends on: 1342395