959925 - pdf.js: too many copies of images

Assignee

Description

•

10 years ago

Decoding and painting an image in pdf.js involves lots of copies of the data.

(a) pdf.worker.js is a JS worker. It first decodes the colour data from the PDF
    into an RGB array (24 bits per pixel), and the opacity data into an A array
    (8 bits per pixel). These two combined are effectively copy 1.

(b) pdf.worker.js combines the RGB and A arrays into a new RGBA typed array,
    which has 32 bits per pixel. This is copy 2. Copy 1 is now dead, and will
    be reclaimed by the next GC.

(c) pdf.worker.js transfers copy 2 to the main pdf.js thread. (This is a
    transfer, not a copy, possible because it's a typed array.)

(d) pdf.js does |d = createImageData()|, which ends up in
    mozilla::dom::CreateImageData(), which allocates a Uint8ClampedArray. This
    is copy 3. pdf.js copies the typed arrays (copy 2) into the ImageData (copy
    3). Copy 2 is now dead, and will be reclaimed by the next GC.

(e) pdf.js does |putImageData(d, 0, 0)|, which ends up in
    mozilla::dom::CanvasRenderingContext2D::PutImageData_explicit.
    
(f) PutImageData_explicit() creates a new gfxImageSurface() and copies in the
    data. This is copy 4, and it is short-lived.

(g) PutImageData_explicit() then calls CreateSourceSurfaceFromData(), creating
    a cairo surface. This is copy 5, and it is short-lived. Copy 3 is now dead,
    and will be reclaimed by the next GC(?)

Each copy can easily be multiple MB, and in the worst case (which is probably
common) we have five copies of the data alive at once.

So, ways to improve it.

- In steps (d) and (e) we can split the image up into smaller chunks, and then
  create/put those chunks one at a time. This will make copies 3, 4 and 5 much
  smaller.

- In steps (a) and (b) it is possible to decode the data directly into the RGBA
  array, which would merge copies 1 and 2. However, it's not easy, due to the
  various kinds of colour encodings and the possibility of images being
  resized.

If we did both of those, we end up with one full-size copy, and three copies of
each chunk, but two of the copies of each chunk would be very short-lived.

Nicholas Nethercote [inactive]

Assignee

Comment 1

•

10 years ago

Hmm, typically you get bad memory consumption when scrolling quickly through a scanned document, i.e. when we're dealing with many pages in quick succession. No individual page is a problem, it's the handling of many pages that adds up; in particular, the longer-lived copies add up quickly. So chunking actually won't help much -- copies 4 and 5 are so short-lived they don't matter much, and copy 3 just gets split up into lots of smaller pieces, all of which hang around just as long as a single larger array would. (And in some ways they're worse.)

However, since an ImageData is basically just a wrapper around a Uint8ClampedArray, I wonder if it's possible to avoid making copy 3 by making copy 2 an ImageData object. Is an ImageData transferable from a worker? Alternatively, can you construct an ImageData from an existing Uint8ClampedArray?

Nicholas Nethercote [inactive]

Assignee

Comment 2

•

10 years ago

> copy 3 just gets split up into lots of smaller pieces

Actually, that's not right, because a single ImageData object can be reused for each piece.

> Is an ImageData transferable from a
> worker? Alternatively, can you construct an ImageData from an existing
> Uint8ClampedArray?

AFAICT, the answer is "no" in both cases :(

Masatoshi Kimura [:emk]

Comment 3

•

10 years ago

(In reply to Nicholas Nethercote [:njn] from comment #2)
> > copy 3 just gets split up into lots of smaller pieces
> 
> Actually, that's not right, because a single ImageData object can be reused
> for each piece.
> 
> > Is an ImageData transferable from a
> > worker? Alternatively, can you construct an ImageData from an existing
> > Uint8ClampedArray?
> 
> AFAICT, the answer is "no" in both cases :(

worker.postMessage(imageData, [imageData.data.buffer]); should work.

Nicholas Nethercote [inactive]

Assignee

Comment 4

•

10 years ago

> worker.postMessage(imageData, [imageData.data.buffer]); should work.

That will transfer the array buffer, but then we need to be able to create a new ImageData from an existing ArrayBuffer without doing copying, and that isn't possible.

Masatoshi Kimura [:emk]

Comment 5

•

10 years ago

Oh, we don't support ImageData constructor yet :(

Masatoshi Kimura [:emk]

Comment 6

•

10 years ago

Attached file imagedata_worker.html — Details

This code works on Chrome but doesn't on Firefox :(
Looks like we break imageData.data.length when we passed around the imageData.

Masatoshi Kimura [:emk]

Updated

•

10 years ago

Depends on: 959957

Masatoshi Kimura [:emk]

Updated

•

10 years ago

Depends on: 959958

Boris Zbarsky [:bzbarsky]

Comment 7

•

10 years ago

The imagedata_worker testcase seems to work just fine for me in a Mac 2014-01-11 nightly....

Masatoshi Kimura [:emk]

Comment 8

•

10 years ago

Ah, I tested it with Firefox 26. Indeed it worked in Nightly. Sorry for the confusion.

Masatoshi Kimura [:emk]

Comment 9

•

10 years ago

(In reply to Nicholas Nethercote [:njn] from comment #4)
> > worker.postMessage(imageData, [imageData.data.buffer]); should work.
> 
> That will transfer the array buffer, but then we need to be able to create a
> new ImageData from an existing ArrayBuffer without doing copying, and that
> isn't possible.

So we could pass around ImageData objects to workaround the absence of the ImageData constructor.

Nicholas Nethercote [inactive]

Assignee

Comment 10

•

10 years ago

> So we could pass around ImageData objects to workaround the absence of the
> ImageData constructor.

Ah, so we can transfer ImageData objects:

  worker.postMessage(imageData, [imageData.data.buffer]);

This means "copy the |imageData|, except transfer the |imageData.data.buffer| property". Nice.

However, if I instead take the approach of calling createImageData/putImageData on chunks of the pixel data, then I don't need to transfer the ImageData. And the chunk approach (a) also causes copy 4 and copy 5 to be smaller, and (b) doesn't rely on features that browsers might not implement reliably. So I will pursue the chunk approach. But thanks for the information!

Nicholas Nethercote [inactive]

Assignee

Comment 12

•

10 years ago

Here are three PDFs I've been using for testing. They're all scanned documents,
and so rely entirely on images.
- http://www.pdf-archive.com/2013/09/30/file2/file2.pdf
- http://pbadupws.nrc.gov/docs/ML0909/ML090920467.pdf
- http://ridl.cfd.rit.edu/products/manuals/Agilent/signal%20analyzer/Supplemental.pdf

Nicholas Nethercote [inactive]

Assignee

Comment 13

•

10 years ago

Attached patch Do createImageData/putImageData in chunks, to save memory. — Details — Splinter Review

This patch implements the chunking of the createImageData/putImageData. Each
chunk is 16 rows of the image. Referring to comment 0, it greatly reduces the
size of copies 3, 4 and 5. Of these, copy 3 is the most important, because it's
a JS allocation and so lives until the next GC (i.e. quite some time).

For example, if we have a 2000x2000 image (not untypical, and smaller than
many), a full copy will take 2000*2000*4 = 16,000,000 bytes. In comparison,
each chunk is only 2000*16*4 = 128,000 bytes, which is 125x smaller.

On a 3 page document (file2.pdf from comment 12), this reduces peak physical
memory when scrolling through the entire document from ~315 MiB to ~265 MiB.
The savings on bigger documents will be larger.

It also, believe it or not, is marginally faster than the old code. I measured
the time taken for the putBinaryImageData() function to execute. In a browser
that supports set() and subarray(), it improved the time from ~70ms to ~50ms.
In a browser that doesn't support those functions, the time was about the same.

Yury: if you are happy with this patch, would you mind again handling the pull
request on the github side, and testing? And please feel free to do any
additional performance testing.

Attachment #8360785 - Flags: review?(ydelendik)

Nicholas Nethercote [inactive]

Assignee

Updated

•

10 years ago

Assignee: nobody → n.nethercote

Status: NEW → ASSIGNED

Nicholas Nethercote [inactive]

Assignee

Updated

•

10 years ago

No longer depends on: 959958, 959957

Nicholas Nethercote [inactive]

Assignee

Comment 14

•

10 years ago

Attached patch Don't create the opacity buffer for images that lack a mask. — Details — Splinter Review

This patch avoids creating the opacity (A) array for images that don't have a
mask (which is the common case for scanned documents) by writing the opacity
values directly into the RGBA array. This avoids one quarter of copy 1 (from
comment 0).

I renamed |buf| to make it clear which buffer was the A buffer and which was
the RGBA one.

Attachment #8360874 - Flags: review?(ydelendik)

Yury Delendik (:yury)

Comment 15

•

10 years ago

Comment on attachment 8360874 [details] [diff] [review]
Don't create the opacity buffer for images that lack a mask.

Review of attachment 8360874 [details] [diff] [review]:
-----------------------------------------------------------------

Unable to run tests. Please submit as a pull request to the https://github.com/mozilla/pdf.js with review items addressed.

::: browser/extensions/pdfjs/content/build/pdf.worker.js
@@ +28178,3 @@
>        var smask = this.smask;
>        var mask = this.mask;
> +      var aBuf;

Unneeded renaming of the buf variable

@@ +28224,5 @@
> +      }
> +
> +      if (aBuf) {
> +        for (var i = 0, ii = width * actualHeight; i < ii; ++i) {
> +          rgbaBuf[4*i + 3] = aBuf[i];

No need to multiply by 4 here. Introduce e.g. j that will be incremented by 4 and started from 3.

Attachment #8360874 - Flags: review?(ydelendik) → feedback+

Yury Delendik (:yury)

Comment 16

•

10 years ago

Comment on attachment 8360785 [details] [diff] [review]
Do createImageData/putImageData in chunks, to save memory.

Review of attachment 8360785 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good. Unable to run tests. Please submit as a pull request to the https://github.com/mozilla/pdf.js for testing.

Attachment #8360785 - Flags: feedback+

Yury Delendik (:yury)

Updated

•

10 years ago

Attachment #8360785 - Flags: review?(ydelendik)

Nicholas Nethercote [inactive]

Assignee

Comment 17

•

10 years ago

> Unable to run tests.

What does that mean?

I setup a git clone and did the testing as described at https://github.com/mozilla/pdf.js/wiki/Contributing. It seems like the testing coverage for images isn't very good -- for the RGB changes I'm having to change getRgbBuffer for every sub-class of ColorSpace, and I have |throw| statements in the ones I haven't managed to test manually (which is most of them) and none of the test are hitting them. Am I missing something?

Nicholas Nethercote [inactive]

Assignee

Comment 18

•

10 years ago

> > +      var aBuf;
> 
> Unneeded renaming of the buf variable

As I mentioned in comment 14, we now have two buffers in that function: the RGBA buffer, and the A buffer. I'd like it to be clear which is which.

Yury Delendik (:yury)

Comment 19

•

10 years ago

(In reply to Nicholas Nethercote [:njn] from comment #18)
 > > Unneeded renaming of the buf variable
> 
> As I mentioned in comment 14, we now have two buffers in that function: the
> RGBA buffer, and the A buffer. I'd like it to be clear which is which.

Oh, I see 'a' stands for 'alpha' (thought it's an article)

(In reply to Nicholas Nethercote [:njn] from comment #17)
> I setup a git clone and did the testing as described at
> https://github.com/mozilla/pdf.js/wiki/Contributing. It seems like the
> testing coverage for images isn't very good -- for the RGB changes I'm
> having to change getRgbBuffer for every sub-class of ColorSpace, and I have
> |throw| statements in the ones I haven't managed to test manually (which is
> most of them) and none of the test are hitting them. Am I missing something?

Not sure. We have tests cases for the most of color spaces. Probably external PDF files or ref images are missing. Could you submit a pull request(s)?

Nicholas Nethercote [inactive]

Assignee

Comment 20

•

10 years ago

I did the chunking and skip-RGB-and-A approaches as pull requests:
https://github.com/mozilla/pdf.js/pull/4138
https://github.com/mozilla/pdf.js/pull/4139

Nicholas Nethercote [inactive]

Assignee

Comment 21

•

10 years ago

(In reply to Nicholas Nethercote [:njn] from comment #20)
> I did the chunking and skip-RGB-and-A approaches as pull requests:
> https://github.com/mozilla/pdf.js/pull/4138
> https://github.com/mozilla/pdf.js/pull/4139

And with those in place, in the common cases (no resizing, no opacity mask) we drop from 5 copies to approximately 1.03 copies of the data. (The 0.3 comes from the three slices that replace copies 3, 4 and 5.)

Ryan VanderMeulen [:RyanVM]

Updated

•

10 years ago

Depends on: 960051

Nicholas Nethercote [inactive]

Assignee

Comment 22

•

10 years ago

The chunking has been merged to pdf.js master: https://github.com/mozilla/pdf.js/pull/4138

Nicholas Nethercote [inactive]

Assignee

Comment 23

•

10 years ago

Hmm. Further measurements show that the above changes only make a small difference.

Consider http://pbadupws.nrc.gov/docs/ML0909/ML090920467.pdf, a scanned PDF with 226 pages, and each page is approximately 34,000,000 bytes when represented in 32-bit RGBA form.

Follow these steps:
- Start measuring resident memory consumption (RSS), e.g. via top.
- Open the PDF.
- Scroll to the document's end as fast as possible by holding down the PgDn key.
- Wait for all the pages to be decoded and rendered.

On my fast desktop Linux machine, the waiting takes approximately one minute. More relevantly for this bug, RSS gradually climbs in an ascending sawtooth pattern to about 7,800 MiB, whether or not the patches I've written for this bug are applied. And guess what? 226 * 34,000,000 is over 7,328 MiB.

In other words, even though I reduced five copies of each page's date down to 1, the four copies I removed were all short-lived, whereas the remaining copy is very long-lived.

I haven't yet worked out what is holding onto the arrays for each page, and whether it's deliberate or accidental. The data does get released eventually... if I wait a while, and then trigger "minimize memory usage" in about:memory, then RSS drops back to ~200 MB (but minimizing immediately doesn't do it). bdahl, do you know what controls this, and whether it's deliberate? I'm hoping that it's accidental, and that with some judicious nulling of some references that it'll become collectable much earlier.

Finally: evince is the native PDF viewer on Linux. When I use it to view the document, I can't get its RSS to exceed 150 MB. I don't expect pdf.js should be as memory-efficient as evince, but currently it's not even close -- 7,800 MiB for a 226 page document is ridiculous.

Flags: needinfo?(bdahl)

Nicholas Nethercote [inactive]

Assignee

Comment 24

•

10 years ago

So there is a mechanism to discard the imageData: PDFPageProxy.complete(), which calls _tryDestroy(). However, somehow when you scroll quickly the intra-thread messaging ordering occurs in such a way that none of the pages get to complete() until all the subsequent pages have been displayed. In fact, if you scroll slowly through the document -- e.g. one page per second, which is enough to render each page fully before moving to the next one -- the peak RSS is about 1 GB.

Nicholas Nethercote [inactive]

Assignee

Comment 25

•

10 years ago

Attached patch Ignore continueCallback. — Details — Splinter Review

I worked it out. The problem is with continueCallback. If defined, it gets
called when a page's image is received by the main thread. At this point, the
page is almost done, and we'd really like to finish processing the last few ops
in order to reach complete(), whereupon the page's pixel data can be freed.

However, viewer.js tries to be clever. It makes continueCallback (which is
called after every 15ms of operation processing) return early if the page being
rendered isn't the "most important" page, which I guess is the page that is
currently in the viewer. So we end up postponing the completion of every page
prior to the final page until the final page is rendered, but that doesn't
happen until all the previous pages have sent their image data to the main
thread, because the image data sending is in sequential page order.

The attached patch just ignores continueCallback. With it applied, I get a
peak RSS of about 880 MB instead of 7,800. As it stands the patch probably
isn't reasonable, but something needs to be done.

Nicholas Nethercote [inactive]

Assignee

Updated

•

10 years ago

Attachment #8362353 - Flags: feedback?(ydelendik)

Attachment #8362353 - Flags: feedback?(bdahl)

Nicholas Nethercote [inactive]

Assignee

Comment 26

•

10 years ago

> I worked it out. The problem is with continueCallback. 

Yury pointed me at https://github.com/mozilla/pdf.js/issues/1887, which covers this.

Nicholas Nethercote [inactive]

Assignee

Comment 27

•

10 years ago

The three patches landed (squashed into one):
https://github.com/mozilla/pdf.js/commit/f7e354dfe54b215c2539735701923c02908293f3

I filed bug 962351 for the scrolling issue in comment 25.

Nicholas Nethercote [inactive]

Assignee

Updated

•

10 years ago

Status: ASSIGNED → RESOLVED

Closed: 10 years ago

Flags: needinfo?(bdahl)

Resolution: --- → FIXED

Ryan VanderMeulen [:RyanVM]

Comment 28

•

10 years ago

Let's leave this open for bug 960051.

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Nicholas Nethercote [inactive]

Assignee

Updated

•

10 years ago

Blocks: 949650

Yury Delendik (:yury)

Comment 29

•

10 years ago

Comment on attachment 8362353 [details] [diff] [review]
Ignore continueCallback.

See IRC discussion at 
http://logbot.glob.com.au/?c=mozilla%23pdfjs&s=20+Jan+2014&e=20+Jan+2014&h=continueCallback#c3303

Attachment #8362353 - Flags: feedback?(ydelendik)

Attachment #8362353 - Flags: feedback?(bdahl)

Nicholas Nethercote [inactive]

Assignee

Comment 30

•

10 years ago

This has now been imported (bug 960051).

Status: REOPENED → RESOLVED

Closed: 10 years ago → 10 years ago

Resolution: --- → FIXED

Ryan VanderMeulen [:RyanVM]

Updated

•

10 years ago

Target Milestone: --- → Firefox 29

imagedata_worker.html 10 years ago Masatoshi Kimura [:emk] 1020 bytes, text/html		Details
Do createImageData/putImageData in chunks, to save memory. 10 years ago Nicholas Nethercote [inactive] 3.42 KB, patch	yury : feedback+	Details \| Diff \| Splinter Review
Don't create the opacity buffer for images that lack a mask. 10 years ago Nicholas Nethercote [inactive] 5.79 KB, patch	yury : feedback+	Details \| Diff \| Splinter Review
Ignore continueCallback. 10 years ago Nicholas Nethercote [inactive] 1019 bytes, patch		Details \| Diff \| Splinter Review