Closed
Bug 878397
Opened 11 years ago
Closed 11 years ago
pdf.js: High "explicit" (though low "resident") due to lots of inefficient mask arrays
Categories
(Firefox :: PDF Viewer, defect, P3)
Tracking
VERIFIED
FIXED
Firefox 29
Tracking | Status
---|---
firefox29 | verified
People
(Reporter: augustmiles, Assigned: n.nethercote)
References
Details
(Whiteboard: [pdfjs-c-performance][pdfjs-f-fixed-upstream] https://github.com/mozilla/pdf.js/pull/4122)
Attachments
(2 files, 2 obsolete files)
- 33.25 KB, application/octet-stream (Details)
- 6.46 KB, patch (yury: review+) (Details | Diff | Splinter Review)
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:22.0) Gecko/20100101 Firefox/22.0 (Beta/Release)
Build ID: 20130528181031

Steps to reproduce:
1) Start Firefox, so memory usage is low to begin with.
2) Open "about:memory".
3) In a new tab, open the small PDF file http://www.henle.de/media/review/0015.pdf (600 KB). This is a commentary on a Bach score with a mixture of text and musical notation.
4) Switch to the about:memory tab and update.

More than 1 GB of memory is used to render the page. If I watch Activity Monitor, memory usage will sometimes stay high for several minutes, rendering the Mac unusable: it is impossible to switch applications, or even to open the "Force Quit" window.

Actual results:
The machine becomes non-responsive (beach balls). about:memory shows:

1,727.82 MB (100.0%) -- explicit
├──1,247.53 MB (72.20%) -- window-objects
│  ├──1,211.97 MB (70.14%) -- top(http://www.henle.de/media/review/0015.pdf, id=25)
│  │  ├──1,205.29 MB (69.76%) -- active/window(http://www.henle.de/media/review/0015.pdf)
│  │  │  ├──1,198.55 MB (69.37%) -- js-compartment(resource://pdf.js/web/viewer.html, http://www.henle.de/media/review/0015.pdf)
│  │  │  │  ├──1,185.93 MB (68.64%) -- objects-extra
│  │  │  │  │  ├──1,185.61 MB (68.62%) ── elements/non-asm.js
│  │  │  │  │  └──────0.32 MB (00.02%) ── slots
│  │  │  │  └─────12.62 MB (00.73%) ++ (6 tiny)
│  │  │  └──────6.74 MB (00.39%) ++ (4 tiny)
│  │  └──────6.68 MB (00.39%) ++ js-zone(115984800)
│  ├─────18.63 MB (01.08%) ++ top(http://www.henle.de/en/detail/index.html?Title=The+Well-Tempered+Clavier+Part+I+BWV+846-869_15, id=14)
│  └─────16.94 MB (00.98%) ++ (4 tiny)

1,337.90 MB (100.0%) -- js-main-runtime
├──1,252.50 MB (93.62%) -- compartments
│  ├──1,192.87 MB (89.16%) -- objects-extra
│  │  ├──1,186.50 MB (88.68%) ── elements/non-asm.js
│  │  └──────6.37 MB (00.48%) ++ (2 tiny)
│  ├─────45.46 MB (03.40%) -- gc-heap
│  │  ├──24.40 MB (01.82%) ++ objects
│  │  ├──17.00 MB (01.27%) ++ shapes
│  │  └───4.07 MB (00.30%) ── scripts
│  └─────14.17 MB (01.06%) ++ (10 tiny)
├─────64.76 MB (04.84%) -- zones
│  ├──54.79 MB (04.10%) -- gc-heap
│  │  ├──27.10 MB (02.03%) -- strings
│  │  │  ├──19.62 MB (01.47%) ── short
│  │  │  └───7.48 MB (00.56%) ── normal
│  │  ├──25.22 MB (01.89%) ── unused-gc-things
│  │  └───2.47 MB (00.18%) ++ (3 tiny)
│  └───9.97 MB (00.75%) ++ (3 tiny)
├─────18.04 MB (01.35%) ── runtime
└──────2.60 MB (00.19%) ++ gc-heap

Expected results:
Much lower memory usage, as with Chrome, Preview, or other PDF viewers.
Comment 1•11 years ago
I tested and the initial memory usage was indeed high in Nightly. I only got it as high as ~500MB. Waited a bit, measured again, and it dropped to ~120MB. Scrolled through the entire document, then it dropped to ~58MB. Tested again:
* Starts low.
* Grows as high as ~590MB.
* Waited 20 seconds without having the PDF in focus.
* Measured it again at ~20MB.
* Scrolled through the document and measured at ~104MB.
* Waited for GC, and memory usage ends at ~58MB.
Is this the same behavior you're experiencing, or does memory usage not drop?
Updated•11 years ago
Component: Untriaged → PDF Viewer
loic - The memory dump is only in Firefox 23, so I moved the version from 22 to 23.
Jesper - If I have the about:memory tab open, the memory goes down after about 20-30 seconds. If, however, I just open the PDF and scroll, then Activity Monitor (i.e. top) shows memory going up to 1.3GB. I then leave Firefox and memory remains at 1.3GB for many minutes. The machine is then dreadfully unresponsive, beachballing when using other programs. As soon as I open a new tab, the memory drops back to a few hundred megabytes in top. Thus, to reproduce the problem you must open the about:memory tab before reading the PDF; otherwise the new tab seems to trigger a memory event.
Updated•11 years ago
Priority: -- → P3
Whiteboard: [MemShrink] → [MemShrink][pdfjs-c-performance]
Assignee
Updated•11 years ago
Whiteboard: [MemShrink][pdfjs-c-performance] → [pdfjs-c-performance]
Comment 5•11 years ago
I have seen similar memory usage for small PDFs consisting of scanned images. I can upload a sample if needed.
Comment 6•11 years ago
I am also seeing extremely high memory usage of 1GB or more when viewing PDF files of scanned images. See further details at https://github.com/mozilla/pdf.js/issues/3630. Please let me know if I can provide more information to help resolve this issue.
Assignee
Comment 7•11 years ago
I've started investigating this. The first thing I found is something odd in about:memory's numbers:

1,152.41 MB (100.0%) -- explicit
├──1,022.56 MB (88.73%) -- window-objects
│  ├────954.06 MB (82.79%) -- top(file:///home/njn/bach.pdf, id=8)
│  │  ├──948.90 MB (82.34%) -- active/window(file:///home/njn/bach.pdf)
│  │  │  ├──942.93 MB (81.82%) -- js-compartment(resource://pdf.js/web/viewer.html, file:///home/njn/bach.pdf)
│  │  │  │  ├──942.14 MB (81.75%) -- objects
│  │  │  │  │  ├──934.63 MB (81.10%) -- malloc-heap
│  │  │  │  │  │  ├──932.89 MB (80.95%) ── elements/non-asm.js
...
509.42 MB ── resident
509.98 MB ── resident-unique
1,868.29 MB ── vsize

It's *really* strange to see "resident" so much smaller than "explicit"; they're normally much closer. My guess is we're allocating a really big array but never touching most of it, so it doesn't get paged into physical memory.

----

I added some diagnostic printfs to jemalloc and found that we're doing hundreds of allocations of three sizes: 13.77 MiB, 11.21 MiB, and 19.89 MiB. With gdb I determined these were due to typed array allocations, and the typed arrays had 1-byte elements (i.e. Int8Array, Uint8Array, or Uint8ClampedArray).

I then added diagnostic printfs to pdf.js and pdf.worker.js and discovered that PDFImage.createMask() in pdf.worker.js is entirely responsible for these. Also, there are four distinct calling patterns, each of which happens multiple times:

( 1) 21 (47.7%, 47.7%): createMask: h=2833, w=1037
( 2) 18 (40.9%, 88.6%): createMask: h=2550, w=1416
( 3)  3 ( 6.8%, 95.5%): createMask: h=2741, w=1037
( 4)  2 ( 4.5%,100.0%): createMask: h=3683, w=1416

These correspond to the seen amounts if we do h*w*4. Now, the only caller of PDFImage.createMask is this code:

operatorList.addOp(OPS.paintImageMaskXObject,
  [PDFImage.createMask(imgArray, width, height, inverseDecode)]
);

Just for a laugh I replaced the second argument with []. I thought it would totally bomb, but the PDF actually renders fairly well: the text looks fine but the musical notation is missing, and the load throbber never stops spinning. Intriguing, nonetheless.
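As a cross-check of the h*w*4 correspondence (my own arithmetic, not from the bug), the observed allocation sizes can be reproduced directly:

```javascript
// Each createMask call allocates a 4-bytes-per-pixel array, so the repeated
// jemalloc allocation sizes should equal h * w * 4 bytes, expressed in MiB.
function maskMiB(h, w) {
  return (h * w * 4) / (1024 * 1024);
}

console.log(maskMiB(2550, 1416).toFixed(2)); // "13.77"
console.log(maskMiB(2833, 1037).toFixed(2)); // "11.21"
console.log(maskMiB(3683, 1416).toFixed(2)); // "19.89"
```

The three sizes match calling patterns (2), (1), and (4) above; pattern (3) lands at ~10.84 MiB.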
Assignee
Comment 8•11 years ago
AFAICT, PDFImage_createMask takes an array that encodes 1 bit of (transparency?) information per pixel of an image, and converts that into an array that is 32x bigger. E.g. we take something that's 0.43MB and re-encode it so it takes 13.77 MB. This is extraordinarily inefficient, and surely we can do something smarter.
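To make the 32x figure concrete, here is a minimal sketch of that kind of expansion (my own illustration, not pdf.js's actual code; the MSB-first bit order and the alpha-channel mapping are assumptions, and per-row byte padding is ignored for simplicity):

```javascript
// Each *bit* of the compact mask becomes the alpha byte of a 4-byte RGBA
// pixel in the expanded array, so the output is 32x the size of the input.
function expandMask(compact, width, height, inverseDecode) {
  const numPixels = width * height;
  const expanded = new Uint8Array(numPixels * 4); // 4 bytes per pixel
  for (let i = 0; i < numPixels; i++) {
    // Assumed layout: bits packed MSB-first, no row padding.
    let bit = (compact[i >> 3] >> (7 - (i & 7))) & 1;
    if (inverseDecode) {
      bit ^= 1;
    }
    expanded[i * 4 + 3] = bit ? 255 : 0; // alpha channel
  }
  return expanded;
}
```

For the h=2833, w=1037 case from comment 7, the compact input is ~0.35 MiB while the expanded output is ~11.2 MiB.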
Assignee
Comment 9•11 years ago
> Just for a laugh I replaced the second argument with []. I thought it would
> totally bomb but the PDF actually renders fairly well -- the text looks fine
> but the musical notation is missing, and the load throbber never stops
> spinning. Intriguing, nonetheless.
I forgot to mention that the memory usage was drastically lower when I tried this.
Assignee
Comment 11•11 years ago
BTW, the file from bug 839548 (which I determined to be a dup) is here: http://innsyn2.e-kommune.no/innsyn_jolster/wfdocument.aspx?journalpostid=2013000505&dokid=37355&versjon=1&variant=A&ct=RA-PDF
Assignee
Updated•11 years ago
Summary: More the 1GB of memory for a small pdf file: pdf.js → pdf.js: High "explicit" (though low "resident") due to lots of inefficient mask arrays
Updated•11 years ago
Status: UNCONFIRMED → NEW
Ever confirmed: true
Assignee
Comment 12•11 years ago
As best as I can tell:
- After PDFImage_createMask expands the compact array into the non-compact array, it puts it onto the operator list, which eventually gets passed from the worker to the main thread (implemented in pdf.js) via a 'RenderPageChunk' message.
- It then ends up being passed to CanvasGraphics_paintImageMaskXObject(), and from there to putBinaryImageData().
- putBinaryImageData() then creates an ImageData object (tmpImgDataPixels), copies the non-compact array's data into it, and then calls putImageData() on tmpImgDataPixels.

One idea is to delay the expansion: pass the compact array to the main thread and then unpack it directly into the ImageData object. I.e. the non-compact array is just an inefficient intermediate, so remove it altogether. However, that requires also passing |inverseDecode|.

Another idea is to create the ImageData object immediately and pass it to the main thread, instead of passing the non-compact array.
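The first idea could be sketched roughly like this (my own code under my own assumptions; the function name and bit layout are illustrative, not pdf.js API):

```javascript
// Expand the compact 1-bit-per-pixel mask straight into an ImageData-style
// RGBA buffer on the main thread, so the 32x-larger intermediate array is
// never created. Note |inverseDecode| must now travel with the compact array.
function putCompactMaskIntoImageData(imgData, compact, inverseDecode) {
  const pixels = imgData.data; // RGBA, 4 bytes per pixel
  const numPixels = imgData.width * imgData.height;
  for (let i = 0; i < numPixels; i++) {
    // Assumed layout: bits packed MSB-first, no row padding.
    let bit = (compact[i >> 3] >> (7 - (i & 7))) & 1;
    if (inverseDecode) {
      bit ^= 1;
    }
    pixels[i * 4 + 3] = bit ? 255 : 0; // write the alpha byte directly
  }
}
```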
Assignee
Comment 13•11 years ago
> Another idea is to create the ImageData object immediately and pass it to
> the main thread, instead of passing the non-compact array.
Hmm, but creating the ImageData requires access to the maskCanvas, which the worker doesn't have.
Assignee
Comment 14•11 years ago
> One idea is to delay the expansion, and instead pass the compact array to
> the main thread, and then unpack it directly into the ImageData object.
Hmm, but IIRC to do this the compact array must be a TypedArray.
Assignee
Comment 15•11 years ago
Here's a rough draft of a patch that copies the compact alpha data into a typed array, so it can be passed from the worker to the main thread, and expands it directly into the ImageData object on the main thread. The two documents (see comment 0 and comment 11) I've been testing appear to display correctly. My rough measurements show that it reduces "explicit" memory usage greatly, though I'm out of time today to do precise measurements. I don't really have any idea what I'm doing, so please let me know if this seems reasonable.
Assignee
Updated•11 years ago
Assignee: nobody → n.nethercote
Status: NEW → ASSIGNED
Assignee
Updated•11 years ago
Attachment #8359053 - Flags: feedback?(bdahl)
Assignee
Comment 16•11 years ago
I've done some measurements.
Assignee
Comment 17•11 years ago
> I've done some measurements.
... and here they are.
First, my process.
- Start Firefox
- Start |peak-rss|, a script which measures RSS ten times a second
- Load the PDF, scroll quickly to the bottom then back to the top
- Open about:memory and hit measure, and look at the "elements/non-asm.js"
entry for the pdf.js compartment
0015.pdf
--------
RSS-peak JS-elements-peak
before ~ 650 MiB ~1100 MiB
after ~ 540 MiB ~ 400 MiB
37355_1_A.pdf
-------------
RSS-peak JS-elements-peak
before ~ 800 MiB ~800 MiB
after ~2400 MiB ~1000 MiB
The JS-elements-peak results are excellent. The RSS-peak results are less
impressive. It's weird that the non-compact arrays don't contribute much to
RSS... even if big chunks of them aren't written to, when they are read I would
expect them to page in. Maybe the Linux kernel is being really clever and is
managing to not page in all-zero pages when they are read. But this behaviour
might not occur on other platforms.
bdahl suggested that the non-compact array is created in the worker thread to
avoid janking the main thread, which complicates things. We might be able to
speed up the unpacking loop by handling 8 bits at a time, since it seems common
that all 8 bits will have the same value.
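The 8-bits-at-a-time idea could be sketched as follows (my own code, not the eventual patch): precompute the eight expanded alpha values for every possible mask byte, so the common all-zeros and all-ones bytes are each handled with a single table lookup and bulk copy.

```javascript
// 256-entry lookup table: LUT[b] holds the 8 alpha values encoded by byte b,
// MSB first (an assumed bit order).
const LUT = [];
for (let b = 0; b < 256; b++) {
  const row = new Uint8Array(8);
  for (let bit = 0; bit < 8; bit++) {
    row[bit] = ((b >> (7 - bit)) & 1) ? 255 : 0;
  }
  LUT.push(row);
}

// Expand a packed 1-bit-per-pixel mask to one alpha byte per pixel,
// one input byte (8 pixels) at a time.
function expandAlphaFast(compact, numPixels) {
  const alpha = new Uint8Array(numPixels);
  for (let i = 0; i < compact.length; i++) {
    const remaining = numPixels - i * 8;
    if (remaining <= 0) break;
    alpha.set(LUT[compact[i]].subarray(0, Math.min(8, remaining)), i * 8);
  }
  return alpha;
}
```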
Assignee
Comment 18•11 years ago
Oh, this is interesting: I tried flipping inverseDecode just to see what happens. Various images appear inverted (white-on-black instead of black-on-white). Also, the peak-RSS for 0015.pdf (without my patch) jumps to 1200 MiB, and for 37355_1_A.pdf it jumps to 2400 MiB. So my theory about the kernel optimizing zero pages seems correct.
Assignee
Comment 19•11 years ago
This works well on the cases I've been testing. Yury, can you please take this over? I don't know how to test and land pdf.js patches. Thanks!
Attachment #8359544 - Flags: review?(ydelendik)
Assignee
Updated•11 years ago
Attachment #8359053 - Attachment is obsolete: true
Attachment #8359053 - Flags: feedback?(bdahl)
Assignee
Comment 20•11 years ago
BTW, when/how do pdf.js updates get imported into Firefox? I see various bugs like bug 947260 for updating -- when do they occur? This is a pretty big memory win, it'd be great to get it into Firefox quickly once it lands in the git repo.
Comment 21•11 years ago
When I feel like it or when I'm asked. Given that I'm following the progress on this bug, I'll be sure to get a patch up as soon as it's landed :)
Assignee
Comment 22•11 years ago
Oh, wait! I was making the new array 8x bigger than necessary, because I computed its length as width * height, even though we only need 1 bit per pixel. With this change, I'm struggling to get peak-RSS higher than ~400 MiB for both files. Cool! (BTW, in comment 17 I accidentally switched the before and after lines for 37355_1_A.pdf.)
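In other words, the transferred array needs only ceil(width * height / 8) bytes; a sketch of the arithmetic (identifiers are mine, and the real code may pad each row to a byte boundary rather than packing the whole image as one bit stream):

```javascript
// Bytes needed to hold a 1-bit-per-pixel mask, with no per-row padding.
function maskBytes(width, height) {
  return (width * height + 7) >> 3; // ceil(width * height / 8)
}

// For the w=1037, h=2833 masks from comment 7: ~0.35 MiB instead of the
// ~2.8 MiB that a width * height (1 byte per pixel) array would take.
console.log(maskBytes(1037, 2833)); // 367228
```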
Attachment #8359574 - Flags: review?(ydelendik)
Assignee
Updated•11 years ago
Attachment #8359544 - Attachment is obsolete: true
Attachment #8359544 - Flags: review?(ydelendik)
Assignee
Comment 23•11 years ago
> (BTW, in comment 17 I accidentally switched the before and after lines for
> 37355_1_A.pdf.)
Actually, I totally botched the results for 37355_1_A.pdf. Here are new results for both files:
0015.pdf
--------
RSS-peak JS-elements-peak
before ~650 MiB ~1100 MiB
after ~400 MiB ~ 200 MiB
37355_1_A.pdf
-------------
RSS-peak JS-elements-peak
before ~800 MiB ~2400 MiB
after ~400 MiB ~ 500 MiB
Pretty good! Shrinking those big arrays by 32x really makes a difference.
Assignee
Comment 24•11 years ago
Even with the fix, there's still some inefficiency. For 0015.pdf, the vast majority of the remaining typed array allocations follow this pattern:
- buildPaintImageXObject() calls DecodeStream_getBytes() with a length of 451704.
- That calls DecodeStream.ensureBuffer(), which rounds the length up to the nearest power of two (524288) and allocates a Uint8Array of that rounded-up size.
- buildPaintImageXObject() then calls createMask(), which allocates a Uint8Array of length 451704 and copies (and possibly inverts) the data from the first Uint8Array into it.

So, two inefficiencies:
- The rounding up to a power of two isn't necessary.
- It would be even better if we could decode (and possibly invert) directly into the second typed array. But this seems tricky, because there are several kinds of streams, and in some of them the first typed array might not be used.
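Rough arithmetic on the first inefficiency (my own cross-check, not from the bug): the power-of-two round-up overallocates by about 16% for this particular length.

```javascript
// Next power of two at or above n (valid for 2 <= n <= 2^31),
// via count-leading-zeros.
function nextPow2(n) {
  return 1 << (32 - Math.clz32(n - 1));
}

const needed = 451704;              // length requested by getBytes()
const allocated = nextPow2(needed); // what the round-up actually allocates
console.log(allocated);             // 524288
const wastePct = ((allocated - needed) / needed) * 100;
console.log(wastePct.toFixed(1));   // "16.1"
```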
Assignee
Comment 25•11 years ago
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #21)
> When I feel like it or when I'm asked. Given that I'm following the progress
> on this bug, I'll be sure to get a patch up as soon as it's landed :)

Yury landed it on GitHub: https://github.com/yurydelendik/pdf.js/commit/3f533a1cb04adbf422d58ce71eed6d4e851611ee

I think that's all we need to do here.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Comment 26•11 years ago
Hasn't actually been committed to the pdf.js repo yet. I assume Yury will submit a pull request for it soon, though :)
Comment 27•11 years ago
Will resolve this when it's merged to m-c from upstream.
Status: RESOLVED → REOPENED
Flags: needinfo?(ryanvm)
Resolution: FIXED → ---
Comment 28•11 years ago
Comment on attachment 8359574 [details] [diff] [review]
Use a more compact typed array to pass the image mask from the worker to the main thread.

Looks good. The nbytes parameter can be calculated in the createMask function; changed and landed as https://github.com/yurydelendik/pdf.js/commit/3f533a1cb04adbf422d58ce71eed6d4e851611ee
Attachment #8359574 - Flags: review?(ydelendik) → review+
Updated•11 years ago
Whiteboard: [pdfjs-c-performance] → [pdfjs-c-performance][pdfjs-f-fixed-upstream] https://github.com/mozilla/pdf.js/pull/4122
Updated•11 years ago
Flags: needinfo?(ryanvm)
Assignee
Comment 29•11 years ago
Bug 960051 merged this into m-c.
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Updated•11 years ago
Target Milestone: --- → Firefox 29
Comment 30•11 years ago
Verified on Beta 29.0b6 (Build ID: 20140407135746) on Mac OS X 10.9.2, using the STR from comment 17 and a memory usage script to get the RSS peak. Memory usage was compared against the Firefox 29 Nightly build from January 11.
- RSS-peak: drops by ~500 MB for both PDFs
- JS-elements-peak:
  - 0015.pdf: drops from ~1100 MB (Jan 11 Nightly) to ~250 MB (29.0b6)
  - 37355_1_A.pdf: drops from ~2200 MB (Jan 11 Nightly) to ~500 MB (29.0b6)
Status: RESOLVED → VERIFIED
Keywords: verifyme
status-firefox29:
--- → verified