Closed Bug 1096913 Opened 5 years ago Closed 5 years ago

gfx-d2d-vram-source-surface leak drawing image to canvas

Categories

(Core :: Graphics, defect, major)

Platform: x86, Windows 8.1
Priority: Not set
Tracking


VERIFIED FIXED
mozilla36
Tracking Status
firefox34 + verified
firefox35 + verified
firefox36 + verified
b2g-v2.1 --- fixed
b2g-v2.2 --- fixed

People

(Reporter: Hughman, Assigned: bas.schouten)

References

Details

(Whiteboard: [MemShrink:P1])

Attachments

(2 files)

Attached file Testcase
So here is a testcase which makes gfx-d2d-vram-source-surface and gpu-committed go up and never come down. I found it when Windows started force-closing applications to free memory, which for me was when about 20GB (the size of pagefile.sys) was committed. Part of about:memory is at the bottom.

This testcase was pulled out of another, more complex little application that I was reviving, so it was easy to locate the cause. In this simplified version I draw the same image 1000 times, reloading it each time.

What causes the leak seems to be a combination of reusing an Image object to load multiple images (the testcase is reduced to reloading one) and then putting it on a canvas using drawImage.
- Drawing the image multiple times does not cause the issue.
- Using a new Image for each load also does not cause the issue.
- Memory is not freed until the process ends (tried minimize memory, closing the tab, etc.).
- You can reach a high value like mine by simply reloading the page a few times.
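The attachment itself is not shown inline, but based on the description the leak pattern can be sketched roughly as follows. This is a hypothetical reconstruction, not the actual attached testcase; the file name `image.png`, the helper `reloadUrl`, and the canvas size are all assumptions:

```javascript
// Hypothetical reconstruction of the leak pattern described above.
// One Image object is reused: each time a load completes, the image is
// drawn to the canvas and src is reassigned to force another load.
const ITERATIONS = 1000;

// Cache-busting query string so each src assignment really reloads.
function reloadUrl(base, i) {
  return base + '?reload=' + i;
}

if (typeof document !== 'undefined') { // browser-only portion
  const canvas = document.createElement('canvas');
  canvas.width = 1000;
  canvas.height = 1000;
  document.body.appendChild(canvas);
  const ctx = canvas.getContext('2d');

  const img = new Image(); // the single, reused Image object
  let i = 0;
  img.onload = function () {
    ctx.drawImage(img, 0, 0); // each draw leaked a D2D source surface
    if (++i < ITERATIONS) {
      img.src = reloadUrl('image.png', i); // reload the same image
    }
  };
  img.src = reloadUrl('image.png', i);
}
```

Per the bullet points above, the leak only appears with this reuse-and-reload pattern; drawing once-loaded images repeatedly, or using a fresh Image per load, did not trigger it.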

Also be aware that (at least in my version of Process Explorer) the GPU Committed value has a 32-bit integer overflow issue. The one in about:memory gives the real value for me.


     3.00 MB ── canvas-2d-pixels
     0.00 MB ── gfx-d2d-surface-cache
     4.00 MB ── gfx-d2d-surface-vram
     3.00 MB ── gfx-d2d-vram-draw-target
 9,972.95 MB ── gfx-d2d-vram-source-surface
     0.10 MB ── gfx-surface-win32
     0.00 MB ── gfx-textures
     0.00 MB ── gfx-tiles-waste
           0 ── ghost-windows
10,735.77 MB ── gpu-committed
   743.53 MB ── gpu-dedicated
    18.24 MB ── gpu-shared
    45.00 MB ── heap-allocated
          79 ── heap-chunks
     1.00 MB ── heap-chunksize
    47.65 MB ── heap-committed
    79.00 MB ── heap-mapped
       5.88% ── heap-overhead-ratio
           1 ── host-object-urls
     0.00 MB ── imagelib-surface-cache
     0.31 MB ── js-main-runtime-temporary-peak
           0 ── low-commit-space-events
   178.18 MB ── private
   231.70 MB ── resident
   705.79 MB ── vsize
 1,805.94 MB ── vsize-max-contiguous
Also, I would recommend that you do not run this testcase under the following two conditions:

1. In your normal browser. OOM crashes are nasty and I lost my profile.

2. If you have a pagefile.sys forced smaller than 3GB, even if you have 16GB of RAM. Windows will most likely make your life hell by closing everything but Firefox to free some commit space. This leak doesn't use RAM, it uses pagefile.
This sounds a bit like bug 1082827 and/or bug 1081926. mwu, any thoughts?
Flags: needinfo?(mwu)
There's a bug, I can reproduce it, I'll work on fixing it.
Assignee: nobody → bas
Status: NEW → ASSIGNED
Flags: needinfo?(bas)
This is a regression from bug 1081926. Fix upcoming.
Blocks: 1081926
[Tracking Requested - why for this release]:

Bug 1081926 is on beta and b2g v2.1, requesting tracking.
Attachment #8520851 - Flags: review?(mwu) → review+
Whiteboard: [MemShrink] → [MemShrink:P1]
Beta is almost out the door - Lawrence will need a good breakdown of risk vs. reward before taking this for our final beta.
Is a backout of bug 1081926 possible for 34?
(In reply to Lukas Blakk [:lsblakk] use ?needinfo from comment #8)
> Is a backout of bug 1081926 possible for 34?

Bug 1081926 was fixing an issue about as bad as this one, so I'm not sure that would be a win.
https://hg.mozilla.org/mozilla-central/rev/49d26513f6be
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla36
Comment on attachment 8520851 [details] [diff] [review]
Remove the simple cache entry when replacing the cache entry

Approval Request Comment
[Feature/regressing bug #]: Bug 1081926
[User impact if declined]: Runaway memory usage for some canvas usage patterns
[Describe test coverage new/current, TBPL]: Very limited, green on central
[Risks and why]: Low, we might remove a cache entry we shouldn't, in which case there might be a slight perf regression, but not much else. Technically a null-deref from mRequest could occur, but that seems extremely unlikely.
[String/UUID change made/needed]: None
Attachment #8520851 - Flags: approval-mozilla-beta?
Attachment #8520851 - Flags: approval-mozilla-aurora?
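For context, the class of fix described in the patch title ("Remove the simple cache entry when replacing the cache entry") can be illustrated with a generic sketch. This is not the actual Gecko patch; all names below are hypothetical, and the sketch only shows why replacing a main cache entry without evicting a stale fast-path entry keeps the old surface alive:

```javascript
// Generic illustration of the bug class, not the actual Gecko code.
// A cache keeps a fast "simple" per-key slot alongside the main map.
// If the main entry is replaced without evicting the stale simple
// entry, the old surface stays reachable and VRAM grows unbounded.
class SourceSurfaceCache {
  constructor() {
    this.entries = new Map();       // main cache: key -> surface
    this.simpleEntries = new Map(); // fast-path cache: key -> surface
  }

  insertSimple(key, surface) {
    this.simpleEntries.set(key, surface);
  }

  insert(key, surface) {
    // The fix: drop any stale simple entry when replacing the main
    // entry, so the previously cached surface can actually be freed.
    this.simpleEntries.delete(key);
    this.entries.set(key, surface);
  }

  lookup(key) {
    // Fast path first, then the main cache.
    return this.simpleEntries.get(key) ?? this.entries.get(key);
  }
}
```

The stated risk profile matches this shape: over-eager eviction only costs a cache miss (a slight perf regression), while the missing eviction costs unbounded growth.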
Attachment #8520851 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
I spoke with Bas about this fix. He's not 100% confident in the fix but thinks the risk is worth it as the bug is pretty bad.

Bas - Can you verify the fix is good in today's Nightly build (tomorrow morning) before uplifting to Beta?
Flags: needinfo?(bas)
Flags: needinfo?(mwu)
Comment on attachment 8520851 [details] [diff] [review]
Remove the simple cache entry when replacing the cache entry

I have asked QE to verify but am going to approve landing to Beta now so that we can gtb today. I do want to verify that this fix is good before shipping.
Attachment #8520851 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
(In reply to Lawrence Mandel [:lmandel] (use needinfo) from comment #12)
> I spoke with Bas about this fix. He's not 100% confident in the fix but
> thinks the risk is worth it as the bug is pretty bad.
> 
> Bas - Can you verify the fix is good in today's Nightly build
> (tomorrow morning) before uplifting to Beta?
> (tomorrow morning) before uplifting to Beta?

It looks good to me, as far as I'm concerned we can land it.
Flags: needinfo?(bas)
Bas - Can you please land this on Beta yourself, as I don't think we have sheriff coverage right now and we really need to land the fix before 5pm ET.
Flags: needinfo?(bas)
Hugh, I'm trying to reproduce this in Windows 8.1 with the current Nightly and not having any luck. Is it specific to particular hardware? What exactly should we be doing to reproduce this bug? Thanks!
Flags: needinfo?(hughnougher)
I just did some testing on Win7 (what I had) using the Nightlies from the 12th and 13th with the testcase.

The Nightly from the 12th reproduced the issue, with GPU commit going up by about 1 GB per page reload.
The Nightly from the 13th had no problem, and the testcase completed much more quickly.

I don't think specific hardware is involved.
Flags: needinfo?(hughnougher)
Hugh: thank you for the excellent bug report. This was a bad bug and it's great that it has been fixed. Creating the small testcase was particularly helpful.
Flags: qe-verify+
Reproduced the bug on Nightly 36.0a1 (2014-11-12) using Windows 8.1 64-bit, by comparing the values of 'gfx-d2d-vram-source-surface' and 'gpu-committed' from the testcase after each page reload.

This is now verified fixed on:
 * Nightly 36.0a1 (2014-11-13), 
 * Firefox 34.0b9 (build1 / 20141113192814),
using Windows 8.1 64-bit, Windows 7 64-bit.
Status: RESOLVED → VERIFIED
Verified fixed on Aurora 35.0a2 (2014-11-23) as well, using Windows 7 64-bit and Windows 8.1 64-bit.