Open Bug 869252 Opened 11 years ago Updated 2 years ago

CreateOffscreenSurface seems to have serious bugs on Windows 7 that CreateOffscreenDrawTarget does not have

Categories

(Core :: Graphics, defect)

All
Windows 7

People

(Reporter: seth, Unassigned)

Details

I encountered what appears to be a serious bug while working on bug 859377.

My initial series of patches to fix that bug called CreateOffscreenSurface. Here are Talos results from that patch stack:

https://tbpl.mozilla.org/?tree=Try&rev=c9a5cca8b316

Notice the tp5o_shutdown_paint score: 2633.00. This is a massive regression. Here's a baseline from before that patch stack:

https://tbpl.mozilla.org/?tree=Try&rev=cfe7add46123

The tp5o_shutdown_paint score is 1288.00, so the score more than doubled after that patch stack.

The root cause seemed to be the use of CreateOffscreenSurface. When I switch to CreateOffscreenDrawTarget, things get much better:

https://tbpl.mozilla.org/?tree=Try&rev=4fa753389953

Here tp5o_shutdown_paint gets as low as 1085.00.
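
For reference, the relative changes implied by the three scores quoted above can be worked out as follows. This is only a sketch of the arithmetic: the helper name `percent_change` is made up for illustration, and the CreateOffscreenDrawTarget figure is the best ("as low as") value, not an average.

```python
def percent_change(baseline: float, candidate: float) -> float:
    """Relative change of candidate versus baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


# tp5o_shutdown_paint scores quoted in this report.
BASELINE = 1288.00          # rev cfe7add46123, before the patch stack
WITH_SURFACE = 2633.00      # rev c9a5cca8b316, CreateOffscreenSurface
WITH_DRAW_TARGET = 1085.00  # rev 4fa753389953, best CreateOffscreenDrawTarget run

surface_change = percent_change(BASELINE, WITH_SURFACE)
draw_target_change = percent_change(BASELINE, WITH_DRAW_TARGET)

print(f"CreateOffscreenSurface:    {surface_change:+.1f}%")
print(f"CreateOffscreenDrawTarget: {draw_target_change:+.1f}%")
```

A change above +100% corresponds to the "more than doubled" observation, while the CreateOffscreenDrawTarget run comes out below the baseline.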

What's really troubling is that this regression in the tp5o_shutdown_paint score was correlated with reports of a severe spike in crashes with empty dumps on Windows, which bsmedberg diagnosed as being caused by running out of address space. (Despite not having allocated that much memory!) See his analysis here:

http://benjamin.smedbergs.us/blog/2013-04-11/graph-of-the-day-firefox-virtual-memory-plot/

I switched to gfxImageSurfaces on Windows only, and that eliminated the spike, as reported by Scoobidiver in bug 866526 comment 12.

I'm not sure exactly what's going on here. Frustratingly, I haven't been able to reproduce the crash myself; the change in tp5o_shutdown_paint is the only proxy I have to diagnose this issue. It looks like there's something bad going on here, though, and given that we use CreateOffscreenSurface a fair amount I think it's worth looking into more.
What is the functional difference between CreateOffscreenDrawTarget and CreateOffscreenSurface?

Note that the issues are likely, but not certainly, graphics-driver specific. We could certainly implement a tool which measures VM usage and fragmentation during TP5 (it wouldn't be hard, I don't think). But I'm not sure whether the problem would even show up on tinderbox.
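
A minimal sketch of the kind of summary such a tool might report, assuming the address-space layout has already been captured as a list of regions (on Windows this could come from walking the address space with VirtualQuery; that step is not shown here). The `Region` type, the `summarize` helper, and the fragmentation metric are all hypothetical, not existing Gecko or Talos code.

```python
from dataclasses import dataclass
from typing import Iterable

MiB = 1024 * 1024


@dataclass(frozen=True)
class Region:
    """One contiguous run of address space (hypothetical record type)."""
    base: int   # start address
    size: int   # length in bytes
    free: bool  # True if neither reserved nor committed


def summarize(regions: Iterable[Region]) -> dict:
    """Summarize address-space usage and free-space fragmentation.

    Assumes adjacent free regions are already coalesced in the input.
    Fragmentation is defined here (an assumption, not a standard metric) as
    1 - largest_free / total_free: 0.0 means all free space is one block.
    """
    regions = list(regions)  # allow generators; we iterate more than once
    free_sizes = [r.size for r in regions if r.free]
    total_free = sum(free_sizes)
    largest_free = max(free_sizes, default=0)
    used = sum(r.size for r in regions if not r.free)
    fragmentation = 1.0 - largest_free / total_free if total_free else 0.0
    return {
        "used": used,
        "total_free": total_free,
        "largest_free": largest_free,
        "fragmentation": fragmentation,
    }


# Made-up layout: two used regions separated by free gaps of 1 MiB and 3 MiB.
example = [
    Region(base=0 * MiB, size=2 * MiB, free=False),
    Region(base=2 * MiB, size=1 * MiB, free=True),
    Region(base=3 * MiB, size=4 * MiB, free=False),
    Region(base=7 * MiB, size=3 * MiB, free=True),
]
stats = summarize(example)
print(stats)
```

Tracking the largest free block alongside total free space matters for the symptom described in this bug: an allocation can fail when no single free block is large enough, even if the total free address space looks ample.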
Whiteboard: [MemShrink]
(In reply to Seth Fowler [:seth] from comment #0)
> I'm not sure exactly what's going on here. Frustratingly, I haven't been
> able to reproduce the crash myself; the change in tp5o_shutdown_paint is the
> only proxy I have to diagnose this issue.

Those patches also regressed the "Main RSS" and "Private Bytes" counters on Windows, by almost 50% for Private Bytes on Windows 7:
http://perf.snarkfest.net/compare-talos/index.html?oldRevs=07e17dd7813b&newRev=c9a5cca8b316&submit=true
https://groups.google.com/d/msg/mozilla.dev.tree-management/Y-rq_aDTdpU/dxgp75oTxGwJ

Can you reproduce those memory usage regressions locally? Could they be symptomatic of the same address space problem that caused the crash?
I have not been able to reproduce anything locally.
Whiteboard: [MemShrink] → [MemShrink:P1]
The memory bug is fixed (see bug 866526 comment 12). Removing the MemShrink tag but leaving this crash bug open.
Whiteboard: [MemShrink:P1]
Severity: normal → S3