Closed Bug 1264736 Opened 6 years ago Closed 3 years ago

crash in mozilla::gfx::DrawTargetD2D1::FinalizeDrawing when mCommandList is null

Categories

(Core :: Graphics, defect, P3)

x86
Windows NT
defect

Tracking

()

RESOLVED INACTIVE
Tracking Status
firefox-esr45 --- affected

People

(Reporter: milan, Unassigned)

Details

(Keywords: crash, Whiteboard: [gfx-noted])

Crash Data

Attachments

(1 file)

This bug was filed from the Socorro interface and is 
report bp-22e54954-5aff-410e-8f24-21a952160413.
=============================================================

We had other FinalizeDrawing crashes before, and they have no turned into GetImageForLayerContent crashes instead; the one above is a different flavour, where the command list is null.
Do we know why we'd have mCommandList be null?
Flags: needinfo?(bas)
Whiteboard: [gfx-noted]
The only scenario I can think of is OOM.
Flags: needinfo?(bas)
Comment on attachment 8743025 [details]
MozReview Request: Bug 1264736: Crash sooner if we can't get a valid command list, at least in nightly/aurora. r?bas

https://reviewboard.mozilla.org/r/47525/#review44455

::: gfx/2d/DrawTargetD2D1.cpp:1211
(Diff revision 1)
>    mUsedCommandListsSincePurge++;
>  
> +  // This is where we should have a valid command list.  If we don't, something is
> +  // wrong, and it's likely an OOM.
> +  if (!mCommandList) {
> +    gfxDevCrash(LogReason::InvalidCommandList) << "Invalid D2D1.1 command list on creation " << mUsedCommandListsSincePurge;

Let's log the HRESULT from the Create call as well.
Attachment #8743025 - Flags: review?(bas) → review+
Comment on attachment 8743025 [details]
MozReview Request: Bug 1264736: Crash sooner if we can't get a valid command list, at least in nightly/aurora. r?bas

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/47525/diff/1-2/
Assignee: nobody → milan
Keywords: leave-open
The only post 46 crash with this signature was the original one.  Not sure how much we're going to catch with this change, but I'll keep watching.
Flags: needinfo?(milan)
Got one!  https://crash-stats.mozilla.com/report/index/883dd2ea-b8f0-49a5-94d5-9af822160426

Bas, you're right, this is an out of memory error when creating a command list (0x8007000e = E_OUTOFMEMORY)

We are also reporting 104327 as the value for mUsedCommandListsSincePurge, which suggests that we haven't called DrawTargetD2D1::Flush() recently, which would have reset that count to 0 if it went higher than the current constant 25.

Not knowing the code - should mDC->Flush() at https://dxr.mozilla.org/mozilla-central/source/gfx/2d/DrawTargetD2D1.cpp#1382 be just Flush()?
Flags: needinfo?(milan) → needinfo?(bas)
(In reply to Milan Sreckovic [:milan] from comment #9)
> Got one! 
> https://crash-stats.mozilla.com/report/index/883dd2ea-b8f0-49a5-94d5-
> 9af822160426
> 
> Bas, you're right, this is an out of memory error when creating a command
> list (0x8007000e = E_OUTOFMEMORY)
> 
> We are also reporting 104327 as the value for mUsedCommandListsSincePurge,
> which suggests that we haven't called DrawTargetD2D1::Flush() recently,
> which would have reset that count to 0 if it went higher than the current
> constant 25.
> 
> Not knowing the code - should mDC->Flush() at
> https://dxr.mozilla.org/mozilla-central/source/gfx/2d/DrawTargetD2D1.
> cpp#1382 be just Flush()?

Hrm, no, it actually shouldn't. How this bug could happen is by drawing to a Canvas and not doing anything with that canvas for a very long time (i.e. it hasn't been drawn to the screen, nor copied to another canvas or used in any other way). This seems silly, but it's not like we've never seen web authors do silly things before. We could insert an occasional flush in PrepareForDrawing if a lot of command lists have been used. It's a shame for performance but the impact shouldn't be huge and that way we would catch this crash.
Flags: needinfo?(bas)
OK, we can wait for a few more crashes, to see if all of them are coming from 2D canvas.  If it is just canvas, is the Flush() that we're missing the one in PreTransactionCallback?  Or the one in DrawWindow?
Flags: needinfo?(milan)
Crash volume for signature 'mozilla::gfx::DrawTargetD2D1::FinalizeDrawing':
 - nightly (version 50): 0 crashes from 2016-06-06.
 - aurora  (version 49): 0 crashes from 2016-06-07.
 - beta    (version 48): 0 crashes from 2016-06-06.
 - release (version 47): 1053 crashes from 2016-05-31.
 - esr     (version 45): 182 crashes from 2016-04-07.

Crash volume on the last weeks:
            W. N-1  W. N-2  W. N-3  W. N-4  W. N-5  W. N-6  W. N-7
 - nightly       0       0       0       0       0       0       0
 - aurora        0       0       0       0       0       0       0
 - beta          0       0       0       0       0       0       0
 - release     133     141     123     117     156     157     143
 - esr          14      15      20      20      24      18      20

Affected platform: Windows
https://crash-stats.mozilla.com/report/index/aa6d9769-ae87-46e4-82f2-1a3062160929#tab-metadata and https://crash-stats.mozilla.com/report/index/97fcc657-38a7-4911-8963-a84fa2160928#tab-metadata, for example, demonstrate this problem as mUsedCommandListsSincePurge goes into six digits.  Are we incorrectly resetting mUsedCommandListsSincePurge, or are we really accumulating 150k+ of them?

The failures are indeed OOM.
Flags: needinfo?(milan) → needinfo?(bas)
(In reply to Milan Sreckovic [:milan] from comment #13)
> https://crash-stats.mozilla.com/report/index/aa6d9769-ae87-46e4-82f2-
> 1a3062160929#tab-metadata and
> https://crash-stats.mozilla.com/report/index/97fcc657-38a7-4911-8963-
> a84fa2160928#tab-metadata, for example, demonstrate this problem as
> mUsedCommandListsSincePurge goes into six digits.  Are we incorrectly
> resetting mUsedCommandListsSincePurge, or are we really accumulating 150k+
> of them?
> 
> The failures are indeed OOM.

Alright, I'll work on the solution described in comment 10.
Flags: needinfo?(bas)
:nical, is your "needs flushing" bug related?
Flags: needinfo?(nical.bugzilla)
(In reply to Milan Sreckovic [:milan] from comment #15)
> :nical, is your "needs flushing" bug related?

Mine is on the compositor side so not directly related although it is a similar kind of story.
Flags: needinfo?(nical.bugzilla)
Priority: -- → P3
Assignee: milaninbugzilla → nobody
The leave-open keyword is there and there is no activity for 6 months.
:davidb, maybe it's time to close this bug?
Flags: needinfo?(dbolter)
I think we have about 7 other bugs open related to canvas and OOM scenarios so I'll close this one inactive. Reopen if necessary.
Status: NEW → RESOLVED
Closed: 3 years ago
Flags: needinfo?(dbolter)
Resolution: --- → INACTIVE
You need to log in before you can comment on or make changes to this bug.