Closed Bug 1297568 Opened 3 years ago Closed 3 years ago

Fix random asserts when the GPU process dies

Categories

(Core :: Graphics: Layers, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla52
Tracking Status
firefox52 --- fixed

People

(Reporter: dvander, Assigned: mattwoodrow, NeedInfo)

References

(Blocks 1 open bug)

Details

(Whiteboard: [gfx-noted])

Attachments

(5 files)

Whiteboard: [gfx-noted]
Assignee: nobody → gwright
Assignee: gwright → matt.woodrow
With this patch and the ones from bug 1305361, we can pretty consistently crash the GPU process and have the content process survive.
Attachment #8795572 - Flags: review?(dvander)
Comment on attachment 8795572 [details] [diff] [review]
remove-assertions

Review of attachment 8795572 [details] [diff] [review]:
-----------------------------------------------------------------

I should probably have uploaded my own patches which are basically the same as this, but I had replaced the gfxCriticalError() calls with gfxCriticalNote. Afaik, gfxCriticalNote will still log in about:support and crash reports, but will not assert. I think there's still value in having the errors logged if the GPU process goes away.
Those were my patches; I'd erred more on the side of keeping asserts if we can, and keeping logging if we can.
Attachment #8795572 - Flags: review?(dvander) → review+
Attachment #8795582 - Flags: review?(dvander)
Attachment #8795583 - Flags: review?(dvander)
Attachment #8795584 - Flags: review?(dvander)
Attachment #8795585 - Flags: review?(dvander)
Attachment #8795585 - Flags: review?(dvander) → review+
Comment on attachment 8795584 [details] [diff] [review]
0003-Bug-1297568-Downgrade-an-IPC-channel-error-from-gfxC.patch

Review of attachment 8795584 [details] [diff] [review]:
-----------------------------------------------------------------

The IPC errors I'd rather just remove than downgrade. With the GPU process, we're assuming they can fail and we'll recover, so no need to have extra warning spam.
Attachment #8795584 - Flags: review?(dvander)
Comment on attachment 8795583 [details] [diff] [review]
0002-Bug-1297568-Don-t-begin-a-layer-transaction-if-our-I.patch

Review of attachment 8795583 [details] [diff] [review]:
-----------------------------------------------------------------

This seems reasonable but Matt knows this code better than I.
Attachment #8795583 - Flags: review?(dvander) → review?(matt.woodrow)
Comment on attachment 8795583 [details] [diff] [review]
0002-Bug-1297568-Don-t-begin-a-layer-transaction-if-our-I.patch

Review of attachment 8795583 [details] [diff] [review]:
-----------------------------------------------------------------

This seems fine, but I don't think it actually gains us too much.

It would be nice if we had a return value here to indicate failure so that the caller could abandon painting entirely at this point. I understand not wanting to add that right now though.

I think we still need the change to ShadowLayers.cpp I had in my patch too btw.
Attachment #8795583 - Flags: review?(matt.woodrow) → review+
Comment on attachment 8795572 [details] [diff] [review]
remove-assertions

Review of attachment 8795572 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/ipc/ShadowLayers.cpp
@@ +691,5 @@
>    }
>  
>    AutoTArray<Edit, 10> cset;
>    size_t nCsets = mTxn->mCset.size() + mTxn->mPaints.size();
> +  if (nCsets == 0 && !mTxn->RotationChanged()) {

Just explaining this change. mTxn is considered 'empty' if it has no mutants, no changesets and no paints. We return early from this function if the transaction is empty.

The code just before this return processes the mutants in the transaction, and adds a changeset (OpSetLayerAttributes) for each one.

The existing assertion here was checking that we now had at least one of (changesets + paints), since that should have been guaranteed by processing all mutants.

This process is now fallible (since we might have failed to create the Shadow for a layer), so the assertion can fail.
Pushed by gwright@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e71083772164
Don't begin a layer transaction if our IPC channel is down r=mattwoodrow
https://hg.mozilla.org/integration/mozilla-inbound/rev/724705bf2482
Downgrade buffer creation failures if we're a reasonable size, and never assert r=dvander
Pushed by mwoodrow@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e7d22a910d1b
Remove some invalid assertions that can happen when the GPU process crashes. r=dvander
Some of these assertions or crash reporter messages are very valid and required if we're *not* in the GPU world.  By removing them for everybody, we're missing information in the cases where we could use it.

Please revisit these and understand which ones are not required at all, and which ones are just not required when we're in the GPU process world.

For example, we lost a diagnostic message on a "180 crashes per week" open bug 1088300 because that message got removed in this bug.  Chances are, there are more like that.

Separate bug is fine, but this needs to be revisited.
Flags: needinfo?(matt.woodrow)
Flags: needinfo?(gwright)
Flags: needinfo?(dvander)
If checks are intended to be diagnostics, we should make sure to add a comment with a reference to the bug #. That would have made it more clear here. Anyway, I'll comment in bug 1088300.
Flags: needinfo?(dvander)
I've commented in bug 1088300 as well.
Flags: needinfo?(gwright)
It's more of a general conversation, rather than particular to that bug.  If the assert/error/whatever is OK *only* in the "we're using the GPU process" scenario, then we should treat it that way.
You need to log in before you can comment on or make changes to this bug.