Limit the amount of device resets we allow with the GPU process

RESOLVED FIXED in Firefox 52

Status

()

Core
Graphics
RESOLVED FIXED
a year ago
a year ago

People

(Reporter: rhunt, Assigned: rhunt)

Tracking

(Blocks: 1 bug)

unspecified
mozilla52
Points:
---

Firefox Tracking Flags

(firefox52 fixed)

Details

Attachments

(1 attachment)

(Assignee)

Description

a year ago
We should try and detect situations where the device resets continuously and disable acceleration/the GPU process.
(Assignee)

Comment 1

a year ago
Milan, is this something that we would want for non-gpu process configurations? The code is slightly cleaner if we don't need to worry about that. But both are viable options.
Flags: needinfo?(milan)
Just the GPU process for now.
Flags: needinfo?(milan)
(Assignee)

Comment 3

a year ago
Created attachment 8809500 [details] [diff] [review]
limit-resets.patch

This introduces two new preferences, gfx.device-reset.limit, gfx.device-reset.threshold-ms, to configure a limit on the amount of device resets we allow in a session before we move to software.

If both prefs are set to -1 then we don't limit. If one of the prefs are set, we limit by that condition. If both of those prefs are set we limit when both conditions are true.

The conditions are resetsHappened > limit and timeSinceLastReset < threshold.

Once the limit has happened we disable the GPU process and kill it and move to software.
Attachment #8809500 - Flags: review?(dvander)
Comment on attachment 8809500 [details] [diff] [review]
limit-resets.patch

Review of attachment 8809500 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/ipc/GPUProcessManager.cpp
@@ +277,5 @@
> +
> +  // If we have both prefs set then it needs to trigger both limits,
> +  // otherwise we only test the pref that is set or none
> +  if (hasTimeLimit && hasCountLimit) {
> +    return triggeredTime && triggeredCount;

Maybe doing an || here is better?

::: gfx/ipc/GPUProcessManager.h
@@ +217,5 @@
>  
>    nsTArray<RefPtr<RemoteCompositorSession>> mRemoteSessions;
>    nsTArray<GPUProcessListener*> mListeners;
>  
> +  uint64_t mDeviceResetCount;

I hope we don't get 2^64-1 device resets in one session.

::: gfx/thebes/gfxPrefs.h
@@ +372,5 @@
>    // The zero default here should match QCMS_INTENT_DEFAULT from qcms.h
>    DECL_GFX_PREF(Live, "gfx.color_management.rendering_intent", CMSRenderingIntent, int32_t, 0);
>  
> +  DECL_GFX_PREF(Once, "gfx.device-reset.limit",                DeviceResetLimitCount, int32_t, -1);
> +  DECL_GFX_PREF(Once, "gfx.device-reset.threshold-ms",         DeviceResetThresholdMilliseconds, int32_t, -1);

Can we pick some semi-random values here to start with? I offer: 10 for limit, but keep no time threshold for now.
Attachment #8809500 - Flags: review?(dvander) → review+

Comment 5

a year ago
Pushed by rhunt@eqrion.net:
https://hg.mozilla.org/integration/mozilla-inbound/rev/473160a4dbaf
Limit the amount of device resets we allow with the gpu process. r=dvander

Comment 6

a year ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/473160a4dbaf
Status: NEW → RESOLVED
Last Resolved: a year ago
status-firefox52: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
Comment on attachment 8809500 [details] [diff] [review]
limit-resets.patch

Review of attachment 8809500 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/ipc/GPUProcessManager.cpp
@@ +269,5 @@
> +  auto timeLimit = gfxPrefs::DeviceResetThresholdMilliseconds();
> +  auto countLimit = gfxPrefs::DeviceResetLimitCount();
> +
> +  bool hasTimeLimit = timeLimit != -1;
> +  bool hasCountLimit = countLimit != -1;

I'd probably just be safe and do timeLimit >= 0, countLimit >= 0, rather than comparing to -1.  Then you don't have to worry about somebody specifying -200 and uint32_t cast wrapping around.
(Assignee)

Comment 8

a year ago
(In reply to Milan Sreckovic [:milan] from comment #7)
> Comment on attachment 8809500 [details] [diff] [review]
> limit-resets.patch
> 
> Review of attachment 8809500 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: gfx/ipc/GPUProcessManager.cpp
> @@ +269,5 @@
> > +  auto timeLimit = gfxPrefs::DeviceResetThresholdMilliseconds();
> > +  auto countLimit = gfxPrefs::DeviceResetLimitCount();
> > +
> > +  bool hasTimeLimit = timeLimit != -1;
> > +  bool hasCountLimit = countLimit != -1;
> 
> I'd probably just be safe and do timeLimit >= 0, countLimit >= 0, rather
> than comparing to -1.  Then you don't have to worry about somebody
> specifying -200 and uint32_t cast wrapping around.

Yeah that's better. I filed bug 1317512 for this.
Just a thought, it might be better to make the reset limit proportional to the time the process has been running.

Otherwise you punish long running sessions unfairly (I frequently run firefox continuously for weeks).
(Assignee)

Comment 10

a year ago
That'd be very doable. We could make the reset limit pref be resets/hr or resets/minute or whatever makes sense for the final limits that are chosen.

It could also be added to the existing prefs. What do you think Milan?
Flags: needinfo?(milan)
Would the long sessions be taken care of by a non-zero time limit preference?  It won't matter how many resets, if they're more than, a few seconds apart, for instance.
Flags: needinfo?(milan)
You need to log in before you can comment on or make changes to this bug.