Closed Bug 1316476 Opened 8 years ago Closed 8 years ago

Limit the amount of device resets we allow with the GPU process

Categories

(Core :: Graphics, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla52
Tracking Status
firefox52 --- fixed

People

(Reporter: rhunt, Assigned: rhunt)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

We should try and detect situations where the device resets continuously and disable acceleration/the GPU process.
Milan, is this something that we would want for non-gpu process configurations? The code is slightly cleaner if we don't need to worry about that. But both are viable options.
Flags: needinfo?(milan)
Just the GPU process for now.
Flags: needinfo?(milan)
This introduces two new preferences, gfx.device-reset.limit, gfx.device-reset.threshold-ms, to configure a limit on the amount of device resets we allow in a session before we move to software.

If both prefs are set to -1 then we don't limit. If one of the prefs are set, we limit by that condition. If both of those prefs are set we limit when both conditions are true.

The conditions are resetsHappened > limit and timeSinceLastReset < threshold.

Once the limit has happened we disable the GPU process and kill it and move to software.
Attachment #8809500 - Flags: review?(dvander)
Comment on attachment 8809500 [details] [diff] [review]
limit-resets.patch

Review of attachment 8809500 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/ipc/GPUProcessManager.cpp
@@ +277,5 @@
> +
> +  // If we have both prefs set then it needs to trigger both limits,
> +  // otherwise we only test the pref that is set or none
> +  if (hasTimeLimit && hasCountLimit) {
> +    return triggeredTime && triggeredCount;

Maybe doing an || here is better?

::: gfx/ipc/GPUProcessManager.h
@@ +217,5 @@
>  
>    nsTArray<RefPtr<RemoteCompositorSession>> mRemoteSessions;
>    nsTArray<GPUProcessListener*> mListeners;
>  
> +  uint64_t mDeviceResetCount;

I hope we don't get 2^64-1 device resets in one session.

::: gfx/thebes/gfxPrefs.h
@@ +372,5 @@
>    // The zero default here should match QCMS_INTENT_DEFAULT from qcms.h
>    DECL_GFX_PREF(Live, "gfx.color_management.rendering_intent", CMSRenderingIntent, int32_t, 0);
>  
> +  DECL_GFX_PREF(Once, "gfx.device-reset.limit",                DeviceResetLimitCount, int32_t, -1);
> +  DECL_GFX_PREF(Once, "gfx.device-reset.threshold-ms",         DeviceResetThresholdMilliseconds, int32_t, -1);

Can we pick some semi-random values here to start with? I offer: 10 for limit, but keep no time threshold for now.
Attachment #8809500 - Flags: review?(dvander) → review+
Pushed by rhunt@eqrion.net:
https://hg.mozilla.org/integration/mozilla-inbound/rev/473160a4dbaf
Limit the amount of device resets we allow with the gpu process. r=dvander
https://hg.mozilla.org/mozilla-central/rev/473160a4dbaf
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
Comment on attachment 8809500 [details] [diff] [review]
limit-resets.patch

Review of attachment 8809500 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/ipc/GPUProcessManager.cpp
@@ +269,5 @@
> +  auto timeLimit = gfxPrefs::DeviceResetThresholdMilliseconds();
> +  auto countLimit = gfxPrefs::DeviceResetLimitCount();
> +
> +  bool hasTimeLimit = timeLimit != -1;
> +  bool hasCountLimit = countLimit != -1;

I'd probably just be safe and do timeLimit >= 0, countLimit >= 0, rather than comparing to -1.  Then you don't have to worry about somebody specifying -200 and uint32_t cast wrapping around.
(In reply to Milan Sreckovic [:milan] from comment #7)
> Comment on attachment 8809500 [details] [diff] [review]
> limit-resets.patch
> 
> Review of attachment 8809500 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: gfx/ipc/GPUProcessManager.cpp
> @@ +269,5 @@
> > +  auto timeLimit = gfxPrefs::DeviceResetThresholdMilliseconds();
> > +  auto countLimit = gfxPrefs::DeviceResetLimitCount();
> > +
> > +  bool hasTimeLimit = timeLimit != -1;
> > +  bool hasCountLimit = countLimit != -1;
> 
> I'd probably just be safe and do timeLimit >= 0, countLimit >= 0, rather
> than comparing to -1.  Then you don't have to worry about somebody
> specifying -200 and uint32_t cast wrapping around.

Yeah that's better. I filed bug 1317512 for this.
Just a thought, it might be better to make the reset limit proportional to the time the process has been running.

Otherwise you punish long running sessions unfairly (I frequently run firefox continuously for weeks).
That'd be very doable. We could make the reset limit pref be resets/hr or resets/minute or whatever makes sense for the final limits that are chosen.

It could also be added to the existing prefs. What do you think Milan?
Flags: needinfo?(milan)
Would the long sessions be taken care of by a non-zero time limit preference?  It won't matter how many resets, if they're more than, a few seconds apart, for instance.
Flags: needinfo?(milan)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: