Closed Bug 628129 Opened 13 years ago Closed 1 year ago

Make GPU blacklisting logic aware of dual GPU systems

Product/Component: Core :: Graphics
Type: defect
Platform: x86, Windows 7
Status: RESOLVED WORKSFORME
Tracking: blocking2.0 -
Reporter: scoobidiver
Assignee: Unassigned
References: Depends on 1 open bug, Blocks 2 open bugs
Attachments: 3 files

Dual video card systems (ATI/Intel or NVIDIA/Intel pairs) are becoming more and more common.
The computer manufacturer builds the graphics driver by combining the one from ATI (or NVIDIA) with the one from Intel, but the system reports only the ATI (or NVIDIA) graphics driver version. This can be confusing when the primary video card is Intel.
See:
https://bugzilla.mozilla.org/show_bug.cgi?id=599661#c0 (ATI as primary graphics)
https://bugzilla.mozilla.org/show_bug.cgi?id=599661#c4 (Intel as primary graphics)

Bug 590373 introduced a check between the driver and igd10umd32.dll versions that blocks HW acceleration when Intel is the primary video card at Firefox launch.

To enable HW acceleration when there are dual video cards:
* detect whether there are several video cards.
* show them all in about:support and in the crash report app notes.
* use the igd10umd32.dll version (on Windows Vista and 7) to determine the driver version of the Intel card instead of the DriverVersion read from the registry (the patch from bug 590373 will then become unnecessary).
* enable the D2D/D3D feature only if none of the video cards is blocklisted for this feature.
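
A minimal sketch of the last step in the list above, assuming a hypothetical Adapter record and a per-adapter blocklist flag that has already been computed. Neither name comes from the Firefox source; this only illustrates the "all adapters must pass" rule.

```cpp
#include <string>
#include <vector>

// Hypothetical record for one detected video card (illustrative names only).
struct Adapter {
  std::string vendorId;    // e.g. "8086" for Intel, "1002" for ATI
  bool blocklistedForD2D;  // assumed result of a per-adapter blocklist lookup
};

// Returns true only when at least one adapter was detected and none of the
// detected adapters is blocklisted for the feature.
bool IsD2DAllowed(const std::vector<Adapter>& adapters) {
  if (adapters.empty()) {
    return false;
  }
  for (const Adapter& adapter : adapters) {
    if (adapter.blocklistedForD2D) {
      return false;
    }
  }
  return true;
}
```

With this rule, an Intel/ATI pair is accelerated only when both entries pass the blocklist.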
The check introduced in bug 590373 only takes place when the VendorID is intel. So there only is a check at all when the primary GPU is Intel.

In this case, the check verifies that the Intel DriverVersion from the registry is equal to the igd10umd{32,64}.dll version, i.e. it's a basic check that the Intel driver is correctly installed. Again, since that check is only performed when the VendorID is intel, I don't think that there's any risk that it would get confused by dual-GPU setups (comparing ATI driver version to Intel DLL version).

So I think that HW acceleration is still enabled with dual video cards.

Would you agree with renaming this bug from "Enable HW acceleration with dual video cards" to "Make GPU blacklisting logic aware of dual GPU systems"?

I am still a bit fuzzy about how dual GPU systems behave. Can the GPU change under our feet at any time?
> The check introduced in bug 590373 only takes place when the VendorID is intel.
> So there only is a check at all when the primary GPU is Intel.
The "primary" term is probably inadequate, I should use "used".
If you look at https://bugzilla.mozilla.org/show_bug.cgi?id=599661#c4, the VendorId is Intel and the DriverVersion from the registry is equal to 8.641.1.1000, which is not at all an Intel driver version numbering, because it is the ATI one.
So the check introduced by bug 590373 compares 8.641.1.1000 with 8.15.10.2202, which are not equal, and as a consequence it blocks the D2D/D3D feature.
Summary: Enable HW acceleration with dual video cards → Make GPU blacklisting logic aware of dual GPU systems
Blocks: 601079
I consider the D2D blocklisting for all dual GPUs as blocking.
blocking2.0: --- → ?
Do we have data showing that we have many graphics driver crashes on dual-GPU systems? I would only block 2.0 if that is the case.
> Do we have data showing that we have many graphics driver crashes on dual-GPU
> systems?
Since the fix for bug 590373, all dual GPUs are blocklisted for D2D, preventing around 10% of all users from using HW acceleration although they have compatible GPUs and the required graphics driver versions.
I am asking to D2D-unblock dual GPUs with required graphics driver versions.
(In reply to comment #5)
> > Do we have data showing that we have many graphics driver crashes on dual-GPU
> > systems?
> From the fixing of bug 590373, all dual GPUs are blocklisted for D2D preventing
> around 10% of all users to use HW acceleration although they have compatible
> GPUs and required graphics driver versions.
> I am asking to D2D-unblock dual GPUs with required graphics driver versions.

Sorry, I forgot about the issue of _potentially_ mistakenly blocking users of dual GPU systems.

Do we have any evidence that we are really blocking _all_ of them? My understanding is that they were only potentially blocked, that is, I thought that we were at worst blocking a minority of them.

Indeed,
http://mxr.mozilla.org/mozilla-central/source/widget/src/windows/GfxInfo.cpp#360

this code basically says:

"if the driver vendor in windows registry is Intel,
 and the driver version in windows registry is different from
     the Intel DLL version,
 then block D3D10/D2D features."

So the only ways that this can block users, are:
 1. if the Windows registry is broken, in that it gives an Intel vendor ID and a non-Intel DriverVersion.
 2. if the Intel driver is incorrectly installed, in that the Intel DriverVersion from the registry is different from the Intel DLL version.

In both cases, the user's system is broken, so I don't think that we do a wrong thing by blocking it?
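
To make the rule concrete, here is a rough sketch of the check as described above. It is not the actual GfxInfo.cpp code: the function name and the plain string comparison are assumptions made for illustration.

```cpp
#include <string>

// Vendor ID that about:support reports for Intel adapters.
const char kIntelVendorId[] = "8086";

// Sketch of the rule: only Intel adapters are checked, and the check fails
// when the registry DriverVersion differs from the Intel DLL version.
bool FailsIntelVersionCheck(const std::string& vendorId,
                            const std::string& registryDriverVersion,
                            const std::string& intelDllVersion) {
  if (vendorId != kIntelVendorId) {
    return false;  // The check does not apply to non-Intel adapters.
  }
  return registryDriverVersion != intelDllVersion;
}
```

With the values from bug 599661 comment 4 (Intel vendor ID, registry version 8.641.1.1000, DLL version 8.15.10.2202) this returns true, which is case 1 in the list above.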

Am I missing something? Is there a way for correctly installed dual-GPU setups to get incorrectly blacklisted here?
(In reply to comment #2)
> > The check introduced in bug 590373 only takes place when the VendorID is intel.
> > So there only is a check at all when the primary GPU is Intel.
> The "primary" term is probably inadequate, I should use "used".
> If you look at https://bugzilla.mozilla.org/show_bug.cgi?id=599661#c4, the
> VendorId is Intel and the DriverVersion from the registry is equal to
> 8.641.1.1000, which is not at all an Intel driver version numbering, because it
> is the ATI one.

Oh! ok. I should have read that again.

OK, that's what I called case 1. in comment 6.

Is that legitimate? In this case, the Windows registry is self-contradictory: it reports an Intel VendorId and a non-Intel DriverVersion. Is that really how dual GPU systems are supposed to behave, or is that just a bug? I mean, it really looks like a bug, and one that compromises the basic sanity of the Windows registry. Should we really go out of our way to still enable hw accel on such a system? Do we have data showing that this happens on a lot of dual GPU systems?
> Is that really how dual GPU systems are supposed to behave, or is that just a
> bug?
I think so. There is only one driver version, and the VendorID changes according to the GPU in use.

> Do we have data showing that this happens on a lot of dual GPU systems?
Now that we have the driver version in crash reports, I think it is easy to check in the 4.0b10 crash stats what percentage of Intel GPUs on Windows Vista/7 have a driver version matching the pattern 8.abc.y.z (ATI) or 8.17.ab.cd (NVIDIA), relative to all Intel GPUs.
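
A possible way to bucket the driver versions for such a query, sketched under the assumption that "8.abc.y.z" means a three-character second field and "8.17.ab.cd" means a second field of exactly 17. The enum and function names are hypothetical, and real version numbering may have exceptions that this does not cover.

```cpp
#include <sstream>
#include <string>
#include <vector>

enum class VersionFamily { AtiLike, NvidiaLike, Other };

// Splits a dotted version string such as "8.641.1.1000" into its fields.
std::vector<std::string> SplitVersion(const std::string& version) {
  std::vector<std::string> fields;
  std::stringstream stream(version);
  std::string field;
  while (std::getline(stream, field, '.')) {
    fields.push_back(field);
  }
  return fields;
}

// Classifies a driver version by the patterns mentioned in the comment above.
VersionFamily ClassifyDriverVersion(const std::string& version) {
  const std::vector<std::string> fields = SplitVersion(version);
  if (fields.size() != 4 || fields[0] != "8") {
    return VersionFamily::Other;
  }
  if (fields[1] == "17") {
    return VersionFamily::NvidiaLike;  // 8.17.ab.cd
  }
  if (fields[1].size() == 3) {
    return VersionFamily::AtiLike;  // 8.abc.y.z
  }
  return VersionFamily::Other;  // e.g. Intel's 8.15.10.x
}
```

Counting the Intel-vendor crash reports whose version lands in AtiLike or NvidiaLike would give the percentage in question.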
Sad, but not a blocker.
blocking2.0: ? → -
Blocks: 596144
Using the Intel attached file in bug 623317, I found that 0.4% (resp. 0.1%) of crash reports have an ATI (resp. NVIDIA) driver version with an Intel GPU on Windows Vista/7.
Blocks: 635464
Please kindly consider raising the status of this bug.

I have a new Dell Alienware M11x-R3 - popular, not expensive Core i7 dual-gpu gaming computer. Allows me to switch to Intel gpu when on battery. Nice for longer battery life. My guess is that this type of configuration will become quite popular this year.

No matter what settings I have in about:config I cannot get Nvidia gpu to load on FF 4.0.1 or 5 beta.

Chrome uses Nvidia without issue or crashes. So I am doing all my WebGL on Chrome. But would prefer using FF...
I agree now that we should prioritize this bug. Bug 667437 shows that failing to detect dual GPUs makes us hit bug 635044 i.e. we fail to disable layers on NVIDIA NVS 3100M as we should.
@ Benoit

Thank you. Yes, I have been to that page. I just did not want to bore you with all my failures.

I have tried most every permutation of settings that appear when you search on "webgl" in about:config - including: webgl.force-enabled=true

I have also set Nvidia as the preferred graphics processor for FF in:

Nvidia Control Panel > Manage 3D Settings

Here is the interesting new thing:

The last time I diddled with all the webgl settings (over a week ago) all I would get is a silent fail. There was no "your computer does not support webGL" message nor did the display show anything.

The new thing (with 5.0 beta) is that when I set webgl.force-enabled=true FF really does crash.

Here is what about:support says:

Adapter Description Intel(R) HD Graphics Family
Vendor ID 8086
Device ID 0116
Adapter RAM Unknown
Adapter Drivers igdumd64 igd10umd64 igd10umd64 igdumdx32 igd10umd32 igd10umd32
Driver Version 8.15.10.2342
Driver Date 3-25-2011
Direct2D Enabled true
DirectWrite Enabled true (6.1.7601.17563, font cache n/a)
WebGL Renderer Google Inc. -- ANGLE -- OpenGL ES 2.0 (ANGLE 0.0.0.611)
GPU Accelerated Windows 1/1 Direct3D 10

So FF still seems to be using the Intel graphics processor. 

Another thing I notice in about:config is this:

gfx.blacklist.webgl.angle is set to 3.

**

I hope this helps. Feel free to ask for more data.

Please forgive me for writing this message using Chrome - because of all the crashes verifying what I am saying in this message...
Interesting; I have no idea how to force Firefox to use the Optimus cards instead of the Intel one, but we should look into that. Your crash is probably in the Intel OpenGL driver (a crash link would allow us to confirm that). The reason gfx.blacklist.webgl.angle is set is that ANGLE is blacklisted on Optimus and at some time in the past your Optimus card was used. Caching this blacklisting decision only makes the situation more complicated; bug 653102 is about fixing that.
@ Benoit

>>  I have no idea how to force Firefox to use the Optimus cards instead of the Intel one, but we should look into that.

Yes, please! Yes.

>> a crash link would allow to confirm that

I'm too new here to know if there is a preferred way of doing this. Anyway: here is a crash log pasted into a Google Doc.

http://goo.gl/eXAET
Theo, if you go to about:crashes in your browser, you will see a bunch of links with names like bp-1f30c477-3ba3-4b38-9829-26cdb2110628. If you just copy those names into Bugzilla, it's smart enough to link them to the crash report Firefox submitted for you.
This patch checks the Windows registry for the presence of dual GPUs. If dual GPUs are present:
- D3D9 is used to determine which GPU is active
- the blacklisting logic is applied to the active GPU
- both the active and inactive GPUs are listed in about:support and in crash reports
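
The ordering step could look roughly like the sketch below. The struct and function names are made up for illustration and are not the ones in the attached patch; the "active" IDs stand for whatever the graphics API reports (D3D9 in this patch).

```cpp
#include <string>
#include <utility>

// Hypothetical description of one GPU as read from the registry.
struct GpuInfo {
  std::string vendorId;
  std::string deviceId;
};

// The blacklisting logic would look only at `active`; `inactive` is kept so
// that both GPUs can still be listed in about:support and crash reports.
struct DualGpuState {
  GpuInfo active;
  GpuInfo inactive;
};

// If the graphics API reports the second GPU as the one in use, swap the two
// slots so that `active` always describes the GPU that is really rendering.
DualGpuState OrderByActiveGpu(DualGpuState state,
                              const std::string& activeVendorId,
                              const std::string& activeDeviceId) {
  const bool secondIsActive = state.inactive.vendorId == activeVendorId &&
                              state.inactive.deviceId == activeDeviceId;
  if (secondIsActive) {
    std::swap(state.active, state.inactive);
  }
  return state;
}
```

After this step the rest of the code can treat `active` exactly as it would the only GPU on a single-GPU system.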
@ Ali

This is cool. 

If I am reading your patch correctly, your code is the first step in solving the problem.

It identifies dual GPU systems, and blacklists those with issues, but it does not change the way that Firefox passes data in and out of the GPUs. 

Am I right in thinking this is just the first [and necessary] step in solving the problem? 

In any case, thank you for getting the ball rolling.
@ Theo

The code determines which GPU Firefox is really using, and then proceeds accordingly (that is, enables/disables features in the same way we would if the active GPU was the only GPU).

This is indeed a first step, in the sense that we still need to test this logic against a large number of dual GPU systems to see if there are other quirks that need to be taken into account.
Attachment #544601 - Flags: review?(jmuizelaar)
Comment on attachment 544601 [details] [diff] [review]
handle dual GPUs on Windows

Review of attachment 544601 [details] [diff] [review]:
-----------------------------------------------------------------

::: widget/public/nsIGfxInfo.idl
@@ +67,4 @@
>    
>    readonly attribute unsigned long adapterDeviceID;
> +  readonly attribute unsigned long adapterDeviceID2;
> +

We should eventually switch to a list of devices instead of numbering them explicitly. You should add a comment to say this.

::: widget/src/windows/GfxInfo.cpp
@@ +215,5 @@
> +  temp = aFirst;
> +  aFirst = aSecond;
> +  aSecond = temp;
> +}
> +

You should be able to use std::swap (from <algorithm>) here.

@@ +253,5 @@
> +    Swap(mDeviceID, mDeviceID2);
> +    Swap(mAdapterVendorID, mAdapterVendorID2);
> +    Swap(mAdapterDeviceID, mAdapterDeviceID2);
> +  } 
> +}

I worry that this information might not match what we get if we use D3D10 instead of D3D9. Are you sure using D3D9 will always give us the right information?

@@ +466,5 @@
>                  mDriverDate = value;
> +              RegCloseKey(key); 
> +
> +              //Check for second adapter.
> +              if (driverKey[driverKey.Length()-1] == '0') {

What if driverKey.Length() ends up being 0?

@@ +470,5 @@
> +              if (driverKey[driverKey.Length()-1] == '0') {
> +                driverKey.SetCharAt('1', driverKey.Length()-1);
> +              } else {
> +                driverKey.SetCharAt('0', driverKey.Length()-1);
> +              }

Can you add a higher level comment about what's happening here? Are you trying to swap a '0' and '1'?
Attachment #544601 - Flags: review?(jmuizelaar) → review-
> I worry that this information might not match what we get if we use D3D10
> instead of D3D9. Are you sure using D3D9 will always give us the right
> information?

No, I'm not sure that D3D9 will always be right, but I'm also not sure if there's a good way to resolve this question. Is D3D10 likely to be a better starting point for this?
Assignee: nobody → ajuma
Depends on: 591057
Given the uncertainty we have about the information we get from D3D9 (e.g. can we be certain that D3D9 and D3D10 will always get the same GPU), I've created a revised version of Attachment 544601 [details] [diff] that doesn't change the blacklisting logic for now -- see Attachment 549122 [details] [diff], bug 591057.
Depends on: 675962
Depends on: 679110
I downloaded the source for Firefox 8.0b6 and was looking at the current dual GPU logic detection in GfxInfo.cpp and it appears to be incorrect for my system.

I was attempting to integrate the logic into an older version of the GfxInfo.cpp file, but stepping through it in the debugger shows that the following assumption is incorrect:

>>>
 GfxInfo::Init()
...
              // Check for second adapter:
              //
              // A second adapter will have the same driver key as the first adapter except for 
              // the last character, where '1' will be swapped for '0' or vice-versa.
              // We know driverKey.Length() > 0 since driverKeyPre is a prefix of driverKey.
              if (driverKey[driverKey.Length()-1] == '0') {
                driverKey.SetCharAt('1', driverKey.Length()-1);
              } else {
                driverKey.SetCharAt('0', driverKey.Length()-1);
              }

<<<

My Windows7 system has had various cards installed in it with various updates of drivers. At the moment it just has a single ATI 6800 in it.

The code correctly identifies this as the primary adaptor. However the reg key for it is HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4D36E968-E325-11CE-BFC1-08002BE10318}\0003

Under 
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4D36E968-E325-11CE-BFC1-08002BE10318}

there are two other keys: \0000 for an ATI HD 5800 and \0001 for an HD 5700, neither of which is installed.

So the code picks \0000 and thinks it has two ATI cards in.
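
To reproduce the problem outside Firefox, here is a small standalone sketch that mirrors the heuristic quoted above using std::string instead of the Mozilla string class. The function name is made up for this example.

```cpp
#include <string>

// Mirrors the quoted heuristic: the "second adapter" key is guessed by
// flipping the last character of the driver key between '0' and '1'.
std::string GuessSecondAdapterKey(std::string driverKey) {
  if (driverKey.empty()) {
    return driverKey;  // Nothing to flip.
  }
  char& last = driverKey.back();
  last = (last == '0') ? '1' : '0';
  return driverKey;
}
```

For a key ending in \0003 the guess ends in \0000, which on my system is the stale entry left behind by the HD 5800.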


I downloaded FF Aurora 9.0a2 (2011-11-02) and about:support also incorrectly lists two cards:
>>>
Graphics

        Adapter Description
        AMD Radeon HD 6800 Series

        Vendor ID
        1002

        Device ID
        6739

        Adapter RAM
        1024

        Adapter Drivers
        aticfx64 aticfx64 aticfx64 aticfx32 aticfx32 aticfx32 atiumd64 atidxx64 atidxx64 atiumdag atidxx32 atidxx32 atiumdva atiumd6a atitmm64

        Driver Version
        8.892.0.0

        Driver Date
        9-8-2011

        Adapter Description (GPU #2)
        ATI Radeon HD 5800 Series

        Vendor ID (GPU #2)
        1002

        Device ID (GPU #2)
        6899

        Adapter RAM (GPU #2)
        1024

        Adapter Drivers (GPU #2)
        aticfx64 aticfx64 aticfx64 aticfx32 aticfx32 aticfx32 atiumd64 atidxx64 atidxx64 atiumdag atidxx32 atidxx32 atiumdva atiumd6a atitmm64

        Driver Version (GPU #2)
        8.892.0.0

        Driver Date (GPU #2)
        9-8-2011

        Direct2D Enabled
        true

        DirectWrite Enabled
        true (6.1.7601.17563)

        ClearType Parameters
        ClearType parameters not found

        WebGL Renderer
        Google Inc. -- ANGLE (AMD Radeon HD 6800 Series) -- OpenGL ES 2.0 (ANGLE 0.0.0.740)

        GPU Accelerated Windows
        1/1 Direct3D 10
<<<
(In reply to michaelbraithwaite from comment #25)
> I downloaded the source for Firefox 8.0b6 and was looking at the current
> dual GPU logic detection in GfxInfo.cpp and it appears to be incorrect for
> my system.

Thanks for the report. As you demonstrate, the approach of swapping '0' for '1' in the driver key is indeed flawed. Bug 679110, Attachment 571775 [details] [diff] replaces this approach with enumerating devices in the display adapter device interface class.
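
The idea behind enumerating instead of guessing can be sketched as follows. This is a platform-neutral illustration only: the AdapterEntry record and its `present` flag are hypothetical, and the real enumeration goes through Windows device APIs that are not shown here.

```cpp
#include <string>
#include <vector>

// Hypothetical view of one entry under the display adapter class key.
struct AdapterEntry {
  std::string registrySubkey;  // e.g. "0003"
  std::string description;
  bool present;                // assumed flag: device is currently installed
};

// Keeps only the adapters that are actually present, instead of assuming
// that a neighbouring subkey describes a second GPU.
std::vector<AdapterEntry> PresentAdapters(const std::vector<AdapterEntry>& all) {
  std::vector<AdapterEntry> result;
  for (const AdapterEntry& entry : all) {
    if (entry.present) {
      result.push_back(entry);
    }
  }
  return result;
}
```

On the system from comment 25, only the \0003 entry would survive, so about:support would list a single GPU.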
Also, an integrated single-GPU Intel card (2a02 & 2a03) looks like a dual GPU in about:support.
(In reply to gader from comment #28)
> Also integrated single GPU intel card (2a02&2a03) looks like dual GPU on
> about:support.
Not in Aurora and Nightly.
Depends on: 724874
Assignee: ajuma.bugzilla → nobody
Blocks: 839820
Blocks: 1056116
Attached image 300zoombug.png
I encountered a font rendering problem on both Firefox 41.0b9 and Nightly 43.0a1 (2015-09-12) on this webpage - http://www.xda-developers.com/hello-moto-what-are-you-doing-an-opinion-on-motorolas-transformations/  

The issue seems to be caused by DirectWrite not being enabled on both of these Firefox versions.
about:support Graphics Section says:
Direct2D Enabled	Blocked for your graphics driver version.
DirectWrite Enabled	false (6.2.9200.17461)


It was fixed by enabling these two options in about:config on Firefox 41.0b9:
gfx.direct2d.force-enabled - true
gfx.font_rendering.directwrite.enabled - true (Nightly 43 doesn't have this option but works by enabling the other)

about:support Graphics Section now says:
Direct2D Enabled	true
DirectWrite Enabled	true (6.2.9200.17461)


Dual GPU Details:
Adapter Description	        Intel(R) HD Graphics Family
Adapter Description (GPU #2)	NVIDIA GeForce GT 525M 
Driver Date	                4-10-2011
Driver Date (GPU #2)	        7-22-2015
Driver Version	                8.15.10.2361
Driver Version (GPU #2)	        10.18.13.5362

I was lucky to have stumbled upon this page, where I found the solution to the problem: https://wiki.mozilla.org/Blocklisting/Blocked_Graphics_Drivers

I feel DirectWrite needs to be enabled for dual-GPU systems because almost all new laptops come with a dual-GPU configuration.
Lots of users could be missing out on good font rendering and be served inferior-quality font rendering instead.

Until I visited the xda page I had no clue that this was happening. After figuring out the fix with the about:config settings, I disabled them again and went back to my commonly used sites, and was surprised that the problem wasn't visible there because most webpages use small fonts. Zooming into the page clearly reveals the flaws on popular sites like reddit.com.

Attachment is zoomed in to 300% to show the flaws. Happens on standard zoom as well. Left is Firefox.
Attached image nobug300.png
This is after force-enabling the two settings. Both browsers look the same, with much improved font rendering all round.
Severity: normal → S3

The severity field for this bug is relatively low, S3. However, the bug has 12 votes.
:bhood, could you consider increasing the bug severity?

For more information, please visit auto_nag documentation.

Flags: needinfo?(bhood)

The last needinfo from me was triggered in error by recent activity on the bug. I'm clearing the needinfo since this is a very old bug and I don't know if it's still relevant.

Flags: needinfo?(bhood)

Based on comment 31, the OP appears to have gotten a satisfactory workaround to the issue. Given this, and that their account is now disabled so I cannot ask for follow-up, I'm closing this.

Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → WORKSFORME