Closed Bug 702504 Opened 8 years ago Closed 7 years ago

Win7 boxes are not running with large enough screen to run accelerated layer reftests

Categories

(Core :: Graphics, defect, critical)

x86
Windows 7
defect
Not set
critical

Tracking

RESOLVED FIXED
mozilla22

People

(Reporter: cmtalbert, Assigned: mattwoodrow)

Attachments

(3 files, 2 obsolete files)

Bas noticed that we are not running Windows 7 with accelerated layers during our reftests. We should be, and we once were, but something has lowered the screen resolution, and it is now too low to run these tests.

I have a hunch that this may have happened when we got the new Mac minis in and re-imaged the old Mac minis to be Windows slaves. We may not have set the resolution high enough on the newly imaged Windows minis.

At any rate, we need to up the resolution to at least 800x1000.

This is pretty critical because we're missing test coverage right now, and Bas is saying that we're failing accelerated tests, so once you make this change, expect things to go orange until those tests are fixed again.  That should be the behavior you expect.  CC'ing Bas so we can work with him and perhaps co-land this resolution change with a patch to fix the broken tests.
Sorry for the spam, meant to include a link to a test log.

Here it is: https://tbpl.mozilla.org/php/getParsedLog.php?id=7388046&tree=Firefox&full=1

Search for USE_WIDGET_LAYERS, and you'll see the error.
We haven't repurposed the rev 3 machine yet if that's what you are referring to
This is run on every win7 talos slave before buildbot is started:
# from Desktop\startTalos.bat [1]
cd C:\
nircmd.exe setdisplay 1280 1024 32
which is good enough.

I have VNCed into a slave and the resolution is 1024x768.

If I run the nircmd.exe command myself I don't see any screen resolution being changed. Not sure if it is an artifact of VNC or if we're really broken.
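
One way to take VNC out of the picture would be to ask Windows itself for the primary display size right after the nircmd.exe call. This is only a minimal sketch, assuming Python with ctypes is available on the slave. The function names and the 1280x1024 default are illustrative and not part of startTalos.bat, and GetSystemMetrics only reports the primary display.

```python
import ctypes
import sys

SM_CXSCREEN = 0  # width of the primary display, in pixels
SM_CYSCREEN = 1  # height of the primary display, in pixels


def current_resolution():
    """Return (width, height) of the primary display, or None off Windows."""
    if sys.platform != "win32":
        return None
    user32 = ctypes.windll.user32
    return (user32.GetSystemMetrics(SM_CXSCREEN),
            user32.GetSystemMetrics(SM_CYSCREEN))


def meets_minimum(resolution, minimum=(1280, 1024)):
    """True if both dimensions are at least the requested minimum."""
    return resolution[0] >= minimum[0] and resolution[1] >= minimum[1]


if __name__ == "__main__":
    res = current_resolution()
    if res is None:
        print("not running on Windows; nothing to check")
    else:
        print("resolution:", res, "ok" if meets_minimum(res) else "TOO SMALL")
```

Running this from the same batch file, immediately after nircmd.exe, would show whether the mode switch actually took effect or whether only the VNC view is misleading.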

The information from the log also is not clear to me:
REFTEST INFO | drawWindow flags = DRAWWINDOW_DRAW_CARET | DRAWWINDOW_DRAW_VIEW; window size = 817,788; test browser size = 800,1000

Neither of the two values (817 and 788) is below 768, which suggests that the machine must have a higher screen resolution than 1024x768 (not sure if I am reading this correctly; perhaps the window is drawn beyond the borders of the screen).
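
For what it's worth, here is one possible way to read that REFTEST INFO line mechanically: pull out both sizes and compare them. This is a sketch of my interpretation only, not a description of how the reftest harness itself decides. The helper names are made up; the log line is the one quoted above.

```python
import re

# Matches the two size pairs in a "REFTEST INFO | drawWindow flags ..." line.
INFO_RE = re.compile(
    r"window size = (\d+),(\d+); test browser size = (\d+),(\d+)"
)


def parse_sizes(line):
    """Return ((win_w, win_h), (browser_w, browser_h)) or None if no match."""
    match = INFO_RE.search(line)
    if match is None:
        return None
    win_w, win_h, br_w, br_h = map(int, match.groups())
    return (win_w, win_h), (br_w, br_h)


def window_fits_browser(window, browser):
    """True if the window is at least as large as the test browser."""
    return window[0] >= browser[0] and window[1] >= browser[1]


line = ("REFTEST INFO | drawWindow flags = DRAWWINDOW_DRAW_CARET | "
        "DRAWWINDOW_DRAW_VIEW; window size = 817,788; "
        "test browser size = 800,1000")
window, browser = parse_sizes(line)
print(window, browser, window_fits_browser(window, browser))
# → (817, 788) (800, 1000) False
```

Under this reading the width is fine (817 >= 800) but the window is only 788 pixels tall while the test browser wants 1000, which would be consistent with the "at least 800x1000" requirement from the original report.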

I have checked 6 different logs and all are disabled so this is a verified regression.

Let's think of an action plan:
1) Can we set the screen resolution at the beginning of the test run? (Why is nircmd.exe not working?) Is there any other tool that could help us switch the screen resolution?
2) Releng to figure out why the resolution is not set properly.
3) Can you guys please add a line to take a snapshot? I believe we should be taking a snapshot at the beginning and at the end of a run for situations like this (I would like to have it across the board).
4) Once we go through the bump, could we add a test that would turn the job orange if USE_WIDGET is disabled? I am afraid this could happen again with slaves being imaged incorrectly and/or when we switch to newer hardware.

Makes sense? Am I missing anything?
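
To make item 4 a bit more concrete, a log check along these lines could be a starting point. It is a rough sketch only: the exact wording of the harness message is not quoted in this bug, so matching on "disabl" next to USE_WIDGET_LAYERS is an assumption, and the function names are hypothetical.

```python
MARKER = "USE_WIDGET_LAYERS"


def widget_layers_problems(log_lines):
    """Return log lines that mention the marker together with 'disabl'.

    ASSUMPTION: the harness reports the condition on a single line that
    contains both the marker and some form of the word 'disabled'.
    """
    return [line.rstrip("\n") for line in log_lines
            if MARKER in line and "disabl" in line.lower()]


def job_status(log_lines):
    """Map the scan result to the colour the job should get."""
    return "orange" if widget_layers_problems(log_lines) else "green"
```

It could be fed an open log file directly, e.g. `job_status(open("reftest.log"))`, where the file name is again just a placeholder. A check inside the harness itself would be more robust than scraping logs, but this would at least catch a badly imaged slave.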


[1] http://hg.mozilla.org/build/puppet-manifests/raw-file/default/modules/buildslave/files/startTalos-w7.bat
Do we have any proof that this ever worked? The only available resolutions are:
1280x720
1024x768
800x600

Could this have been messed up when the graphics drivers on Win7 were updated a few months ago?

WinXP has an extensive set of available screen resolutions (I just wanted to compare what the state was there for background knowledge).
Assignee: nobody → armenzg
It seems that bug 624044 regressed us. This means since January 2011.

I uninstalled the 260.99 driver from bug 624044 and I now have plenty of screen resolutions.

Running nircmd.exe setdisplay 1280 1024 32 changes the resolution immediately.

dxdiag says that I am now running 8.15.11.8684 (10/23/2009).

After installing 195.62 (which was the original driver before 260.99) I get several screen resolutions (expected).

After installing 260.99 [1] (which is what all win7 slaves have) we get 3 screen resolutions.

I installed 285.62 [2] which is the latest driver and I also get 3 screen resolutions.

I installed 257.21 [3] (2010-06-15) and it is the first driver that I installed higher than 195.62 and lower than 260.99 that has more than 3 screen resolutions.

Now, we need to figure out which version we want from this list: [4]
Name                             Version  Release Date
GeForce 285.79 Driver BETA       285.79   November 10, 2011
GeForce 285.62 Driver WHQL       285.62   October 24, 2011
GeForce 285.38 Driver BETA       285.38   September 26, 2011
GeForce 285.27 Driver BETA       285.27   September 13, 2011
GeForce 280.26 Driver WHQL       280.26   August 9, 2011
GeForce 280.19 Driver BETA       280.19   July 28, 2011
GeForce 275.50 Driver BETA       275.50   June 20, 2011
GeForce 275.33 Driver WHQL       275.33   June 1, 2011
GeForce Driver v275.27 BETA      275.27   May 17, 2011
GeForce/ION Driver v270.61 WHQL  270.61   April 18, 2011
GeForce/ION Driver v270.51 BETA  270.51   March 30, 2011
GeForce/ION Driver v267.24 BETA  267.24   March 1, 2011
GeForce/ION Driver v266.58 WHQL  266.58   January 18, 2011
GeForce/ION Release 265 BETA     266.35   January 4, 2011
GeForce/ION Release 260 WHQL     260.99   October 25, 2010
GeForce/ION Release 260 WHQL     260.89   October 18, 2010
GeForce/ION Release 256 WHQL     258.96   July 19, 2010
GeForce/ION Release 256 WHQL     258.96   July 19, 2010
GeForce/ION Release 256 BETA     258.69   June 29, 2010
GeForce/ION Release 256 WHQL     257.21   June 15, 2010

We can't use those that result in only 3 screen resolutions (as is the case with 260.99 and 285.62).
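
If we wanted to narrow down where the resolutions disappear, the versions still worth trying can be listed from the results so far. This is just a sketch: it assumes there is a single point in the release history where things break (which is not proven), and the helper names are made up. The version numbers and good/bad results are the ones from this comment.

```python
def version_key(version):
    """Turn '260.99' into (260, 99) so versions compare numerically."""
    major, minor = version.split(".")
    return (int(major), int(minor))


def untested_between(releases, good, bad):
    """Releases strictly newer than the newest good and older than the oldest bad."""
    newest_good = version_key(max(good, key=version_key))
    oldest_bad = version_key(min(bad, key=version_key))
    candidates = {v for v in releases
                  if newest_good < version_key(v) < oldest_bad}
    return sorted(candidates, key=version_key)


# Versions from the list above (duplicates are harmless).
RELEASES = [
    "285.79", "285.62", "285.38", "285.27", "280.26", "280.19", "275.50",
    "275.33", "275.27", "270.61", "270.51", "267.24", "266.58", "266.35",
    "260.99", "260.89", "258.96", "258.96", "258.69", "257.21",
]
GOOD = ["195.62", "257.21"]  # many screen resolutions offered
BAD = ["260.99", "285.62"]   # only 3 screen resolutions offered

print(untested_between(RELEASES, GOOD, BAD))
# → ['258.69', '258.96', '260.89']
```

So, under that assumption, only 258.69, 258.96 and 260.89 would need to be installed to find the first bad driver.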

NOTE: For deployment we have to make sure that we un-check the automatic nvidia update system.

[1] http://dev-stage01.build.mozilla.org/pub/mozilla.org/mozilla/libraries/win32/260.99_desktop_win7_winvista_32bit_english_whql.exe
[2] http://dev-stage01.build.mozilla.org/pub/mozilla.org/mozilla/libraries/win32/285.62-desktop-win7-winvista-32bit-english-whql.exe
[3] http://dev-stage01.build.mozilla.org/pub/mozilla.org/mozilla/libraries/win32/257.21_desktop_win7_winvista_32bit_english_whql.exe
[4] http://www.nvidia.com/Download/Find.aspx?lang=en-us
I think I was not explicit/clear on my previous comment.

ctalbert, bas: Is the version 257.21 good to have on the machines? or would you want me to be as close as possible to the latest driver?

I assume 257.21 (June 15, 2010) is close enough to 260.99 (October 25, 2010) that going back 4 months is OK.

(In reply to Armen Zambrano G. [:armenzg] - Release Engineer from comment #5)
> 
> I installed 257.21 [3] (2010-06-15) and it is the first driver that I
> installed higher than 195.62 and lower than 260.99 that has more than 3
> screen resolutions.
> 
> Now, we need to figure out which version we want from this list: [4]
On IRC, bjacob says that anything greater than or equal to 257.21 should work. If we later want to upgrade the version we can have a separate bug, but for now we should get out of the hole.

Sounds good?
If you feel it would be useful and have the time to narrow the regression range, laptopvideo2go.com keeps an extensive archive of all GeForce drivers (released by OEM or official).

I've had more luck reaching the download links through their message board than through the Driver section on the main site.
http://forums.laptopvideo2go.com/forum/163-25x-series-geforce-driver-releases/
http://forums.laptopvideo2go.com/forum/165-26x-series-geforce-driver-releases/

If this is a long standing regression in Nvidia's drivers, have they been contacted about the problem?
Depends on: 705854
It seems we have two perma-oranges with this version of the drivers.

I will try to narrow down which of the versions of comment 5 can clear this out.

bjacob, what command can I run to just try to test test_webgl_conformance_test_suite.html?

bas, what command can I run to just try to test gfx/test_acceleration.html?


* mochitests-1/5
46072 ERROR TEST-UNEXPECTED-FAIL | /tests/content/canvas/test/webgl/test_webgl_conformance_test_suite.html | Can't create a WebGL context
WebGL mochitest failed: Can't create a WebGL context

* mochitests-4/5
134 INFO TEST-START | /tests/gfx/test_acceleration.html
135 ERROR TEST-UNEXPECTED-FAIL | /tests/gfx/test_acceleration.html | Acceleration enabled on Windows XP or newer - didn't expect 0, but got it
136 ERROR TEST-UNEXPECTED-FAIL | /tests/gfx/test_acceleration.html | Direct2D enabled on Windows Vista or newer
137 ERROR TEST-UNEXPECTED-FAIL | /tests/gfx/test_acceleration.html | DirectWrite enabled on Windows Vista or newer
138 INFO TEST-END | /tests/gfx/test_acceleration.html | finished in 115ms
Status: NEW → ASSIGNED
Priority: -- → P2
Armen: given the issues in bug 704010, are you going to be looking at the drivers on XP too?
If we figure out a way to reproduce it I can tackle it as well.
(In reply to Armen Zambrano G. [:armenzg] - Release Engineer from comment #9)
> It seems we have two perma-oranges with this version of the drivers.
> 
> I will try to narrow down which of the versions of comment 5 can clear this
> out.
> 
> bjacob, what command can I run to just try to test
> test_webgl_conformance_test_suite.html?

TEST_PATH=content/canvas/test/webgl/test_webgl_conformance_test_suite.html make mochitest-plain

The error you're getting really means that WebGL isn't working at all. Is any hardware acceleration working on this machine? Try any 3d app, or just the Windows Aero Glass interface. If it doesn't work, try dxdiag or some other diagnostic tool.
I don't have a solution forward.

Updating to a newer graphic driver messes up the screen resolution because we're running Windows with BootCamp.
http://nvidia.custhelp.com/app/answers/detail/a_id/2432/kw/screen%20resolution/session/L3RpbWUvMTMyMzE5Mjc2My9zaWQvWndPTGlXS2s%3D
http://reviews.cnet.com/8301-13727_7-10330324-263.html

The article says to reinstall the drivers from the BootCamp DVD that came for your system.

I was hopeful that it was because I had used the "desktop" version of the drivers instead of the "notebook" version but it was not the case either.

Options:
1) follow up with IT and see if we can figure out how to fix with/through bootcamp
2) wait until we can switch over to post-rev3 HW in Q2/Q3 next year
3) uninstall the 260.99

FTR I installed/uninstalled different versions of the drivers dozens of times until I found an article talking about the BootCamp issues.
These were helpful:
https://discussions.apple.com/thread/2507994?start=0&tstart=0
http://support.apple.com/kb/HT4407

I believe what I have done is to completely remove the graphic drivers and even the device.
On reboot, I assume Windows finds the removed graphics card and installs whatever Windows Update offers (I think).

It updated to 195. which is higher than 186.84 

I am trying to run the Boot Camp Assistant from one of the rev3 machines.
I tried rebooting talos-r3-w7-053 into Mac OS X but either VNC's password is different OR the VNC service is not enabled.
dividehex is helping me in scl1 to make sure I can connect back to it.

I tried to do so from talos-r4-snow-002 but it said it could not download the Windows drivers at that moment.
I tried talos-r3-leopard-002 but 10.5 does not know what Windows 7 is all about :)

I will wait until I am in the office to try it. Perhaps I can try my MacBook Pro 10.6 laptop and make it work there.
dividehex enabled the sharing services but neither I nor dustin nor dividehex figured out the credentials. We'll wait until arr is back from holidays to figure this out.

The other way forward is to find a Mac machine and install Windows 7 with BootCamp. Once it is installed we can insert a Mac OS X install DVD and ask for the BootCamp drivers to be installed. I'm afraid this might not work, as the drivers have to be for a rev3 mini. It is unfortunate that I don't have any rev3 machine physically at the Toronto office.

We can't download the support drivers online as mentioned on this article:
https://discussions.apple.com/thread/2793568?start=0&tstart=0 (see Mar 24, 2011 8:57 PM (in response to Ramaset))

##################
From your Mac side - click open Applications, then Utilities, then open Boot Camp Assistant, click "Continue" (bottom right). Next select the option that says "I have the OSX installation disk that came with my Mac or I have… etc" and hit "Continue". Next click "Start the Windows installer" and follow the directions to install Windows. Once Windows has been installed, (perhaps you have already installed Windows, then skip all this) Windows will not be able to go online for updates or downloads as the Boot Camp Drivers for Windows need to be supplied (software that will activate Apple hardware). These drivers were supposed to be downloaded fresh but Apple has a slight problem there. What we are doing now to work around that problem is we are going to use older drivers that are present on the OSX DVD that came with your Mac. For Air owners, these (I believe) are on the OSX install stick (not a DVD). SO - WHEN you are in Windows for the first time, put your Snow Leopard OSX DVD into the slot, and the Auto-Run should play and offer to install the drivers software. Do what it says, and you're done. If it fails to AutoRun, then click open the DVD and look inside for the BootCamp part, and click icons until it runs - that's it. Apple Boot Camp Drivers has it's own Updater, so your machine will be obtaining fresher drivers pretty much as you need them (it will ask your permission).
I installed iTunes in order to get "Apple Software Update", because Boot Camp Software Update 3.3 for Windows [1] would not install.
The version installed was 3.0.3 so let's see where 3.1 takes us to.

Perhaps I can only go from 3.0 to 3.1 and then to 3.2 and then to 3.3.

Upgrading to 3.1 still keeps me at 195.62 and I can change the screen resolution.
I could upgrade to 3.2 but I am first trying to install the 285.62 version.

[1] http://support.apple.com/kb/DL1443
Depends on: 685879
Depends on: 708361
It seems that we might need to install dongles on all Windows 7 slaves to pick up a 1280x1024 screen resolution. The current screen resolution is 1024x768.

In bug 708361 we will install a couple of dongles and go from there.
(In reply to Armen Zambrano G. [:armenzg] - Release Engineer from comment #14)
> I believe what I have done is to completely remove the graphic drivers and
> even the device.
> On reboot, I assume Windows find the removed graphic card and install
> whatever Windows updates offer (I think).
> 
> It updated to 195. which is higher than 186.84 

I don't know if this is still relevant, but I wanted to note two things:

1) On Windows 7, you can disable automatic driver installation from Windows Update by going to Control Panel -> System and Security -> System -> Advanced system settings -> Hardware tab -> Device Installation Settings and selecting "No, let me choose what to do" and "Never install driver software from Windows Update."

2) When installing new Nvidia drivers, it is usually a good idea to run a tool like Driver Sweeper to remove all traces of the old driver before installing the new set. The procedure I have found works best for me is as follows:
a) Uninstall the Nvidia drivers from Programs and Features, but do not reboot.
b) Run Driver Sweeper, and reboot.
c) Wait for Windows to install the stock drivers (if any) and do not reboot.
d) Run Driver Sweeper, and reboot.
e) Install the new Nvidia driver set.
This procedure may be a bit overkill, and steps c) and d) certainly aren't necessary if Windows has no stock drivers to install. Still, it seems to work well.
Depends on: 710214
Depends on: 710233
I got some reftests failing. Can anyone work with me to fix them?

http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTest/1323935060.1323937392.30220.gz&fulltext=1
REFTEST TEST-UNEXPECTED-FAIL | file:///c:/talos-slave/test/build/reftest/tests/layout/reftests/bugs/635373-1.html | image comparison (==)
REFTEST TEST-UNEXPECTED-FAIL | file:///c:/talos-slave/test/build/reftest/tests/layout/reftests/bugs/635373-2.html | image comparison (==)
REFTEST TEST-UNEXPECTED-FAIL | file:///c:/talos-slave/test/build/reftest/tests/layout/reftests/bugs/635373-3.html | image comparison (==)
Priority: P2 → P3
Depends on: 712630
IT is getting us the dongles for January but we won't be able to do anything until the perma-oranges are fixed in bug 712630.
I need someone from graphics to jump in.
Priority: P3 → P4
Blocks: 685879
No longer depends on: 685879
Blocks: 711575
I just asked on the dependent bug for someone to work with us to get the perma-oranges fixed.
No longer depends on: 712630
gfx seems to have landed a fix.
I will be testing in the next day (as the release permits) that we're good to go.
Then I will coordinate with IT.
This might be done next week. To be discussed and finalized in today's relops meeting.
It will also need to be discussed with jhford who would be buildduty next week.
I can help from EDT time to gracefully shutdown the Windows test masters 45-60 mins ahead of the work.
dividehex says that it should not take longer than an hour to add all the dongles.
We should ask for a 2 hour window downtime and open earlier if we need to.
We will just have to re-trigger a lot of jobs as soon as the dongles are attached.

On the day prior to the downtime, I will trigger jobs in staging to verify that no new perma-oranges got introduced.

Makes sense?
Flags: needs-treeclosure?
Whiteboard: probably: downtime to be scheduled next week with other IT work at that colo
I have to verify it once more as one of the two slaves that have the dongle did not really have a higher screen resolution.

Sorry for false news :(
Whiteboard: probably: downtime to be scheduled next week with other IT work at that colo → waiting on dependent bug
Flags: needs-treeclosure?
Adjusting dependency.

In the next 1-2 weeks IT will deploy the dongles *but* this will be fixed once the gfx team lands a larger screen resolution to mozilla-central (bug 712630). They will be able to push to try until they can green everything out.
Depends on: 712630
No longer depends on: 710233, 705854, 708361, 710214
No longer blocks: 711575
We're now ready for the gfx team to take over the baton to fix the perma-oranges that get revealed when changing the screen resolution.

Try can be used to clear these out.

Joe is on vacation, so I won't assign this to him as we had discussed.
Assignee: armenzg → nobody
Component: Release Engineering → Graphics
Priority: P4 → --
Product: mozilla.org → Core
QA Contact: release → thebes
Hardware: x86_64 → x86
Whiteboard: waiting on dependent bug
Version: other → Trunk
Do you have a link to the log? I can mark the tests appropriately.
Hi Jeff,
I posted this on the dependent bug. This has all the info you need.
I hope it helps.

(In reply to Armen Zambrano G. [:armenzg] - Release Engineer from comment #119)
> I tested that we can change the screen resolution and mouse position by
> landing code changes:
> https://hg.mozilla.org/try/rev/84d83c001cad
> 
> Screen resolution (current): (1024, 768)
> Changing the screen resolution...
> Screen resolution (new): (1280, 1024)
> Mouse position (current): (640, 512)
> Mouse position (new): (1270, 10)
> program finished with exit code 0
> 
> This also causes oranges to show up:
> https://tbpl.mozilla.org/php/getParsedLog.php?id=13080585&tree=Try&full=1
> https://tbpl.mozilla.org/php/getParsedLog.php?id=13100834&tree=Try&full=1
> 
> I will push bug 702504 to the gfx team to finish up any remaining work.
> 
> Yay!
Status: ASSIGNED → NEW
Attached patch Mark the tests appropriately (obsolete)
This should take care of the failures.
Attachment #643891 - Flags: review?(bas.schouten)
Attachment #643891 - Flags: review?(bas.schouten) → review+
Did this ever land?
I don't think it landed, no, and it probably has bitrotted since then, but let's just drive this to completion, can we?
Assignee: nobody → jmuizelaar
Can we please make some progress on this? Or set expectations?

We're testing a configuration that does not test hardware acceleration for the platform where we have most users.

I don't want to complain but I was very pressured on the dependent bugs to give us the ability to recover from the previous condition but this is now stalled after all that work.
Attached patch Rebased patch (obsolete)
(In reply to Armen Zambrano G. [:armenzg] from comment #32)
> I don't want to complain but I was very pressured on the dependent bugs to
> give us the ability to recover from the previous condition but this is now
> stalled after all that work.

That's fair. Some of the pressure has reduced because I fixed bug 772726, so we actually are testing Azure D2D rendering in TBPL reftests now. But this bug is still very important for testing D3D10 compositing itself.
I've pushed the change to change the screen resolution to Try to show Milan what goes orange:
https://tbpl.mozilla.org/?tree=Try&rev=d4942b575e9f
Assignee: jmuizelaar → matt.woodrow
Blocks: 769975
Attached patch Updated patch
Attachment #643891 - Attachment is obsolete: true
Attachment #674301 - Attachment is obsolete: true
Attachment #723295 - Flags: review?(roc)
Comment on attachment 723296 [details] [diff] [review]
Armen's patch to change the screen resolution

Review of attachment 723296 [details] [diff] [review]:
-----------------------------------------------------------------

This is enough pixels. I would like to ensure that our reftest window has all of those pixels to itself, which I believe it will.
Attachment #723296 - Flags: review?(jmaher) → review+
Whiteboard: [leave open]
Now that we finally have this enabled on all platforms, we should make it a test failure if anything disables it.
Attachment #723849 - Flags: review?(roc)
Comment on attachment 723849 [details] [diff] [review]
Make USE_WIDGET_LAYERS disabled a test failure

Review of attachment 723849 [details] [diff] [review]:
-----------------------------------------------------------------

Awesome!!!
Attachment #723849 - Flags: review?(roc) → review+
https://hg.mozilla.org/mozilla-central/rev/5308a47dd766
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla22