Closed Bug 965088 Opened 11 years ago Closed 11 years ago

Update nvidia drivers on Linux test machines

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

x86
Linux
task
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: gw280, Unassigned)

Details

We are still running a completely ancient version of the NVIDIA drivers on our Linux testbots - namely, binary driver 190.42 from 2009 (!!!). The current stable release is 331.38. We should look into either:
a) Upgrading our Fedora installation to something that doesn't belong in a museum
b) Shoehorning a newer NVIDIA driver onto the ancient version of Fedora
Option b) is what was attempted (and subsequently abandoned) in bug 684165. Maybe we can re-investigate on a new loaner machine to see if the process is any less terrible. I can volunteer to try and make this happen, as I'm a total masochist and love dealing with broken Linux systems and trying to make them work.
Fedora is a shrinking pool of machines that are totally unmanaged. I suggest putting effort into migrating any eleven test jobs to work on our ubuntu hosts first
Migrating any *relevant
We only run the debug mochitest-browser-chrome and b2g reftests on Fedora machines, and we have people working on moving away from them this quarter. Thanks for caring and for the offer! We might need it in the future.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → WONTFIX
So this is going to be a problem -- we have work that needs to land this quarter that's being blocked on broken drivers/tests on those testing machines. What can we do to resolve that? Also, what's the configuration of the destination machines for the move? That is, are they going to run into the same old-drivers problem?
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Just to be clear, we *think* it's a driver issue, but I haven't confirmed it. Do we have a bug tracking progress of switching the b2g emulator test bots over to the Ubuntu hosts?
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #4)
> So this is going to be a problem -- we have work that needs to land this
> quarter that's being blocked on broken drivers/tests on those testing
> ...
And by "this quarter", Vlad means the upcoming week :)
Flags: needinfo?(gwright)
Interesting; I didn't get added to the CC list by default.
The bugs are bug 818968 and bug 850101.
The Fedora machines would need a manual deployment through cssh. This can be a black hole since updating the drivers can affect other tests. We can loan a machine to a developer to verify if it gets fixed with a drivers update (if we really really need to go that way).
The Ubuntu machines are m1.medium instances on AWS.
Which test suites are you guys looking into? B2G reftests?
(In reply to Milan Sreckovic [:milan] from comment #6)
> (In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #4)
> > So this is going to be a problem -- we have work that needs to land this
> > quarter that's being blocked on broken drivers/tests on those testing
> > ...
>
> And by "this quarter", Vlad means the upcoming week :)
Lovely!
If the workaround in comment 7 is not good and we can't wait for moving the jobs to AWS (the graphics team is on it), please follow the instructions on https://wiki.mozilla.org/ReleaseEngineering/How_To/Request_a_slave
It doesn't appear to fix my issue
Flags: needinfo?(gwright)
gw280: in bug 959808 you were loaned a Fedora machine; would you mind looking into the driver update so that we have more data to determine what to do here?
http://www.nvidia.com/download/driverResults.aspx/73099/en-us
http://www.nvidia.com/download/driverResults.aspx/73100/en-us
Nvidia just released the Linux 334.16 beta drivers with a lot of fixes and an important addition: "Added 64-bit EGL and OpenGL ES libraries to 64-bit driver packages." Can we please delay the driver update until Nvidia releases an R334.xx certified driver?
Thanks, NVD, for letting us know. However, I spoke with gw280 today and we won't proceed with this request, as we will be moving these tests to EC2 in the very short term.
Status: REOPENED → RESOLVED
Closed: 11 years ago
Resolution: --- → WONTFIX
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard