Closed Bug 968199 Opened 8 years ago Closed 8 years ago

AWS machines should run b2g emulator reftests with GALLIUM_DRIVER=softpipe

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

Platform: x86, macOS
Type: task
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jrmuizel, Assigned: armenzg)

Attachments

(2 files, 1 obsolete file)

This causes a lot more of them to pass.
Summary: AWS machines should run emulator reftests with GALLIUM_DRIVER=softpipe → AWS machines should run b2g emulator reftests with GALLIUM_DRIVER=softpipe
Blocks: 818968
We don't want to make this change to http://mxr.mozilla.org/mozilla-central/source/build/mobile/b2gautomation.py#83 because that will affect all machines running the reftests. We only want this on the AWS machines.
Let me see where we can make the change on mozharness.

We have this script and this config:
http://hg.mozilla.org/build/mozharness/file/default/scripts/b2g_emulator_unittest.py
http://hg.mozilla.org/build/mozharness/file/default/configs/b2g/emulator_automation_config.py

We have the reftest-options:
http://hg.mozilla.org/build/mozharness/file/default/configs/b2g/emulator_automation_config.py#l67

    67     "reftest_options": [
    68         "--adbpath=%(adbpath)s", "--b2gpath=%(b2gpath)s", "--emulator=%(emulator)s",
    69         "--emulator-res=800x1000", "--logcat-dir=%(logcat_dir)s"
    70         "--remote-webserver=%(remote_webserver)s", "--ignore-window-size",
    71         "--xre-path=%(xre_path)s", "--symbols-path=%(symbols_path)s", "--busybox=%(busybox)s",
    72         "--total-chunks=%(total_chunks)s", "--this-chunk=%(this_chunk)s",
    73         "%(test_manifest)s",
    74     ],
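
A side note on the list as quoted: there is no comma after the "--logcat-dir=%(logcat_dir)s" entry. If the quote is faithful to the file, Python joins that string with the next one into a single argument. The sketch below uses a shortened list and made-up values (the paths and address are assumptions, not taken from the real config) to show the effect of %-style expansion with and without the comma.

```python
# Hypothetical values; the real ones come from the mozharness config and job.
values = {
    "adbpath": "/path/to/adb",
    "logcat_dir": "/tmp/logcat",
    "remote_webserver": "10.0.2.2",
}

# Same shape as the quoted list: no comma after the --logcat-dir entry,
# so Python concatenates it with the string literal that follows.
options_as_quoted = [
    "--adbpath=%(adbpath)s",
    "--logcat-dir=%(logcat_dir)s"
    "--remote-webserver=%(remote_webserver)s",
]

# Same entries with the comma in place.
options_with_comma = [
    "--adbpath=%(adbpath)s",
    "--logcat-dir=%(logcat_dir)s",
    "--remote-webserver=%(remote_webserver)s",
]


def expand(options, values):
    """Apply %-style substitution to every option template."""
    return [option % values for option in options]


as_quoted = expand(options_as_quoted, values)
with_comma = expand(options_with_comma, values)
```

With these values, as_quoted has two entries (the second one carries both flags glued together) while with_comma has three.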

Could we make runreftestb2g.py accept parameters that adjust the environment variables?
Flags: needinfo?(ahalberstadt)
Component: Reftest → Platform Support
Product: Testing → Release Engineering
QA Contact: coop
<rail> btw, which driver is broken? maybe it's worth fixing it?
<jrmuizel> rail: llvmpipe
<rail> ok, I have a crazy idea :)
<jrmuizel> rail: but even if we fix it, we'll have to deploy a new driver to the whole ubuntu platform which will be an even more invasive change
<rail> we can set OPENGL_IS_BROKEN_HERE=1 on those machines, and use it in mozharness to set GALLIUM_DRIVER
<rail> so we don't even touch other pieces
<rail> if env.get("OPENGL_IS_BROKEN_HERE"): env["GALLIUM_DRIVER"] = "softpipe"
<armenzg> it works for me
<jrmuizel> rail: yeah, that seems reasonable
<rail> ship it! :)
<armenzg> the day that we fix the drivers situation we can unset that env
<rail> yeah
Assignee: nobody → armenzg
Flags: needinfo?(ahalberstadt)
env for AWS Ubuntu machines can be set here: http://hg.mozilla.org/build/puppet/file/5f54cb661801/modules/gui/templates/xvfb.conf.erb#l7

Please add bug number and some comments to make sure we don't delete it a year later. :)
I saw this fly by when syncing with puppet:
--- /etc/init/xvfb.conf	2014-02-05 13:19:48.279608001 -0800
+++ /tmp/puppet-file20140205-2419-670grd-0	2014-02-05 13:24:14.415607999 -0800
@@ -5,6 +5,7 @@
 description "start Xvfb server"
 
 export USER=cltbld
+export OPENGL_IS_BROKEN_HERE=1
 setuid cltbld

However, I can't seem to see the environment being set under the cltbld or root users.
I tried rebooting.
I tried pinning to my environment; I'm not sure if I did it properly.
I still have to test the machine through buildbot on staging.
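
One possible explanation, stated as an assumption rather than a diagnosis: a variable exported from an init job's configuration is normally inherited only by processes that the job starts, so it may not appear in an interactive shell for cltbld or root. On Linux, the environment a running process started with can be read from /proc/<pid>/environ. The helper names below are hypothetical, and the PID of the process to inspect has to be looked up separately.

```python
def parse_environ(raw: bytes) -> dict:
    """Parse the NUL-separated NAME=value records used by /proc/<pid>/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        if not entry:
            continue  # skip the empty record after the trailing NUL
        name, sep, value = entry.partition(b"=")
        if sep:
            env[name.decode(errors="replace")] = value.decode(errors="replace")
    return env


def process_env(pid) -> dict:
    """Return the environment a running process was started with (Linux only)."""
    with open(f"/proc/{pid}/environ", "rb") as handle:
        return parse_environ(handle.read())
```

Calling process_env with the PID of the test process and checking for "OPENGL_IS_BROKEN_HERE" in the result would show whether the variable reached it. Reading another user's process usually requires root.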
Attachment #8371857 - Flags: review?(rail)
Attachment #8371857 - Flags: review?(rail) → review+
tst-linux64-ec2-001?

[root@tst-linux64-ec2-001.test.releng.use1.mozilla.com ~]# cat /etc/X11/Xsession.d/98-broken-opengl
cat: /etc/X11/Xsession.d/98-broken-opengl: No such file or directory
I was being lazy.
It is tst-linux64-ec2-armenzg disguised as tst-linux64-ec2-001, so that I don't have to add it to slavealloc + my master's configs.
The buildbot.tac is static on disk.
Fwiw it would be relatively easy to add the env to the harness, but if you already have it solved in mozharness that wfm :). Having the fix be as close to the slave environment as possible makes sense too.
Comment on attachment 8371857 [details] [diff] [review]
add OPENGL_IS_BROKEN_HERE=1 to EC2 test machines

Live in puppet.
https://hg.mozilla.org/build/puppet/rev/aa0fde115097

The EC2 machines will be picking this change up over the day.
We still have to work on the mozharness patch to set the GALLIUM_DRIVER env variable.

https://tbpl.mozilla.org/?tree=Cedar&jobname=b2g_emulator.*test
Attachment #8371857 - Flags: checked-in+
Jeff, can you please review this log and see if it does what you want? I see the GALLIUM_DRIVER variable.
https://tbpl.mozilla.org/php/getParsedLog.php?id=34295871&tree=Ash&full=1
Flags: needinfo?(jmuizelaar)
Comment on attachment 8372431 [details] [diff] [review]
add GALLIUM_DRIVER=1 to b2g emulator jobs when running on EC2 test machines

I think
Attachment #8372431 - Flags: review?(aki) → review+
Comment on attachment 8372431 [details] [diff] [review]
add GALLIUM_DRIVER=1 to b2g emulator jobs when running on EC2 test machines

http://hg.mozilla.org/build/mozharness/rev/1ac44c9974e7
Attachment #8372431 - Flags: checked-in+
I spoke with Jeff at the office.
Status: NEW → RESOLVED
Closed: 8 years ago
Flags: needinfo?(jmuizelaar)
Resolution: --- → FIXED
I'm wondering if this is causing the ICS emulator crashtests on Cypress to hang 7200 seconds with no output:
https://tbpl.mozilla.org/?tree=Cypress
in production
Backing out. It is making b2g emulator crashtests time out.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attachment #8372431 - Flags: checked-in+ → checked-in-
I've requested that the jobs be re-triggered for this changeset:
https://tbpl.mozilla.org/?tree=Ash&jobname=b2g_emulator&rev=53bb6beeb1dc

They should be testing this:
http://hg.mozilla.org/users/asasaki_mozilla.com/ash-mozharness/rev/649b6f5c150e

+        if suite_name == 'reftest' and os.environ.get("OPENGL_IS_BROKEN_HERE"):
+            env["GALLIUM_DRIVER"] = "softpipe"

We should see results in less than 2 hours.
It is working now.
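
For reference, the two-line change above can be sketched as a standalone function. This is an illustration only, not the mozharness code: the function name and its parameters are assumptions made so the snippet runs on its own.

```python
import os


def build_test_env(suite_name, base_env=None, host_env=None):
    """Return the environment for a test run.

    GALLIUM_DRIVER is forced to softpipe only for the reftest suite, and
    only when the host advertises OPENGL_IS_BROKEN_HERE.
    """
    # host_env is a hypothetical parameter so the check can be exercised
    # without touching the real process environment.
    host_env = os.environ if host_env is None else host_env
    env = dict(base_env or {})
    if suite_name == "reftest" and host_env.get("OPENGL_IS_BROKEN_HERE"):
        env["GALLIUM_DRIVER"] = "softpipe"
    return env
```

Restricting the override to reftest leaves crashtest and the other suites on the default driver, which is the difference from the backed-out patch. Note that any non-empty value of OPENGL_IS_BROKEN_HERE, including "0", enables the override, so the variable has to be unset to turn it off.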
Attachment #8372431 - Attachment is obsolete: true
Attachment #8373491 - Flags: review?(aki)
Attachment #8373491 - Flags: review?(aki) → review+
Comment on attachment 8373491 [details] [diff] [review]
add GALLIUM_DRIVER=1 to b2g *reftests* emulator jobs when running on EC2 test machines

http://hg.mozilla.org/build/mozharness/rev/2ba270c1536c
Attachment #8373491 - Flags: checked-in+
Status: REOPENED → RESOLVED
Closed: 8 years ago → 8 years ago
Resolution: --- → FIXED
Pushed to the production branch:
https://hg.mozilla.org/build/mozharness/rev/7853577f5492
Now that we have deployed a patched mesa we can back this out to take advantage of it:
<jrmuizel> armenzg: we should be able to backout bug 968199 now
<jrmuizel> armenzg: that will be needed for us to take advantage of the new mesa
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment on attachment 8371857 [details] [diff] [review]
add OPENGL_IS_BROKEN_HERE=1 to EC2 test machines

Backed out as well since we don't need it:
https://hg.mozilla.org/build/puppet/rev/2122b8cb3302
Attachment #8371857 - Flags: checked-in+ → checked-in-
Attachment #8373491 - Flags: checked-in+ → checked-in-
Status: REOPENED → RESOLVED
Closed: 8 years ago → 8 years ago
Resolution: --- → FIXED
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard