Closed Bug 1549776 Opened 4 months ago Closed 4 months ago

Get wrench reftests running on Android in CI


(Core :: Graphics: WebRender, task, P2)




Tracking Status
firefox69 --- fixed


(Reporter: kats, Assigned: kats)


(Depends on 2 open bugs)



(7 files)

In order to get better testing coverage of WebRender on Android I'd like to run the wrench reftests in CI. This is separate from the gecko reftests (which is tracked by bug 1525314). Ideally we want to run both on an emulator and real devices, although the emulator might be easier to start with.

I was looking at the taskcluster transforms and mozharness scripts that are used to run other test suites on Android, and I have some idea of how things are structured, but it's not totally clear to me what the best approach is for adding wrench reftests.

What we need to do is basically take an APK (which is produced as an artifact from a different task), install it on the device/emulator, push the reftests (which are in-tree, and can easily be exported as an artifact as well) to the SD card, and then launch the APK. It will load and run the reftests from the SD card and emit results to logcat.
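The steps above can be sketched as a sequence of adb invocations. This is only an illustration, not the script that landed: the package name, activity name, and device directory are hypothetical placeholders, and the commands are built as lists so they can be inspected without a device attached.

```python
import subprocess


def adb_commands(apk_path, reftests_dir, package, activity,
                 device_dir="/sdcard/wrench"):
    """Build the adb command lines for one reftest run.

    `package`, `activity` and `device_dir` are placeholders for
    illustration; the real values depend on how the APK is built.
    """
    return [
        # Install (or reinstall) the APK on the device/emulator.
        ["adb", "install", "-r", apk_path],
        # Copy the in-tree reftests onto the SD card.
        ["adb", "push", reftests_dir, device_dir],
        # Clear old logcat contents so only this run's output remains.
        ["adb", "logcat", "-c"],
        # Launch the app, which runs the reftests and logs to logcat.
        ["adb", "shell", "am", "start", "-n", f"{package}/{activity}"],
    ]


def run_all(commands, runner=subprocess.check_call):
    """Run each command in order; `runner` can be swapped out for testing."""
    for cmd in commands:
        runner(cmd)
```

With a device connected, `run_all(adb_commands(...))` would execute the steps in order, and the results could then be read back with `adb logcat -d`.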

I'd like to use some of the mozharness machinery (e.g. the code to download and set up the AVD, and control the device via adb). However using the test transform doesn't seem as appropriate to me because it seems to make some assumptions about build platforms and test platforms and such that don't really apply in this case, because all the standalone webrender/wrench stuff happens separate from the normal gecko builds.

Really what I think I want is to just use mozharness but without mozharness-test, and then teach it about the wrench APK, which it would treat mostly the same as the geckoview test APK. But I'm open to suggestions on the best way to go about doing this. gbrown/ahal, thoughts?

Flags: needinfo?(gbrown)
Flags: needinfo?(ahal)

(In reply to Kartikaya Gupta from comment #0)

Really what I think I want is to just use mozharness but without mozharness-test

I should mention that I started doing this but ran into problems. Specifically: just using mozharness still checks out the gecko tree, which means that this line tries to use ./mach artifact toolchain instead of, and for some reason that fails with this error.

A few quick thoughts:

Flags: needinfo?(gbrown)

(In reply to Geoff Brown [:gbrown] from comment #2)

A few quick thoughts:

I could go with either option, as long as the rest of the steps work. Currently checking out gecko causes the tooltool problem, so maybe not checking it out would be better. But I think checkout: works for run-task type tasks, not mozharness tasks. Once I have using: mozharness the checkout happens unconditionally because it hits the codepath here. I could maybe modify that code to make it conditional, but I wanted to figure out the high-level approach first.

  • most of the mozharness support for android is in AndroidMixin now, which is used by,,, etc -- you're not necessarily locked in to using android_emulator_unittest, if that helps

Yeah, one thing I was considering was writing a new which would use this mixin and just do the things I need it to do. But a chunk of it would be copied from so that's why I wanted to try and use that existing mechanism first.

Yeah that seems like it might get messy. So that seems to be an argument in favour of diverging a little bit and making a which I can do more customized things with.

  • down the road, I think you are going to run into trouble with an architecture conflict: it looks like the android wrench build builds an arm apk, while the mozharness androidx86_7_0 runs an x86_64 avd

The APK actually contains both ARM and x86 libraries and I've verified locally that I can run it on an x86_64 AVD. So I'm hoping I have this problem solved already.
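One way to check which architectures an APK ships is to list the ABI directories under `lib/` in the archive, since an APK is a zip file with native libraries stored at `lib/<abi>/`. The sketch below assumes nothing about the wrench APK itself; the file names in the fake archive are made up for the example.

```python
import io
import zipfile


def apk_abis(apk):
    """Return the sorted ABI directory names found under lib/ in an APK.

    `apk` may be a path or a file-like object.
    """
    abis = set()
    with zipfile.ZipFile(apk) as zf:
        for name in zf.namelist():
            parts = name.split("/")
            # Native libraries live at lib/<abi>/<library>.so
            if len(parts) >= 3 and parts[0] == "lib":
                abis.add(parts[1])
    return sorted(abis)


def make_fake_apk():
    """Build an in-memory zip laid out like an APK (hypothetical contents)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("lib/armeabi-v7a/libexample.so", b"")
        zf.writestr("lib/x86/libexample.so", b"")
        zf.writestr("classes.dex", b"")
    buf.seek(0)
    return buf
```

Calling `apk_abis` on a real APK path would show whether an x86 variant is present alongside the ARM one.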

How do you get the tests on the worker? If you don't use a gecko checkout, you might still rely on the build task to package them for you.

From a higher level, maybe you could make a source-test task that runs the mozharness script in its command. That way you could avoid the mozharness-test kind.

Flags: needinfo?(ahal)

(In reply to Andrew Halberstadt [:ahal] from comment #4)

How do you get the tests on the worker? If you don't use a gecko checkout, you might still rely on the build task to package them for you.

I was initially planning to use the gecko checkout but since having the checkout causes the mozharness/tooltool problem I might go the other route and have the build task generate an artifact. But as long as I can get it to work I don't have any strong arguments in favour of one vs the other.

From a higher level, maybe you could make a source-test task that runs the mozharness script in its command. That way you could avoid the mozharness-test kind.

Hm, interesting. Will take a look at this. From reading the description of the source-test kind it doesn't seem super appropriate but maybe the implementation will be easier to work with for my purposes.

Note there are currently source-test tasks that have a build dependency (the python-mochitest/python-reftest ones). You can also use fetches to download an artifact if you don't want to use a dependency.

The easiest path is to just use mozharness-test and either try to hack around the build dependency or just run the builds even though you aren't using them.

So after many try pushes and much fiddling I finally have a working run on taskcluster:

The high-level structure I have is that I'm using run-task as my task configuration transform instead of mozharness or mozharness-test because the latter assume a bunch of gecko-specific test flags during the task configuration. In my run-task command I set a few env vars and then invoke directly, which does some needed setup and then invokes my mozharness script. The mozharness script is similar to but stripped down muchly so it just has the things I need. In particular it doesn't use the "actions" machinery but just invokes the desired steps (e.g. setup_avds, install_apk) manually.
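As a rough sketch of what invoking the steps manually looks like, compared with registering them in an actions list: the class below is a stand-in, not mozharness code. `setup_avds` and `install_apk` are the step names mentioned above; the other method names are hypothetical, and every step is a stub that only records that it ran.

```python
class ExplicitStepsRunner:
    """Stand-in for a script that calls its steps directly, in order."""

    def __init__(self):
        self.calls = []

    def setup_avds(self):
        self.calls.append("setup_avds")

    def start_emulator(self):  # hypothetical step name
        self.calls.append("start_emulator")

    def install_apk(self):
        self.calls.append("install_apk")

    def run_reftests(self):  # hypothetical; returns count of unexpected failures
        self.calls.append("run_reftests")
        return 0

    def stop_emulator(self):  # hypothetical step name
        self.calls.append("stop_emulator")

    def run(self):
        # The whole control flow is visible here, rather than being
        # driven by a framework that looks up steps by name.
        self.setup_avds()
        self.start_emulator()
        try:
            self.install_apk()
            failures = self.run_reftests()
        finally:
            # Always shut the emulator down, even if a step raised.
            self.stop_emulator()
        return 1 if failures else 0
```

The trade-off is that the ordering and cleanup are spelled out by hand, in exchange for a call sequence that can be read top to bottom.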

I still need to hook up the logging a bit better so that the TEST-* output in logcat gets mirrored to the main log. is the latest try push. It has all the stuff I want to land, there's just one problem: the jobs are showing as green when they shouldn't be, because there are reftest failures. If you open the reftest analyzer you can see the failures, it's just that whatever component is responsible for parsing job logs and determining pass/fail status and generating the error summary isn't recognizing the UNEXPECTED-FAIL lines in the log. TreeHerder itself seems to be doing it, because if you open the logviewer on the job it highlights the UNEXPECTED-FAIL lines properly.

I'm confused though as to why the output gets parsed fine with the desktop wrench jobs. Those aren't going through structured logging or the mozharness output parser either. They are just running a run-task job completely outside of mozharness and spewing to stdout, and TreeHerder correctly picks up the summary:

I wonder if the return code from the process triggers parsing. I can experiment with that tomorrow.

Yeah it turned out to be the return code. If I exit the process with a nonzero return code it parses as I would expect.
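A minimal sketch of that fix, assuming the script has the captured log lines in hand: scan them for the UNEXPECTED-FAIL marker and turn the result into the process exit code. The function name and the sample log lines are made up for illustration.

```python
FAILURE_MARKER = "UNEXPECTED-FAIL"


def exit_code_for(log_lines):
    """Return 1 if any log line reports an unexpected failure, else 0."""
    return 1 if any(FAILURE_MARKER in line for line in log_lines) else 0


# Hypothetical sample output; real lines come from logcat.
SAMPLE_LOG = [
    "REFTEST TEST-PASS | reftests/example/a.yaml",
    "REFTEST TEST-UNEXPECTED-FAIL | reftests/example/b.yaml",
]

# A script would finish with something like:
#     sys.exit(exit_code_for(captured_lines))
```

Exiting with the computed code is what makes the job show as failed instead of green.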

That being said, I guess it would be good to actually use structured logging for all the wrench jobs but at least on the desktop jobs it seems nontrivial to use mozharness to do so. So I'd rather just use unstructured to get the test up and running for now, and then look into making the logs structured without necessarily pulling in all of mozharness.

Typo when I first landed this, but nothing relied on it so it didn't matter.

This makes it so that when running reftests, wrench actually terminates
after a panic rather than just hanging. Termination is detectable and so
we can clean up properly instead of waiting until some other layer hits
a timeout.

Depends on D32009

These tests cause panics in debug mode because of the extra GL error
checking. Tests that are disabled are annotated with the failing
GL call.

Depends on D32011

This adds a script that uses mozharness to
control the Android emulator, and run the wrench reftests. It has an
associated config script which is similar to existing android
config scripts.

The android_emulator_wrench script is structured a little differently
from other android mozharness scripts, mostly for two reasons:

  1. I tried hard to make it locally runnable by developers, using
    ./mach python. This allows developers to more easily reproduce the
    setup that runs in automation, and does so without duplicating a lot
    of code.

  2. I also tried to make the script use fewer of what I consider to be
    "opaque" mozharness features, like the actions list which can run
    hard-to-find preflight and postflight functions. Instead of treating
    mozharness like a framework and filling in some functions for it to
    invoke as part of its grand plan, I treat it more like a library and
    specifically call the functions I want in the order that I want, which
    makes it easier for novice developers to debug problems.

As part of writing this script I extracted a few helper functions and made
some minor changes to existing android/adb mozharness machinery, but these
are all simple refactorings and should introduce no functional change.

Depends on D32013

Also docs for running the same thing locally.

Depends on D32014

 is the green try push. I also added a couple of regular android jobs to make sure I didn't break anything with the mozharness modifications.

The priority flag is not set for this bug.
:jbonisteel, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(jbonisteel)
Blocks: 1525314
Type: defect → task
Priority: -- → P2
See Also: 1525314
Flags: needinfo?(jbonisteel)
Pushed by
Fix path to artifact. r=gw
Ensure debug wrench aborts on panic. r=gw
Disable some reftests on Android. r=gw
Disable some reftests on debug Android. r=gw
Disable more reftests due to failures on Android. r=gw
Add a script to run wrench reftests on an Android emulator. r=gbrown
Add taskcluster jobs for running wrench on Android. r=jrmuizel