"Green up" test failures for Android 2.3 emulator

RESOLVED FIXED in mozilla33

Status

defect
RESOLVED FIXED
6 years ago
5 years ago

People

(Reporter: gbrown, Assigned: gbrown)

Tracking

unspecified
mozilla33
ARM
Android

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(13 attachments)

7.54 KB, patch (armenzg: review+)
4.15 KB, patch (armenzg: review+)
6.39 KB, patch (armenzg: review+)
925 bytes, patch (armenzg: review+)
9.96 KB, patch (jmaher: review+)
3.77 KB, patch (armenzg: review+)
764 bytes, patch (jmaher: review+)
1.20 KB, patch (kmoir: review+)
2.08 KB, patch (kmoir: review+)
6.86 KB, patch (kmoir: review+)
8.08 KB, patch (kmoir: review+)
3.10 KB, patch (kmoir: review+)
5.16 KB, patch (kmoir: review+)
The first set of tests ran on Ash today -- all are orange.
Depends on: 967913
Depends on: 875814
I tried increasing the number of mochitest chunks to 8 in androidarm.py and the first 2 chunks completed on Ash! I want to try running all the mochitests now, in 8 chunks.

Similarly, I think there is a chance of running the crashtests in 3 chunks.

Since I'm here, let's enable blobber too -- the complete logcats may be helpful.
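
For readers unfamiliar with chunking, here is a minimal sketch of how a test list might be divided into N roughly equal chunks. It is an illustration only: chunk_tests is a hypothetical helper, the test names are made up, and the real harness may split differently (for example, along directory boundaries).

```python
def chunk_tests(tests, this_chunk, total_chunks):
    """Return the slice of tests belonging to chunk this_chunk (1-based).

    Tests are split into contiguous, nearly equal groups; when the count
    does not divide evenly, the earliest chunks get one extra test.
    """
    if not 1 <= this_chunk <= total_chunks:
        raise ValueError("this_chunk must be between 1 and total_chunks")
    base, extra = divmod(len(tests), total_chunks)
    start = (this_chunk - 1) * base + min(this_chunk - 1, extra)
    size = base + (1 if this_chunk <= extra else 0)
    return tests[start:start + size]


# Hypothetical test names, for illustration only.
tests = ["test_%02d.html" % i for i in range(10)]
for n in range(1, 4):
    print(n, chunk_tests(tests, n, 3))
```

More chunks means fewer tests per job, so each job is more likely to finish before it times out.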
Attachment #8371050 - Flags: review?(armenzg)
Comment on attachment 8371050 [details] [diff] [review]
[checked-in] run 2.3 mochitest in 8 chunks, crashtest in 3 chunks; enable blobber

Review of attachment 8371050 [details] [diff] [review]:
-----------------------------------------------------------------

Excellent! Please land it as I will be in a meeting.
I hope we have a reconfig this afternoon.
Attachment #8371050 - Flags: review?(armenzg) → review+
Depends on: 969612
Depends on: 969624
Depends on: 970336
in production
Depends on: 970409
Blocks: 971176
gbrown, what is our current status? (no rush)
https://tbpl.mozilla.org/?tree=Ash&jobname=Android%202.3
All tests run slower on the 2.3 emulator than on devices (tegra or panda). The difference is most dramatic for reftests; :dminor is investigating reftest speed on 2.3.

For the remaining tests, chunking appears to be viable. 

Splitting mochitests into 8 almost worked, but not quite. M4/8 timed out here: https://tbpl.mozilla.org/php/getParsedLog.php?id=34428514&tree=Ash&full=1. Increasing to 10 chunks might work: M4/10 - https://tbpl.mozilla.org/php/getParsedLog.php?id=34456080&tree=Ash&full=1 and M5/10 - https://tbpl.mozilla.org/php/getParsedLog.php?id=34456783&tree=Ash. 

Splitting crashtests into 3 almost worked, but not quite. I think 4 chunks will be sufficient.

xpcshell tests do not time out, but run for about 180 minutes - https://tbpl.mozilla.org/php/getParsedLog.php?id=34456493&tree=Ash&full=1; I may split those just so that the job completes in a more typical/reasonable time.

I have not noticed any robocop timeouts with 3 chunks but each chunk runs for about an hour currently, so 4 or 5 chunks may be warranted.

There are a few dozen non-timeout test failures. I have fail-if/skip-if patches for some of those and am working on more (see dependent bugs), but am holding off on check-in as there is concern around use of android_version in manifests and mochitest manifests are changing. I hope to look at robocop failures this week.


At this time, most jobs are still orange, but there are some green, and several are "almost-green". My main concern is reftest slowness.
This is basically a follow-up to Comment 6, to split more tests into more chunks to eliminate a few more timeouts. I'm trying to get:
 - 10 mochitest chunks
 - 4 crashtest chunks
 - 3 xpcshell chunks
 - 4 robocop chunks
I am pretty confident this will eliminate job timeouts in these suites. (We'll deal with reftests and jsreftests another day.)
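
A rough way to sanity-check chunk counts like these is sketched below. It assumes run time divides evenly across chunks, which is an idealization (actual chunks vary in length). The function name, the 60-minute target, and the headroom parameter are illustrative assumptions; the 180-minute figure is the approximate xpcshell run time reported earlier in this bug.

```python
import math


def min_chunks(total_minutes, target_minutes, headroom=0.0):
    """Smallest chunk count that keeps each chunk under the target time.

    headroom is the fraction of the target held back for slow runs.
    Assumes total run time divides evenly across chunks.
    """
    usable = target_minutes * (1.0 - headroom)
    if usable <= 0:
        raise ValueError("headroom must leave some usable time")
    return math.ceil(total_minutes / usable)


# xpcshell: about 180 minutes as a single job.
print(min_chunks(180, 60))
# robocop: 3 chunks of about an hour each, so roughly 180 minutes in total;
# holding back 20% of the target for slow runs asks for one more chunk.
print(min_chunks(180, 60, headroom=0.2))
```

With no headroom the estimate for 180 minutes is 3 chunks; reserving 20% pushes it to 4.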
Attachment #8375797 - Flags: review?(armenzg)
Attachment #8375797 - Flags: review?(armenzg) → review+
Attachment #8375798 - Flags: review?(armenzg) → review+
Comment on attachment 8375797 [details] [diff] [review]
[checked-in] mobile_config.py changes for Android 2.3 chunking

https://hg.mozilla.org/build/buildbot-configs/rev/d2de53997e3c
Attachment #8375797 - Attachment description: mobile_config.py changes for Android 2.3 chunking → [checked-in] mobile_config.py changes for Android 2.3 chunking
Comment on attachment 8375798 [details] [diff] [review]
[checked-in] androidarm.py changes for more chunks

https://hg.mozilla.org/build/mozharness/rev/0a41263b6b5a
Attachment #8375798 - Attachment description: androidarm.py changes for more chunks → [checked-in] androidarm.py changes for more chunks
Merge to mozharness production.
Live in production.
Depends on: 973601
Depends on: 948591
Depends on: 975155
Depends on: 975187
Depends on: 975487
I found that blobber was not working for Android 2.3 jobs. I have verified on ash that adding these config entries enables blobber for Android 2.3.
Attachment #8381520 - Flags: review?(armenzg)
Attachment #8381520 - Flags: review?(armenzg) → review+
Comment on attachment 8381520 [details] [diff] [review]
[checked-in] androidarm.py changes to enable blobber for Android 2.3

https://hg.mozilla.org/build/mozharness/rev/d656793dd271
Attachment #8381520 - Attachment description: androidarm.py changes to enable blobber for Android 2.3 → [checked-in] androidarm.py changes to enable blobber for Android 2.3
Attachment #8371050 - Attachment description: run 2.3 mochitest in 8 chunks, crashtest in 3 chunks; enable blobber → [checked-in] run 2.3 mochitest in 8 chunks, crashtest in 3 chunks; enable blobber
In production.
Current status:
 - Plain reftests and js-reftests run too slow and fail with timeouts; bug 967913
 - mochitest-gl fails consistently; jgilbert has been need-info'd in bug 975487
 - The remaining test suites generally run green on ash. I'm watching for and cleaning up a few remaining infrequent intermittent failures before considering eligibility for trunk trees.
 - Robocop tests are passing consistently on ash, but only after disabling more than half of the tests. Investigation suggests that most of these failures are reproducible outside of the test environment: The browser crashes or hangs during routine operation in the 2.3 emulator. eg. bug 975155, bug 975187
Depends on: 891347
Depends on: 978254
Depends on: 978257
Depends on: 978265
Depends on: 978277
Depends on: 979548
Depends on: 979552
Depends on: 979597
Depends on: 979600
Depends on: 979603
Depends on: 979612
Depends on: 979615
Depends on: 979617
Depends on: 979620
Depends on: 979621
Blocks: 979921
No longer blocks: 979921
Disabled tests now tracked in bug 979921.
Depends on: 975631
Depends on: 980498
Depends on: 981173
Adds testing/mochitest/android23.json, a copy of android.json with additional test exclusions required to get green runs of mochitests on Android 2.3.
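
As an illustration of the "copy plus extra exclusions" idea, the sketch below derives one manifest from another. The "runtests"/"excludetests" layout is an assumption about the manifest format (the review comment in this bug mentions android.json excludetests), and the paths, reasons, and the derive_manifest helper are hypothetical. The actual patch adds a static file rather than generating one.

```python
import copy
import json


def derive_manifest(base, extra_excludes):
    """Return a copy of base with extra entries merged into 'excludetests'.

    The base manifest is left unmodified.
    """
    derived = copy.deepcopy(base)
    derived.setdefault("excludetests", {}).update(extra_excludes)
    return derived


# Hypothetical contents; the real android.json is much larger.
android = {
    "runtests": {"example_dir": ""},
    "excludetests": {"example_dir/test_a.html": "fails on all Android"},
}
android23 = derive_manifest(
    android,
    {"example_dir/test_b.html": "times out on the 2.3 emulator"},
)
print(json.dumps(android23, indent=2, sort_keys=True))
```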
Attachment #8388865 - Flags: review?(jmaher)
We need to switch over to an Android 2.3-specific manifest for 2.3 mochitests.
Attachment #8388874 - Flags: review?(armenzg)
Comment on attachment 8388865 [details] [diff] [review]
[checked-in] add android23.json

Review of attachment 8388865 [details] [diff] [review]:
-----------------------------------------------------------------

I think this is great.  I have a patch which gets rid of the android.json excludetests, we should coordinate tomorrow.
Attachment #8388865 - Flags: review?(jmaher) → review+
Attachment #8388865 - Attachment description: add android23.json → [checked-in] add android23.json
Attachment #8388874 - Flags: review?(armenzg) → review+
https://hg.mozilla.org/mozilla-central/rev/72a2f1fcea40
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: [leave open]
Attachment #8388874 - Attachment description: use android23.json for Android 2.3 mochitests → [checked in] use android23.json for Android 2.3 mochitests
No longer blocks: 910092
Oops - forgot something!
Attachment #8389545 - Flags: review?(jmaher)
Comment on attachment 8389545 [details] [diff] [review]
[checked-in] add android23.json to Makefile.in

Review of attachment 8389545 [details] [diff] [review]:
-----------------------------------------------------------------

this is an easy patch.  for reference, I just committed a change to mozilla-inbound which modified a lot of manifests for android to remove all but 1 entry in android.json and reduce duplication where possible in all 3.
Attachment #8389545 - Flags: review?(jmaher) → review+
Attachment #8389545 - Attachment description: add android23.json to Makefile.in → [checked-in] add android23.json to Makefile.in
buildbot reconfigured -> in production
Depends on: 983417
Depends on: 985155
What is our current status?
Are there any suites that we could start running on trunk branches?
(In reply to Armen Zambrano [:armenzg] (Release Engineering) (EDT/UTC-4) from comment #29)
> What is our current status?
> Are there any suites that we could start running on trunk branches?

We are not quite there yet -- but likely very close to turning on some suites.

Bug 980498 (no crashdumps) potentially affects all suites, so prevents us from running anything on trunk at the moment. I expect it will be fixed by tomorrow.

Bug 985155 (intermittent crashes) is a concern but is of sufficiently low frequency that we may be able to proceed without resolving it.

Bug 975631 (friendly xpcshell chunk names on tbpl) is not resolved, but improvements are good enough that we should be okay on trunk, IMO.

I need to review recent robocop failures and perhaps disable another couple of tests -- I should be able to do that today.

I hope that will put us in a good position to turn on xpcshell, crashtests, and robocop on trunk early next week.

Mochitests are not quite stable, but close. One issue is that a couple of chunks take nearly 1 hour to complete and occasionally time out. Moving to m3.large ec2 instances would resolve that, but perhaps that is a longer-term problem? Should I just increase the number of mochitest chunks for now?

Reftests are too slow and need more investigation -- we will not be running reftests or jsreftests on trunk for a few weeks yet.
Disable a few more robocop and mochitests:

https://hg.mozilla.org/integration/mozilla-inbound/rev/dbed599ec413

mochitests may remain unstable since webgl tests are running -- I need to check with :jmaher about those -- and because some chunks may exceed the 1 hour maximum run time.
Whiteboard: [leave open] → [leave open] status-in-comment-30
Android 2.3 crashtests run in 4 chunks currently, with each chunk taking about 50 minutes to complete. Occasionally, a chunk runs slower (I don't know why!) and fails to complete in the 60 minute timeout:

https://tbpl.mozilla.org/php/getParsedLog.php?id=36517906&tree=Ash&full=1#error0

Let's run crashtests in 5 chunks to avoid this unpleasantness.
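
To put a number on the extra margin from a fifth chunk, here is a small sketch. It assumes the total crashtest run time (about 4 x 50 minutes) spreads evenly across chunks and ignores per-job setup; slowdown_tolerance is a hypothetical helper.

```python
def slowdown_tolerance(total_minutes, chunks, timeout_minutes=60):
    """Fraction by which a chunk can run slower than expected
    before it hits the job timeout."""
    expected = total_minutes / chunks
    return timeout_minutes / expected - 1.0


total = 4 * 50  # four chunks of about 50 minutes each
for chunks in (4, 5):
    print(chunks, round(slowdown_tolerance(total, chunks), 2))
```

Under these assumptions, a chunk can run about 20% slow with 4 chunks before timing out, and about 50% slow with 5.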
Attachment #8395113 - Flags: review?(kmoir)
Depends on: 986738
Attachment #8395113 - Flags: review?(kmoir) → review+
Attachment #8395114 - Flags: review?(kmoir) → review+
Attachment #8395113 - Attachment description: mobile_config.py changes for crashtest-5 → [checked-in] mobile_config.py changes for crashtest-5
Attachment #8395114 - Attachment description: mozharness config changes for crashtest-5 → [checked-in] mozharness config changes for crashtest-5
Depends on: 987759
something here is in production
mochitest-4 and mochitest-5 sometimes exceed 60 minutes of run time and time out. I am increasing the mochitest chunks from 10 to 12, hoping that will give us enough time to avoid the intermittent failures.

We only run a small percentage of reftests and jsreftests currently. I want to increase the reftest chunks from 3 to 10 and the number of jsreftest chunks from 1 to 6. I do not think this will allow them to run to completion, but will help us get a better idea of how long they take to run.
Attachment #8397147 - Flags: review?(kmoir)
Attachment #8397147 - Flags: review?(kmoir) → review+
Attachment #8397149 - Flags: review?(kmoir) → review+
Attachment #8397147 - Attachment description: mobile_config.py changes for more M/R/J chunks → [checked-in] mobile_config.py changes for more M/R/J chunks
Attachment #8397149 - Attachment description: mozharness config changes for more M/R/J chunks → [checked-in] mozharness config changes for more M/R/J chunks
The checkins from c#39 are now live
Depends on: 988657
Status update:

crash reporting - Works fine now.
crashes - 2 intermittent crashes affect all tests - bug 985155 and bug 986738 - but occur with sufficiently low frequency that they should not prevent running on trunk trees.
xpcshell tests - Stable and ready for trunk. Very few tests skipped.
mochitests - A manifest update is landing today, to address the webgl issue - comment 31 - and reduce low-frequency intermittent failures in media tests. With this change, all mochitests - except mochitest-gl - are stable and ready for trunk. Many media tests are skipped.
robocop - A fix is landing today to address a new issue causing frequent failures. With this change, robocop tests are stable and ready for trunk. Many robocop tests are skipped.
crashtests - All tests pass reliably. Recently, many tests in C2 started taking longer to complete than they have in past weeks, causing C2 to often exceed the 60 minute job timeout. C2 can be run reliably by disabling 50 to 100 tests, or likely by adding more chunks, but neither solution appeals to me. Few crashtests are skipped. 
reftests - Remain slow. Running in 10 chunks, about 40% of the tests are run. Needs more investigation, or faster slaves. ix slaves may be the way forward. Most tests pass, but reftest manifests need some work once we get all the tests running.
jsreftests - Remain slow. Running in 6 chunks, about 40% of the tests are run. Basically the same as reftests.

I intend to file a request today to run xpcshell tests, mochitests, and robocop on trunk trees.
Whiteboard: [leave open] status-in-comment-30 → [leave open] status-in-comment-41
Depends on: 989462
A couple new failures cropped up -- skipped: https://hg.mozilla.org/integration/mozilla-inbound/rev/b4edb59d3db5
Unfortunately, bug 985155 is occurring fairly frequently and the stacks aren't correct, which means sheriffs have to star the failures manually. We're going to have to hide Android 2.3 tests on all trees for now (particularly since we need to free up time to run treeherder side by side).

Mochitests and robocop Android 2.3 tests hidden on mozilla-central, mozilla-inbound, b2g-inbound, fx-team & try.
In preparation for running reftests on ix, here's a reftest manifest update:

https://hg.mozilla.org/integration/mozilla-inbound/rev/34ee2b8302aa
It seems that xpcshell tests have been hidden as well.  Is this for the same reason as the other suites?
Flags: needinfo?(emorley)
(In reply to Mark Côté ( :mcote ) from comment #49)
> It seems that xpcshell have been hidden as well.  Is this for the same
> reason as the other suites?

Yeah for Bug 985155 as well sadly
Flags: needinfo?(emorley)
Unhidden.
Minor plain-reftest manifest adjustment: https://hg.mozilla.org/integration/mozilla-inbound/rev/ad32581cd49a
Depends on: 941788
Status update:

 - mochitest, robocop, xpcshell are running on trunk trees, on aws slaves
 - jsreftests and crashtests are running reliably on ash, on ix slaves
 - plain-reftests are running reliably on ash, on ix slaves, except for a frequent crash affecting a large group of tests in R5; I think this is the same as bug 941788
Whiteboard: [leave open] status-in-comment-41 → [leave open] status-in-comment-53
Depends on: 1000538
No longer depends on: 941788
Status update:

 - mochitest, robocop, xpcshell are running on trunk trees, on aws slaves
 - jsreftests and crashtests are running reliably on ash, on ix slaves
 - plain-reftests are running reliably on ash, on ix slaves, except that R5 takes just a little too long

To-do:
 - adjust chunking for R5 - bug 967913
 - schedule js/crash/plain-reftests on trunk, on ix slaves
 - fix mochitest-gl - bug 984523
 - investigate disabled tests, especially robocop - bug 979921

Bonus:
 - cpp unit tests? (never run on tegras)
 - armv6?
 - new avds - bug 989343
Whiteboard: [leave open] status-in-comment-53 → [leave open] status-in-comment-55
Depends on: 1004791
Status update:
 - all Android 2.2 functional tests (mochitests, m-gl, robocop, xpcshell, crashtests, js-reftests, plain reftests) are now running on Android 2.3 on trunk, on ix slaves

To-do/bonus:
 - investigate disabled tests, especially robocop - bug 979921
 - Android 2.3 armv6 - bug 1005956
 - cpp unit tests?
 - new avds - bug 989343
Keywords: leave-open
Whiteboard: [leave open] status-in-comment-55 → status-in-comment-56
Mochitests run faster on ix slaves, as compared to aws, so we can run them in fewer chunks now. All chunks currently run in less than 20 minutes. 

Running them in 8 chunks instead of 12 will be more efficient -- less job setup time overall -- and will also make Android 2.3 M chunking consistent with Android 4.0 (handy for try).

We can also reduce the crashtest chunks from 3 to 2.
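
The efficiency argument can be sketched as follows: every job pays a fixed setup cost, so fewer chunks means less total machine time, as long as each chunk still finishes comfortably. The 120-minute test time and 10-minute setup below are hypothetical numbers chosen for illustration, not measurements from this bug.

```python
def total_machine_minutes(test_minutes, chunks, setup_minutes):
    """Machine time summed over all jobs: tests plus one setup per chunk."""
    return test_minutes + chunks * setup_minutes


TEST_MINUTES = 120   # hypothetical total test time for the suite
SETUP_MINUTES = 10   # hypothetical per-job setup (emulator boot, install)

for chunks in (12, 8):
    print(chunks, total_machine_minutes(TEST_MINUTES, chunks, SETUP_MINUTES))
```

With these made-up figures, going from 12 chunks to 8 saves four setups, or 40 machine-minutes per push.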
Attachment #8428803 - Flags: review?(kmoir)
Attachment #8428804 - Flags: review?(kmoir) → review+
Comment on attachment 8428803 [details] [diff] [review]
[checked-in] mobile_config.py changes for fewer M/C chunks

The builder diff looks good on my test master
Attachment #8428803 - Flags: review?(kmoir) → review+
Attachment #8428803 - Attachment description: mobile_config.py changes for fewer M/C chunks → [checked-in] mobile_config.py changes for fewer M/C chunks
Attachment #8428804 - Attachment description: mozharness config changes for fewer M/C chunks → [checked-in] mozharness config changes for fewer M/C chunks
Merged to production and deployed.
Depends on: 1020970
No longer depends on: 975631
All done here: tests are running on trunk.

Some spin-off tasks will be pursued in other bugs:

 - follow-up on disabled tests: bug 979921
 - cpp unit tests: bug 1021986
 - updated avds: bug 989343
Status: REOPENED → RESOLVED
Closed: 5 years ago
Keywords: leave-open
Resolution: --- → FIXED
Whiteboard: status-in-comment-56
Target Milestone: mozilla30 → mozilla33
Depends on: 1032783
Depends on: 1033459