Closed Bug 891959 Opened 10 years ago Closed 9 years ago

[Tracking] Stand up Android x86 Automated Testing System using Emulators

Product/Component: Testing :: General
Type: defect
Hardware: x86
OS: Android
Priority: Not set
Severity: normal
Tracking: Not tracked
Status: RESOLVED FIXED
Reporter: cmtalbert
Assignee: kmoir
Whiteboard: [reit-x86] status-in-comment-19-20
Attachments: 1 file

We need to stand up a testing system for x86 emulators running on iX hardware in order to run correctness tests (mochitest, reftest, robocop tests) for our products in the x86 Android OS.

This is the tracking bug for that work.
Depends on: 892118
Depends on: 891200
Depends on: 892123
Depends on: 892688
Depends on: 894507
Depends on: 895186
Depends on: 899605
Depends on: 899614
No longer depends on: 899605
No longer depends on: 899614
Depends on: 902645
Depends on: 904672
Whiteboard: [reit-x86]
Depends on: 909385
Depends on: 889962
Status update:
- we're running 4 emulators on talos-linux64-ix-* machines
- the jobs are running on Cedar and Ash
- some of the test suites are running properly
- we're figuring out the last issues on Ash - bug 895186
- we should know how close we are by the end of this week
Depends on: 915870
Blocks: 917324
Depends on: 917361
No longer depends on: 915870
No longer blocks: 917324
Depends on: 917053
Depends on: 916657
We have a new set of builds running on Cedar:
https://tbpl.mozilla.org/?tree=Cedar&jobname=Android%204.2%20x86&rev=dbafecdf652e
A more detailed status about orange suites will be given later.

We're going to use this bug for the global status of the project, as we have completed the stage "run things on tbpl" and are now in the phase "green things out and meet sheriffs' expectations".

Our main focus is:
* to become visible on tbpl - bug 917361
** bug 915870 - armenzg - fix try server
** bug 916923 - ted     - fix crash dumps
** bug 917324 - unowned - make it easy to run for a developer
** bug 917558 - unowned - intermittent problem
* to fix remaining oranges
** bug 916657 - gbrown - reftests
** bug 917053 - gbrown - webgl
Assignee: nobody → armenzg
We will have a newer update early next week.
I don't expect any major changes this week.
If anyone wants this project completed please find an owner for bug 916923 because gbrown and I cannot help there. I thought ted was going to work on the bug but I had the wrong assumption.

On my side, I have a working patch for the try server (bug 915870). This is a relief because the try_parser.py code was very unfamiliar to me.

Next week I will be on buildduty, so I won't be working on this actively. You might see some activity from me, but I will mainly be supporting gbrown's progress.

The week after, I expect to be able to wrap up all releng bugs, but I won't be able to help with bug 916923, as I mentioned above.
rc1 and rc2 are passing now.

webgl is much improved but still seems to crash. Waiting on more tests, but may need to re-open bug 917053. 

Waiting on a review to disable a test and green up J5 -- bug 917508.

I have started collecting timing info for reftests -- https://tbpl.mozilla.org/?tree=Try&rev=f4083234d6e4&showall=1. Will analyze those results and continue investigating next week.
Depends on: 919812
Depends on: 920221
Improved crash dumps are landing -- bug 916923.

webgl is much improved but still seems to crash -- bug 919784. 
 
Waiting on a review to disable a test and green up J5 -- bug 917508. 

We were missing some command line arguments for reftests -- bug 920221. Those are corrected now and reftests run a little faster: not a significant change; they still run very slowly.
 
I collected high-level timing info for reftests --
https://tbpl.mozilla.org/?tree=Try&rev=8314d212b55d&showall=1. Not surprisingly, it shows that test time on all platforms is dominated by ctx.drawWindow(). Probably best for the gfx team to follow-up: bug 916657.
Depends on: 920627
No longer depends on: 920627
Even if I have to hide them, I think it is good to move green-ish suites to production so we can get a sense of the load and the ratio of intermittent issues.

I can take care of enabling these next week and hiding them across the board.
Attachment #810661 - Flags: review?(gbrown)
No longer depends on: 892688
Depends on: 917508
gbrown is gone until after the summit so no updates from him until after it.
I will be gone tomorrow for TRIBE and will be on buildduty on Monday, as I've swapped a day with bhearsum.

We have found intermittent issues with the actual emulator:
https://bugzilla.mozilla.org/show_bug.cgi?id=917562#c19

Quoting from philor:
> ... until yesterday it was just one crash among many; now that it's one crash among
> many that probably are or maybe aren't all from bogus emulator behavior,
> the thing that should be blocking [bug 917361] is either an unfiled bug to install
> a version of Qemu that doesn't yet exist, or an unfiled bug to
> build our own patched Qemu. And I didn't file either one since I'm the wrong one to be.
Comment on attachment 810661 [details] [diff] [review]
androidx86.configs.diff

I will have to hide the jobs regardless of their status.

This will help me get a better (not complete) understanding of intermittency of these greened up test jobs.
Attachment #810661 - Flags: review?(gbrown) → review?(aki)
Attachment #810661 - Flags: review?(aki) → review+
Assigning to gbrown since a lot of the dependencies are in his hands or people he has to work with.

We're mainly blocked on the requirements of bug 917361.
Assignee: armenzg → gbrown
Whiteboard: [reit-x86] → [reit-x86] status in comment 9
Depends on: 923881
Status update
=============
* Main tree of dependencies https://bugzilla.mozilla.org/showdependencytree.cgi?id=917361&hide_resolved=1
** The critical path issue is the QEMU bug (bug 917562 - comment 60 - froydnj)
* We are going to disable sets1 & sets2 in bug 923881 until we meet the tbpl visibility requirements
* gbrown will keep on making progress on Cedar
** Use this URL to see how close/far we are: https://tbpl.mozilla.org/?tree=Cedar&jobname=Android%204.2%20x86
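
The tbpl links in this bug all share one base URL with `tree` and `jobname` query parameters, plus `rev` or `showall` in some cases. A minimal sketch of building such a link with percent-encoding, so that job names containing spaces survive copy and paste; the helper name `tbpl_url` is hypothetical and not part of tbpl:

```python
from urllib.parse import quote, urlencode

TBPL_BASE = "https://tbpl.mozilla.org/"


def tbpl_url(tree, jobname, **extra):
    """Build a tbpl link; tbpl_url is an illustrative helper, not a tbpl API."""
    params = {"tree": tree, "jobname": jobname}
    params.update(extra)  # e.g. rev="dbafecdf652e" or showall=1
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return TBPL_BASE + "?" + urlencode(params, quote_via=quote)


url = tbpl_url("Cedar", "Android 4.2 x86")
print(url)
```

Passing `showall=1` as a keyword argument appends that parameter to the query string in the same way.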

No more status updates will be given until bug 917562 gets knocked off.
Whiteboard: [reit-x86] status in comment 9 → [reit-x86] status in comment 11
Depends on: 928463
Depends on: 929048
Depends on: 935214
Depends on: 936226
Depends on: 937299
Status update
=============
* Tree of dependencies https://bugzilla.mozilla.org/showdependencytree.cgi?id=891959&hide_resolved=1
* We have landed a fix for our most critical issue
** bug 917562 - comment 60 - froydnj
** We will know soon if it gets fixed
* deployed kvm last week to our hosts
* deployed a patched version of Android today
* gbrown has managed to bring us from 8 sets to 3 sets by using kvm
* Use this URL to see how close/far we are: https://tbpl.mozilla.org/?tree=Cedar&jobname=Android.*x86&showall=1
** S3 seems green
** S1 & S2 are still orange/red

I will be making progress on bug 919812 and bug 917324.
Next week I will be at a work week.
If anyone needs a status update with respect to greening the remaining test jobs, please poke gbrown directly.
Depends on: 933918
Summary: [Tracking] Stand up x86 Automated Testing System using Emulators → [Tracking] Stand up Android x86 Automated Testing System using Emulators
Depends on: 940399
Depends on: 940441
Depends on: 941788
I'm back to this.
Thursday to Wednesday I will be on duty.

Status update
=============
* Tree of dependencies https://bugzilla.mozilla.org/showdependencytree.cgi?id=891959&hide_resolved=1
* Jobs on Cedar: https://tbpl.mozilla.org/?tree=Cedar&jobname=Android.*x86&showall=1
* Our main goal is to green Cedar out
** I don't currently know if there are any more releng bugs blocking these jobs from becoming green (maybe the S1 jobs)
** I believe all remaining issues are on the testing side

= armenzg =
* bug 919812 - deployed a new avd a bit ago
* bug 917324 - this is to make it easier for developers to run Android x86 and meet tbpl's requirements
** aka documentation and maybe some small code adjustments
* bug 939823 - Too much output in Android x86 S1
** it went live a bit ago; however, the issue is not completely gone
** more chunking might be needed
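
"More chunking" here means splitting a suite into more, smaller jobs so that each job produces less output. A minimal sketch of the idea, assuming contiguous and near-equal chunks; the function name and the test names are hypothetical, and this is not necessarily how the harness itself divides tests:

```python
def chunk(tests, total_chunks, this_chunk):
    """Return the slice of `tests` for 1-based chunk `this_chunk`."""
    if not 1 <= this_chunk <= total_chunks:
        raise ValueError("this_chunk must be between 1 and total_chunks")
    base, extra = divmod(len(tests), total_chunks)
    # The first `extra` chunks get one additional test each.
    start = (this_chunk - 1) * base + min(this_chunk - 1, extra)
    size = base + (1 if this_chunk <= extra else 0)
    return tests[start:start + size]


tests = ["test_%02d.html" % i for i in range(10)]  # made-up test names
for n in range(1, 4):
    print(n, chunk(tests, 3, n))
```

Raising `total_chunks` shrinks every chunk, which is the lever available when a single job logs too much.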

= gbrown =
* See bug 936226 (gbrown) - "Green up" remaining test failures for Android x86 emulator for details
** S1 still has too much output
** S2 - mochitest-gl is failing with:
11:37:06     INFO -  DMError: Automation Error: Timeout in command isdir /mnt/sdcard/tests/logs
** S3 seems to be running green
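
Timeouts like the S2 failure above can be tallied across a job log to see which device command times out most often. A minimal sketch, assuming the log lines follow the format of the line quoted above; the regular expression and the function name are illustrative assumptions, not part of the test harness:

```python
import re
from collections import Counter

# Matches lines such as:
#   11:37:06     INFO -  DMError: Automation Error: Timeout in command isdir /mnt/sdcard/tests/logs
TIMEOUT_RE = re.compile(r"DMError: Automation Error: Timeout in command (\S+)")


def count_timeouts(lines):
    """Count device-manager timeouts per command name."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample = [
    "11:37:06     INFO -  DMError: Automation Error: Timeout in command isdir /mnt/sdcard/tests/logs",
    "11:37:07     INFO -  some unrelated line",  # made-up line for contrast
]
print(count_timeouts(sample))
```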
Whiteboard: [reit-x86] status in comment 11 → [reit-x86] status-in-comment-14
Blocks: 917508
No longer depends on: 917508
No longer depends on: 941788
No longer depends on: 936226
I've removed the last blockers for gbrown.
We're now watching to see things green up on Cedar
Please look at the tree of dependencies since we have done a bunch of clean up.

Status update
=============
* Tree of dependencies https://bugzilla.mozilla.org/showdependencytree.cgi?id=891959&hide_resolved=1
* Jobs on Cedar: https://tbpl.mozilla.org/?tree=Cedar&jobname=Android.*x86&showall=1
* Our main goal is to green Cedar out
** As of now, there are no releng bugs blocking greening Cedar up
** I believe all remaining issues are on the testing side

= armenzg =
* bug 919812 - deployed a new avd a bit ago
** I'm doing some puppet clean up work - not blocking anymore
* bug 917324 - this is to make it easier for developers to run Android x86 and meet tbpl's requirements
** aka documentation and maybe some small code adjustments
** not blocking gbrown

= gbrown =
* See bug 936226 (gbrown) - "Green up" remaining test failures for Android x86 emulator for details
Tree: https://bugzilla.mozilla.org/showdependencytree.cgi?id=936226&hide_resolved=1
Whiteboard: [reit-x86] status-in-comment-14 → [reit-x86] status-in-comment-15
Comment 15 still applies as the status update.
I won't be making any more status updates until some breaking news comes out of bug 936226.

Weekly status updates happen on Wednesdays:
https://wiki.mozilla.org/Mobile/Testing/12_04_13#x86_automation
Depends on: 949740
No longer depends on: 949740
Depends on: 944440
Depends on: 957185
Depends on: 960674
Latest news: we're looking into enabling S4 across the board in bug 960674.
No longer depends on: 957185
No longer depends on: 944440
Depends on: 961205
Depends on: 961207
I will grab this bug as an overall bug since gbrown is already tackling bug 917361.
This will make it clearer that it is also on releng's plate as a goal.

Status update
=============
We have S4 jobs running across the board:
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&jobname=android.*x86

* Tree of dependencies https://bugzilla.mozilla.org/showdependencytree.cgi?id=891959&hide_resolved=1
* Jobs on Cedar: https://tbpl.mozilla.org/?tree=Cedar&jobname=Android.*x86&showall=1
* Cedar seems green but we might have some intermittent failures
** More info under bug 936226 - "Green up" remaining test failures for Android x86 emulator
Tree: https://bugzilla.mozilla.org/showdependencytree.cgi?id=936226&hide_resolved=1
Assignee: gbrown → armenzg
Populating gbrown's status from bug 936226:

Status update:

S1 and S2 fail intermittently due to bug 927602. Improvements to logging and diagnostics have failed to point to a cause. More improvements coming in bug 960265 and bug 963838. More experiments under way on the loaner, on a low priority basis. Logcat analysis of S2 might be fruitful.

S3 is green except for intermittent NSS crashes on shutdown, bug 963317, which also affects other Android and Windows jobs. Bug 941788 (crashes in web-gl tests) remains a concern, but frequency is now very low.

S4 is running trouble-free on trunk.
Whiteboard: [reit-x86] status-in-comment-15 → [reit-x86] status-in-comment-19-20
Assignee: armenzg → kmoir
I'm going to close this. Bug 917361 (Meet tbpl's visibility requirements for Androidx86) is still open, but I think the substantive work is done.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED