Closed Bug 1123828 Opened 6 years ago Closed 6 years ago

Re-order long-running setup operations for Android emulator tests

Categories

(Testing :: General, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: gbrown, Assigned: gbrown)

References

Details

Attachments

(1 file)

I notice that android emulator test jobs often wait a couple of minutes for the sutAgent to start on the emulator. As soon as the sutAgent has started, the job continues with other setup operations, such as lengthy downloads.

I think we can start the emulator and proceed with downloads or other long-running setup operations, then circle back and check on the sutAgent -- hopefully it will have started concurrently, reducing overall setup time.
Depends on: 1164866
I'm sorry about the size of this patch...it got away from me! I ended up re-writing a lot of android_emulator_unittest.py; I like the result.

The main change here is that now the script launches the emulator in the start-emulator step, but makes no attempt to communicate with the emulator until the new verify-emulator step. In between start-emulator and verify-emulator, the download-and-extract and create-virtualenv steps run. The hope is that the emulator can startup during this time so that it is ready to respond when verify-emulator runs.

The verify-emulator step implements a lot of the same checks currently run, adds a few more, and retries whenever something fails. To get through verify-emulator, we check that:
 - "adb wait-for-device" completes
 - "adb devices" includes the emulator name
 - "adb getprop sys.boot_completed" returns 1
 - telnet to the emulator port succeeds
 - if using sutagent, telnet to the sutagent port succeeds

If something like telnet fails or times out, we generally retry that operation up to a certain max number of times, with pauses in between attempts. If retries are exhausted, we kill the emulator, restart the emulator, then try to run through all the verification steps again.

I noticed a lot of code duplication for running adb, telnet-ing, and retrying, so factored out as much of that as I could into utility functions and generally cleaned up the script.

Here's a try run:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=ee1d3a48e925

2.3 logs (with sut) show something like this:

7:08:30     INFO - #####
17:08:30     INFO - ##### Running verify-emulator step.
17:08:30     INFO - #####
17:08:30     INFO - Running pre-action listener: _resource_record_pre_action
17:08:30     INFO - Running main action method: verify_emulator
17:08:30     INFO - >> Check emulator: Attempt #1 of 3
17:08:30     INFO - Verifying adb connectivity
17:08:30     INFO - Running timeout 300 /builds/slave/test/build/android-sdk18/platform-tools/adb wait-for-device
17:08:36     INFO - * daemon not running. starting it now on port 5037 *
17:08:36     INFO - * daemon started successfully *
17:08:36     INFO - Running timeout 300 /builds/slave/test/build/android-sdk18/platform-tools/adb devices
17:08:36     INFO - List of devices attached 
17:08:36     INFO - emulator-5554	device
17:08:36     INFO - >> Verify Android boot completed: Attempt #1 of 30
17:08:36     INFO - Running timeout 300 /builds/slave/test/build/android-sdk18/platform-tools/adb -s emulator-5554 shell getprop sys.boot_completed
17:08:36     INFO - 1
17:08:36     INFO - >> Verify telnet to emulator: Attempt #1 of 4
17:08:36     INFO - Connected to port 5554
17:08:36     INFO - Android Console: type 'help' for a list of commands
17:08:36     INFO - OK
17:08:36     INFO - avd status: 
17:08:36     INFO - virtual device is running
17:08:36     INFO - OK
17:08:36     INFO - redir add tcp:20701:20701: 
17:08:36     INFO - OK
17:08:36     INFO - redir add tcp:20700:20700: 
17:08:36     INFO - OK
17:08:36     INFO - redir list: 
17:08:36     INFO - tcp:20701 => 20701
17:08:36     INFO - tcp:20700 => 20700
17:08:36     INFO - OK
17:08:36     INFO - network status: 
17:08:36     INFO - Current network status:
17:08:36     INFO -   download speed:          0 bits/s (0.0 KB/s)
17:08:36     INFO -   upload speed:            0 bits/s (0.0 KB/s)
17:08:36     INFO -   minimum latency:  0 ms
17:08:36     INFO -   maximum latency:  0 ms
17:08:36     INFO - OK
17:08:36     INFO - >> Verify telnet to SUT agent: Attempt #1 of 8
17:08:36     INFO - Connected to port 20701
17:08:36     INFO - Trying again after EOF
17:08:36     INFO - Sleeping 30 seconds
17:09:06     INFO - >> Verify telnet to SUT agent: Attempt #2 of 8
17:09:06     INFO - Connected to port 20701
17:09:06     INFO - SUT response: $>

That seems better than current logs, which often take 5 or 6 attempts to contact SUT.

4.3 logs tend to wait for boot_completed about the same as existing logs, but perhaps a little better on average:

17:20:15     INFO - #####
17:20:15     INFO - ##### Running verify-emulator step.
17:20:15     INFO - #####
17:20:15     INFO - Running pre-action listener: _resource_record_pre_action
17:20:15     INFO - Running main action method: verify_emulator
17:20:15     INFO - >> Check emulator: Attempt #1 of 3
17:20:15     INFO - Verifying adb connectivity
17:20:15     INFO - Running timeout 300 /builds/slave/test/build/android-sdk18/platform-tools/adb wait-for-device
17:20:21     INFO - * daemon not running. starting it now on port 5037 *
17:20:21     INFO - * daemon started successfully *
17:20:21     INFO - Running timeout 300 /builds/slave/test/build/android-sdk18/platform-tools/adb devices
17:20:21     INFO - List of devices attached 
17:20:21     INFO - emulator-5554	device
17:20:21     INFO - >> Verify Android boot completed: Attempt #1 of 30
17:20:21     INFO - Running timeout 300 /builds/slave/test/build/android-sdk18/platform-tools/adb -s emulator-5554 shell getprop sys.boot_completed
17:20:22     INFO - Sleeping 10 seconds
17:20:32     INFO - >> Verify Android boot completed: Attempt #2 of 30
17:20:32     INFO - Running timeout 300 /builds/slave/test/build/android-sdk18/platform-tools/adb -s emulator-5554 shell getprop sys.boot_completed
17:20:32     INFO - Sleeping 10 seconds
17:20:42     INFO - >> Verify Android boot completed: Attempt #3 of 30
17:20:42     INFO - Running timeout 300 /builds/slave/test/build/android-sdk18/platform-tools/adb -s emulator-5554 shell getprop sys.boot_completed
17:20:42     INFO - Sleeping 10 seconds
17:20:52     INFO - >> Verify Android boot completed: Attempt #4 of 30
17:20:52     INFO - Running timeout 300 /builds/slave/test/build/android-sdk18/platform-tools/adb -s emulator-5554 shell getprop sys.boot_completed
17:20:53     INFO - Sleeping 10 seconds
17:21:03     INFO - >> Verify Android boot completed: Attempt #5 of 30
17:21:03     INFO - Running timeout 300 /builds/slave/test/build/android-sdk18/platform-tools/adb -s emulator-5554 shell getprop sys.boot_completed
17:21:03     INFO - Sleeping 10 seconds
17:21:13     INFO - >> Verify Android boot completed: Attempt #6 of 30
17:21:13     INFO - Running timeout 300 /builds/slave/test/build/android-sdk18/platform-tools/adb -s emulator-5554 shell getprop sys.boot_completed
17:21:13     INFO - Sleeping 10 seconds
17:21:23     INFO - >> Verify Android boot completed: Attempt #7 of 30
17:21:23     INFO - Running timeout 300 /builds/slave/test/build/android-sdk18/platform-tools/adb -s emulator-5554 shell getprop sys.boot_completed
17:21:23     INFO - 1
17:21:23     INFO - >> Verify telnet to emulator: Attempt #1 of 4
17:21:23     INFO - Connected to port 5554
17:21:23     INFO - Android Console: type 'help' for a list of commands
17:21:23     INFO - OK
17:21:23     INFO - avd status: 
17:21:23     INFO - virtual device is running
17:21:23     INFO - OK
17:21:23     INFO - redir list: 
17:21:23     INFO - no active redirections
17:21:23     INFO - OK
17:21:23     INFO - network status: 
17:21:23     INFO - Current network status:
17:21:23     INFO -   download speed:          0 bits/s (0.0 KB/s)
17:21:23     INFO -   upload speed:            0 bits/s (0.0 KB/s)
17:21:23     INFO -   minimum latency:  0 ms
17:21:23     INFO -   maximum latency:  0 ms
17:21:23     INFO - OK
Attachment #8609116 - Flags: review?(kmoir)
Comment on attachment 8609116 [details] [diff] [review]
re-order startup operations, cleanup, etc

nice job, much cleaner code
Attachment #8609116 - Flags: review?(kmoir) → review+
These changes started running on m-c 2015-05-22.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Blocks: 1051908
Duplicate of this bug: 993146
You need to log in before you can comment on or make changes to this bug.