Closed
Bug 949740
Opened 11 years ago
Closed 11 years ago
Intermittent Android x86 We have not been able to establish a telnet connection with the emulator
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task)
Tracking
(firefox27 unaffected, firefox28 unaffected, firefox29 fixed, firefox-esr24 unaffected)
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
firefox27 | --- | unaffected |
firefox28 | --- | unaffected |
firefox29 | --- | fixed |
firefox-esr24 | --- | unaffected |
People
(Reporter: gbrown, Assigned: gbrown)
References
Details
(Keywords: intermittent-failure)
Attachments
(3 files, 1 obsolete file)
7.08 KB,
patch
|
armenzg
:
review+
|
Details | Diff | Splinter Review |
2.34 KB,
patch
|
armenzg
:
review+
|
Details | Diff | Splinter Review |
5.29 KB,
patch
|
armenzg
:
review+
|
Details | Diff | Splinter Review |
Consider:
https://tbpl.mozilla.org/?tree=Cedar&showall=1&rev=5dc615915eb6
There are 3 instances of this error in 22 runs of S4.
We can probably live with that, but I wonder if we could do better.
Assignee | ||
Comment 1•11 years ago
|
||
15:15:30 INFO - #####
15:15:30 INFO - ##### Running start-emulators step.
15:15:30 INFO - #####
15:15:30 INFO - Running pre-action listener: _resource_record_pre_action
15:15:30 INFO - Running main action method: start_emulators
15:15:30 INFO - Let's kill every process called compiz
15:15:30 INFO - Killing pid 2393.
15:15:30 INFO - Created temp file /tmp/tmpzo7j_0.
15:15:30 INFO - Trying to start the emulator with this command: emulator -avd test-1 -debug init,console,gles,memcheck,adbserver,adbclient,adb,avd_config,socket -port 5554 -qemu -m 1024 -enable-kvm
15:15:30 INFO - Sleeping 10 seconds
15:15:40 INFO - Attempt #1 to redirect ports: (5554, 20701, 20700)
15:15:40 INFO - Trying again after exception: [Errno 111] Connection refused
15:15:40 INFO - Sleeping 30 seconds
15:16:10 INFO - Attempt #2 to redirect ports: (5554, 20701, 20700)
15:16:10 INFO - Trying again after exception: [Errno 111] Connection refused
15:16:10 INFO - Sleeping 30 seconds
15:16:40 INFO - Attempt #3 to redirect ports: (5554, 20701, 20700)
15:16:40 INFO - Trying again after exception: [Errno 111] Connection refused
15:16:40 INFO - Sleeping 30 seconds
15:17:10 INFO - Attempt #4 to redirect ports: (5554, 20701, 20700)
15:17:10 INFO - Trying again after exception: [Errno 111] Connection refused
15:17:10 INFO - Sleeping 30 seconds
15:17:40 INFO - Attempt #5 to redirect ports: (5554, 20701, 20700)
15:17:40 INFO - Trying again after exception: [Errno 111] Connection refused
15:17:40 FATAL - We have not been able to establish a telnet connection with the emulator
15:17:40 FATAL - Running post_fatal callback...
15:17:40 INFO - Let's kill every process called emulator64-x86
15:17:40 INFO - Killing pid 2718.
15:17:40 FATAL - Exiting -1
Assignee | ||
Comment 2•11 years ago
|
||
In comparison, this seems typical on success:
09:23:19 INFO - #####
09:23:19 INFO - ##### Running start-emulators step.
09:23:19 INFO - #####
09:23:19 INFO - Running pre-action listener: _resource_record_pre_action
09:23:19 INFO - Running main action method: start_emulators
09:23:19 INFO - Let's kill every process called compiz
09:23:19 INFO - Killing pid 2415.
09:23:19 INFO - Created temp file /tmp/tmpgzGPNu.
09:23:19 INFO - Trying to start the emulator with this command: emulator -avd test-1 -debug init,console,gles,memcheck,adbserver,adbclient,adb,avd_config,socket -port 5554 -qemu -m 1024 -enable-kvm
09:23:19 INFO - Sleeping 10 seconds
09:23:29 INFO - Attempt #1 to redirect ports: (5554, 20701, 20700)
09:23:29 INFO - test-1: 5554; sut port: 20701/20700
09:23:29 INFO - Created temp file /tmp/tmpXjYJmL.
09:23:29 INFO - Trying to start the emulator with this command: emulator -avd test-2 -debug init,console,gles,memcheck,adbserver,adbclient,adb,avd_config,socket -port 5556 -qemu -m 1024 -enable-kvm
09:23:29 INFO - Sleeping 10 seconds
09:23:39 INFO - Attempt #1 to redirect ports: (5556, 20703, 20702)
09:23:39 INFO - test-2: 5556; sut port: 20703/20702
09:23:39 INFO - Created temp file /tmp/tmpqkRu5X.
09:23:39 INFO - Trying to start the emulator with this command: emulator -avd test-3 -debug init,console,gles,memcheck,adbserver,adbclient,adb,avd_config,socket -port 5558 -qemu -m 1024 -enable-kvm
09:23:39 INFO - Sleeping 10 seconds
09:23:49 INFO - Attempt #1 to redirect ports: (5558, 20705, 20704)
09:23:49 INFO - test-3: 5558; sut port: 20705/20704
09:23:49 INFO - Created temp file /tmp/tmp49YwFe.
09:23:49 INFO - Trying to start the emulator with this command: emulator -avd test-4 -debug init,console,gles,memcheck,adbserver,adbclient,adb,avd_config,socket -port 5560 -qemu -m 1024 -enable-kvm
09:23:49 INFO - Sleeping 10 seconds
09:23:59 INFO - Attempt #1 to redirect ports: (5560, 20707, 20706)
09:23:59 INFO - test-4: 5560; sut port: 20707/20706
Assignee | ||
Comment 3•11 years ago
|
||
I wonder if the emulator log would show anything interesting in this case: let's dump it.
Attachment #8347499 -
Flags: review?(armenzg)
Updated•11 years ago
|
Attachment #8347499 -
Flags: review?(armenzg) → review+
Assignee | ||
Comment 4•11 years ago
|
||
Assignee | ||
Comment 5•11 years ago
|
||
That first patch didn't quite work: https://tbpl.mozilla.org/php/getParsedLog.php?id=32032518&tree=Cedar&full=1#error1
That's because emulator_procs has not been populated yet. A little re-arranging of the logic should get this working.
Attachment #8348234 -
Flags: review?(armenzg)
Assignee | ||
Comment 6•11 years ago
|
||
oops - attached the wrong patch
Attachment #8348234 -
Attachment is obsolete: true
Attachment #8348234 -
Flags: review?(armenzg)
Attachment #8348239 -
Flags: review?(armenzg)
Assignee | ||
Updated•11 years ago
|
Attachment #8347499 -
Attachment description: dump emulator log if _redirectSUT fails → (1) dump emulator log if _redirectSUT fails
Updated•11 years ago
|
Attachment #8348239 -
Flags: review?(armenzg) → review+
Assignee | ||
Comment 7•11 years ago
|
||
Assignee | ||
Comment 8•11 years ago
|
||
Assignee | ||
Comment 9•11 years ago
|
||
The failed emulator log (comment 8) shows:
... ioctl(KVM_CREATE_VM) failed: Interrupted system call
12:32:10 INFO - ko:failed to initialize KVM
Assignee | ||
Comment 10•11 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=32051383&tree=Cedar
(looks the same as Comment 8).
Comment 11•11 years ago
|
||
in production
Assignee | ||
Updated•11 years ago
|
Keywords: intermittent-failure
Assignee | ||
Updated•11 years ago
|
Assignee | ||
Comment 12•11 years ago
|
||
I checked the permissions on /dev/kvm; for both the passing and failing cases, permissions are the same:
crw-rw----+ 1 root kvm 10, 232 Dec 18 14:05 /dev/kvm
Assignee | ||
Comment 13•11 years ago
|
||
Delaying the launch of the emulators can virtually eliminate this error. Normally I am seeing the emulator launch fail about 1 time in 10. If we sleep for 3 minutes before trying to launch the emulators, this still happens, but only ~1 time in 100. If the sleep is reduced to 1 minute, I see about 1 failure in 50. Below 1 minute, failures increase significantly.
Assignee | ||
Comment 14•11 years ago
|
||
In light of comment 13, I thought I would try retrying: If an emulator is launched but cannot be contacted for port redirection, kill any running emulators, wait for a while, and then try launching the emulators again. A pause of 30 seconds between retries and up to 3 retries seems to effectively resolve this bug.
Here is an example where we fail (with the ioctl(KVM_CREATE_VM) message), retry, and succeed:
https://tbpl.mozilla.org/php/getParsedLog.php?id=32254143&tree=Ash&full=1
Assignee | ||
Comment 15•11 years ago
|
||
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Updated•11 years ago
|
Attachment #8350759 -
Flags: review?(armenzg)
Comment 17•11 years ago
|
||
Comment on attachment 8350759 [details] [diff] [review]
(3) retry as described in Comment 14
Review of attachment 8350759 [details] [diff] [review]:
-----------------------------------------------------------------
::: scripts/android_emulator_unittest.py
@@ +471,5 @@
> self._kill_processes("compiz")
> +
> + attempts = 0
> + redirect_failed = True
> + while attempts < 3 and redirect_failed:
Mind adding a comment on this while block to help future readers know what we're doing.
Attachment #8350759 -
Flags: review?(armenzg) → review+
Assignee | ||
Comment 18•11 years ago
|
||
With block comment added:
https://hg.mozilla.org/build/mozharness/rev/7de07ccc4e04
Assignee | ||
Comment 19•11 years ago
|
||
Patch (3) is an effective solution: https://tbpl.mozilla.org/?tree=Cedar&showall=1&rev=36c8eeaf1589
Comment 20•11 years ago
|
||
Merged mozharness.
Assignee | ||
Updated•11 years ago
|
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Updated•11 years ago
|
status-firefox27:
--- → unaffected
status-firefox28:
--- → unaffected
status-firefox29:
--- → fixed
status-firefox-esr24:
--- → unaffected
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Updated•7 years ago
|
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations
Updated•5 years ago
|
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•