Closed Bug 934890 Opened 11 years ago Closed 9 years ago

Intermittent "command timed out: 14400 seconds elapsed, attempting to kill" during B2G emulator or device image builds

Categories

(Release Engineering :: General, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: nigelb, Unassigned)

Details

(Keywords: intermittent-failure, Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/2311] )

Attachments

(1 file)

https://tbpl.mozilla.org/php/getParsedLog.php?id=30124227&tree=Mozilla-Inbound

b2g_mozilla-inbound_emulator_dep on 2013-11-04 19:48:12 PST for push 040e85d18eab

slave: bld-linux64-ec2-321

23:48:35  WARNING -  ../../../../gecko/netwerk/protocol/rtsp/rtsp/APacketSource.cpp:277: warning: unused variable 'data'
23:48:35     INFO -  ARTPConnection.o
23:48:36     INFO -  sctp_hashdriver.o
23:48:36     INFO -  nsMemoryCacheDevice.o
23:48:36     INFO -  CacheStorageService.o
23:48:36     INFO -  ARTPSource.o
23:48:37     INFO -  sctp_indata.o
23:48:37     INFO -  sctp_input.o
23:48:37     INFO -  OldWrappers.o
23:48:37     INFO -  sctp_output.o
23:48:38     INFO -  nsWifiAccessPoint.o
23:48:38     INFO -  nsWifiMonitorGonk.o
23:48:39     INFO -  ARTPWriter.o
23:48:39     INFO -  nsNetModule.o
23:48:40     INFO -  PropertiesTest.o
23:48:40     INFO -  ReadNTLM.o
23:48:40     INFO -  ARTSPConnection.o
23:48:41     INFO -  nsInputStreamPump.o
23:48:41     INFO -  ASessionDescription.o
23:48:42     INFO -  HttpInfo.o
23:48:42     INFO -  TestBlockingSocket.o
23:48:42     INFO -  nsHttp.o

command timed out: 14400 seconds elapsed, attempting to kill
process killed by signal 9
program finished with exit code -1
elapsedTime=14400.026401
========= Finished 'scripts/scripts/b2g_build.py --target ...' failed (results: 2, elapsed: 4 hrs, 0 secs) (at 2013-11-04 23:48:43.536247) =========
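
The log above shows the harness killing the build once a fixed wall-clock budget of 14400 seconds (4 hours) runs out. A minimal Python sketch of that kind of watchdog follows. It assumes a plain child process and is not buildbot's actual implementation; `run_with_max_time` and its return shape are hypothetical names chosen for illustration.

```python
import subprocess
import sys


def run_with_max_time(cmd, max_time):
    """Run cmd, killing it if it exceeds max_time seconds of wall-clock time.

    Hypothetical helper for illustration; not buildbot's actual code.
    """
    proc = subprocess.Popen(cmd)
    try:
        returncode = proc.wait(timeout=max_time)
        return {"timed_out": False, "returncode": returncode}
    except subprocess.TimeoutExpired:
        proc.kill()   # SIGKILL on POSIX, as in "process killed by signal 9"
        proc.wait()   # reap the child so it does not linger as a zombie
        return {"timed_out": True, "returncode": proc.returncode}


if __name__ == "__main__":
    # A child that sleeps for 5 seconds, given only a 0.5 second budget.
    slow = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_max_time(slow, max_time=0.5))
```

A budget like this only bounds total elapsed time. It cannot tell a hung process from one that is merely slow, which is why the build here was killed in the middle of compiling.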
Summary: Intermittent "command timed out: 14400 seconds elapsed, attempting to kill" during B2G ICS Emulator Opt Build → Intermittent "command timed out: 14400 seconds elapsed, attempting to kill" during B2G Emulator Build
Any ideas on this one, catlee?
Summary: Intermittent "command timed out: 14400 seconds elapsed, attempting to kill" during B2G Emulator Build → Intermittent "command timed out: 14400 seconds elapsed, attempting to kill" during B2G emulator or device image builds
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #273)
> Any ideas on this one, catlee?
Flags: needinfo?(catlee)
So I'm looking at what's going on on bld-linux64-ix-036.

It has a pretty old version of 'repo' installed (v1.12.3.1 from September 2013; the latest is v1.12.16).

Running config.sh takes well over 45 minutes (65m38s for me), and is spending all of its time in 'git fetch'. None of the processes appear to be hung or wedged in any way, they're just slow. Upgrading to v1.12.16 allows config.sh to finish in 50s.

Network throughput tops out at only 2.5 MB/s, though.

Bug 1007259 should address the old 'repo' version.
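
When checking which machines carry an old 'repo', note that these version tags do not sort correctly as plain strings: "v1.12.3.1" compares greater than "v1.12.16" character by character. A small sketch of a numeric comparison, assuming tags always have the form `v` followed by dot-separated integers; `parse_repo_version` and `is_outdated` are hypothetical helper names, not part of 'repo' itself.

```python
def parse_repo_version(version):
    """Turn a tag such as 'v1.12.3.1' into a tuple of ints: (1, 12, 3, 1)."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def is_outdated(installed, latest="v1.12.16"):
    """True if the installed version is older than `latest`."""
    return parse_repo_version(installed) < parse_repo_version(latest)


if __name__ == "__main__":
    for version in ("v1.12.3.1", "v1.12.4", "v1.12.13", "v1.12.16"):
        print(version, "outdated" if is_outdated(version) else "current")
```

Machines whose version is reported as "unknown" would need separate handling, since `int()` raises `ValueError` on that input.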

Frequency counts per slave, with associated repo versions:
bld-linux64-spot-303 1 v1.12.13
bld-linux64-spot-139 1 unknown
bld-linux64-spot-388 1 unknown
bld-linux64-spot-342 2 v1.12.13
bld-linux64-spot-436 2 v1.12.16
bld-linux64-spot-491 2 v1.12.16
bld-linux64-spot-065 2 v1.12.16
bld-linux64-spot-102 2 v1.12.13
bld-linux64-spot-103 2 unknown
bld-linux64-spot-189 2 v1.12.13
bld-linux64-spot-180 2 v1.12.16
bld-linux64-spot-181 2 v1.12.13
bld-linux64-spot-182 2 v1.12.13
bld-linux64-spot-325 2 v1.12.13
bld-linux64-spot-455 2 v1.12.13
bld-linux64-spot-052 2 v1.12.16
bld-linux64-spot-458 2 v1.12.13
bld-linux64-spot-111 2 v1.12.13
bld-linux64-ix-031 2 v1.12.16
bld-linux64-spot-443 2 v1.12.16
bld-linux64-spot-444 2 v1.12.16
bld-linux64-spot-045 2 v1.12.16
bld-linux64-spot-043 2 unknown
bld-linux64-spot-126 2 v1.12.16
bld-linux64-spot-353 2 v1.12.13
bld-linux64-spot-143 3 v1.12.13
bld-linux64-spot-092 3 v1.12.16
bld-linux64-spot-394 3 v1.12.16
bld-linux64-spot-412 3 v1.12.16
bld-centos6-hp-025 3 v1.12.3.1
bld-linux64-spot-374 3 v1.12.13
bld-linux64-spot-348 3 v1.12.16
bld-linux64-spot-061 3 v1.12.16
bld-linux64-spot-055 3 unknown
bld-linux64-spot-118 3 v1.12.13
bld-linux64-spot-449 3 v1.12.16
bld-linux64-spot-120 3 v1.12.16
bld-linux64-spot-391 4 v1.12.16
bld-linux64-spot-484 4 v1.12.16
bld-linux64-spot-403 4 v1.12.16
bld-linux64-spot-076 4 v1.12.16
bld-linux64-spot-473 4 v1.12.16
bld-linux64-spot-467 4 v1.12.16
bld-linux64-spot-321 4 v1.12.16
bld-linux64-spot-323 4 v1.12.16
bld-linux64-spot-497 5 v1.12.13
bld-centos6-hp-016 6 v1.12.16
bld-linux64-ix-029 7 v1.12.16
bld-linux64-ix-037 8 v1.12.3.1
bld-centos6-hp-008 11 v1.12.4
bld-centos6-hp-012 14 v1.12.3.1
bld-linux64-ix-030 16 v1.12.3.1
bld-centos6-hp-018 17 v1.12.3.1
bld-linux64-ix-035 18 v1.12.3.1
bld-linux64-ix-036 18 v1.12.3.1
bld-centos6-hp-013 20 v1.12.3.1
bld-centos6-hp-009 20 v1.12.3.1
bld-linux64-ix-034 20 v1.12.3.1
bld-linux64-ix-033 21 v1.12.16
bld-linux64-ix-032 21 v1.12.3.1
bld-centos6-hp-019 27 v1.12.3.1
bld-centos6-hp-015 27 v1.12.3.1
bld-centos6-hp-006 27 v1.12.3.1
bld-linux64-ix-028 28 v1.12.3.1
bld-centos6-hp-007 37 v1.12.3.1

I'm going to go through and update 'repo' on machines with older versions.
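
To see how the timeouts cluster by 'repo' version, the per-slave counts can be summed per version. A rough sketch, assuming the three-column whitespace-separated layout of the table above; `failures_by_version` is a hypothetical helper, and `SAMPLE` holds only a handful of rows copied from that table, not the full data set.

```python
from collections import Counter

# A few rows copied from the table above: slave name, failure count, repo version.
SAMPLE = """\
bld-linux64-spot-303 1 v1.12.13
bld-linux64-spot-139 1 unknown
bld-linux64-spot-436 2 v1.12.16
bld-centos6-hp-008 11 v1.12.4
bld-linux64-ix-036 18 v1.12.3.1
bld-centos6-hp-007 37 v1.12.3.1
"""


def failures_by_version(text):
    """Sum the failure counts for each repo version string."""
    totals = Counter()
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        _slave, count, version = line.split()
        totals[version] += int(count)
    return totals


if __name__ == "__main__":
    for version, total in failures_by_version(SAMPLE).most_common():
        print(version, total)
```

Feeding the full table through the same function gives the complete breakdown. Version alone may not explain everything, since bld-linux64-ix-033 shows 21 timeouts on v1.12.16.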
Depends on: 1007259
Flags: needinfo?(catlee)
(In reply to Chris AtLee [:catlee] from comment #810)
> Running config.sh takes well over 45 minutes (65m38s for me), and is
> spending all of its time in 'git fetch'. None of the processes appear to be
> hung or wedged in any way, they're just slow. Upgrading to v1.12.16 allows
> config.sh to finish in 50s.

Wow - quite a difference! Nice one :-)
6 hours should be enough for anyone!
Attachment #8425796 - Flags: review?(aki)
Comment on attachment 8425796 [details] [diff] [review]
bump max time to 6 hours for emulator builds

a) yuck,
b) 6 hours + result is better than 4 hours + timed out.
Attachment #8425796 - Flags: review?(aki) → review+
Comment on attachment 8425796 [details] [diff] [review]
bump max time to 6 hours for emulator builds

yeah, sucks :(
Attachment #8425796 - Flags: checked-in+
Something here went live today
Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/2308]
Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/2308] → [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/2311]
Inactive; closing (see bug 1180138).
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WORKSFORME
Component: General Automation → General