Closed
Bug 961042
Opened 10 years ago
Closed 10 years ago
b2g_build.py checkout_sources() should attempt |repo sync| more than once & output a TBPL compatible failure message
Categories
(Release Engineering :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: emorley, Unassigned)
References
(Blocks 1 open bug)
Details
(Keywords: sheriffing-P1)
Attachments
(1 file)
2.15 KB, patch | catlee: review+ | catlee: checked-in+
To save having to open the full log (and to differentiate between the various buildbot "command timed out: 1200 seconds without output, attempting to kill" failures), we should:
1) Attempt |repo sync| more than once, so temporary network glitches are less likely to cause job failures.
2) Add a TBPL compatible failure message (eg: "Automation Error: Repo sync failed ...").

Happy to defer #1 to another bug if needed.

Current failures are of the form:

b2g_b2g-inbound_nexus-4_dep
https://tbpl.mozilla.org/php/getParsedLog.php?id=33144750&tree=B2g-Inbound
{
19:17:46 INFO - Running command: ['script', '-q', '-c', '/builds/slave/b2g_b2g-in_nexus-4_dep-0000000/build/repo sync'] in /builds/slave/b2g_b2g-in_nexus-4_dep-0000000/build
19:17:46 INFO - Copy/paste: script -q -c "/builds/slave/b2g_b2g-in_nexus-4_dep-0000000/build/repo sync"
19:17:46 INFO - Fetching project fake-libdvm
19:17:46 INFO - Fetching project device/generic/armv7-a-neon
19:17:46 INFO - Fetching project device-mako
19:17:46 INFO - Fetching project device/lge/mako-kernel
...
...
19:24:47 INFO - Fetching projects: 94% (123/130) Fetching project gonk-misc
19:24:48 INFO - Receiving objects: 88% (2596/2927), 51.30 MiB | 116 KiB/s
19:24:48 INFO -
19:24:48 INFO - Fetching projects: 95% (124/130) Fetching project platform_build
19:24:49 INFO - Receiving objects: 88% (2596/2927), 51.44 MiB | 118 KiB/s
19:24:49 INFO -
19:24:49 INFO - Fetching projects: 96% (125/130) Fetching project moztt
19:24:50 INFO - Fetching project rilproxy

command timed out: 3600 seconds without output, attempting to kill
process killed by signal 9
program finished with exit code -1
elapsedTime=4219.647320
========= Finished 'scripts/scripts/b2g_build.py --target ...' failed (results: 2, elapsed: 1 hrs, 10 mins, 19 secs) (at 2014-01-16 20:24:50.839430) =========
}

As far as I can tell, the relevant code is at:
http://hg.mozilla.org/build/mozharness/file/3f764317c8db/scripts/b2g_build.py#l527
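For illustration, the two requests above (retry, then emit a one-line failure message) could look roughly like the sketch below. This is not the mozharness code: `run_with_retries` and its parameters are hypothetical names, the wording after "Repo sync failed" is made up for the example, and it assumes the command reports failure through a non-zero exit code.

```python
import subprocess
import time


def run_with_retries(cmd, attempts=3, sleep_seconds=0,
                     failure_message="Automation Error: Repo sync failed"):
    """Run cmd up to `attempts` times and return (succeeded, log_lines).

    Hypothetical helper for illustration; not part of mozharness.
    Assumes a non-zero exit code means the command failed.
    """
    log_lines = []
    for attempt in range(1, attempts + 1):
        returncode = subprocess.call(cmd)
        if returncode == 0:
            return True, log_lines
        log_lines.append("attempt %d/%d exited with code %d"
                         % (attempt, attempts, returncode))
        if attempt < attempts:
            # Give a temporary network glitch a chance to clear.
            time.sleep(sleep_seconds)
    # A single recognisable line so the failure can be spotted
    # without opening the full log.
    log_lines.append("%s (gave up after %d attempts)"
                     % (failure_message, attempts))
    return False, log_lines
```

Calling it as `run_with_retries(["./repo", "sync"], attempts=3, sleep_seconds=60)` would try the sync three times and, if all three fail, end the returned lines with the "Automation Error: ..." message.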
Comment 1 • 10 years ago
This is full of suck.

We have to run 'repo' inside of a tool called 'script' to work around bug 857158 and not have git clones permafail. It turns out that 'script' always exits with 0, so aside from log parsing, we have no way to know if 'repo sync' succeeded.

I've been experimenting with tmux as a wrapper to provide a pty to repo instead. It seems to return a proper exit code at least!
Reporter
Comment 2 • 10 years ago
(In reply to Chris AtLee [:catlee] from comment #1)
> This is full of suck.
>
> We have to run 'repo' inside of a tool called 'script' to work around bug
> 857158 and not have git clones permafail. It turns out that 'script' always
> exits with 0, so aside from log parsing, we have no way to know if 'repo
> sync' succeeded.
>
> I've been experimenting with tmux as a wrapper to provide a pty to repo
> instead. It seems to return a proper exit code at least!

Hi Chris - I don't suppose there's a bug filed for this work (or is it this one?) - and have you had any luck with it? :-)
Flags: needinfo?(catlee)
Comment 3 • 10 years ago
Actually, I think the bulk of this was fixed as part of bug 970918. We're not running inside script any more, and we are retrying. What's left?
Flags: needinfo?(catlee)
Reporter
Comment 4 • 10 years ago
Ah great to know :-)

The only thing left is that it looks like we're not retrying, from the latest logs I see in bug 873928? eg:
https://tbpl.mozilla.org/php/getParsedLog.php?id=37522885&tree=B2g-Inbound

Thanks :-)
Depends on: 970918
Comment 5 • 10 years ago
That looks like a buildbot timeout
Comment 6 • 10 years ago
The buildbot timeout is set to 3600 seconds right now, so let's set the timeout for config.sh to 2700 (45 minutes).
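As a rough sketch of the idea (not the attached patch): give the script its own limit that is shorter than buildbot's, so the script gets to report the failure itself instead of being killed. The names below are hypothetical, `subprocess.run` needs Python 3, and this limit is wall-clock time whereas the buildbot message refers to time without output.

```python
import subprocess

BUILDBOT_TIMEOUT = 3600  # seconds, the buildbot limit mentioned above
SCRIPT_TIMEOUT = 2700    # seconds (45 minutes), the proposed limit


def run_with_timeout(cmd, timeout=SCRIPT_TIMEOUT):
    """Run cmd; return its exit code, or None if it exceeded `timeout`.

    Hypothetical helper for illustration; not part of mozharness.
    """
    try:
        completed = subprocess.run(cmd, timeout=timeout)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so the script
        # still has time to log a message before buildbot steps in.
        print("Automation Error: %s timed out after %d seconds"
              % (cmd[0], timeout))
        return None
    return completed.returncode
```

Keeping the script's limit below the buildbot one leaves a margin (here 900 seconds) for the script to log the error and exit on its own.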
Updated • 10 years ago
Attachment #8411965 - Flags: checked-in+
Comment 8 • 10 years ago
mozharness patch is in production: http://hg.mozilla.org/build/mozharness/rev/9accabdd4358 :)
Updated • 10 years ago
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Assignee
Updated • 6 years ago
Component: General Automation → General