Add the build step or else process name to buildbot's generic command timed out failure strings



Release Engineering
General Automation
3 years ago
3 years ago


(Reporter: emorley, Assigned: emorley)


(Depends on: 1 bug, Blocks: 1 bug, {sheriffing-P1})

Firefox Tracking Flags

(Not tracked)



(1 attachment)

Bug 778688 comment 38 covers a number of intermittent failures where we have the generic log output:
"command timed out: N seconds without output, attempting to kill"

Whilst we've added this to the TBPL regexes so that we can use TBPL's bug suggestion feature, the messages are generic, so many suggestions are shown, eg:

Whilst I'd prefer the worst of these failure modes to be handled by the mozharness/test harness/... itself, we're always going to have edge cases where timeouts occur and it's not worth adding TBPL-compatible failure messages to that script.

As such, I was thinking we should prefix the timeout messages with the build step name or else the process name (former preferred) here:
    def doTimeout(self):
        self.timer = None
        msg = "command timed out: %d seconds without output" % self.timeout
        self.kill(msg)

    def doMaxTimeout(self):
        self.maxTimer = None
        msg = "command timed out: %d seconds elapsed" % self.maxTime
        self.kill(msg)

Now I know buildbot patches are generally a bit more awkward, so I don't know whether you think we would need to upstream this first, or even whether they'd take the change?

Dustin, what do you think? :-)
Blocks: 778688
I'd like to see that upstream, sure.

Shipping a change to non-Windows systems is pretty easy - it's done with Puppet.  Windows is still hard.
Upstream PR:
Assignee: nobody → emorley
Duplicate of this bug: 778690
Created attachment 8408096 [details] [diff] [review]
For timeouts include the command being run in the failure string

Backport of upstream commit:

I've checked that we won't break any of the current regexes:
Attachment #8408096 - Flags: review?(dustin)
Comment on attachment 8408096 [details] [diff] [review]
For timeouts include the command being run in the failure string

Assuming you're confident that fake_command works the same way in 0.8.2, this looks just like the patch I merged :)
Attachment #8408096 - Flags: review?(dustin) → review+
Landed on default & transplanted to production-0.8, since there are buildbot master changes that require a restart and that we did not want to merge across just yet.
Depends on: 1009584
Whiteboard: [waiting on bug 1009584]
This is still waiting for bug 1009584 to actually be deployed, but closing this so it stops appearing in bugzilla-todos.
Last Resolved: 3 years ago
Resolution: --- → FIXED
Whiteboard: [waiting on bug 1009584] → [waiting on bug 1009584 for deployment]
This is deployed on !Windows; bug 1042597 will take care of Windows.

@Sheriffs: Note this bug changes "command timed out: 2400 seconds without output" to "command timed out: 2400 seconds without output running <cmd...>"
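
As a quick way to see why an unanchored pattern keeps working after this change, the sketch below checks one pattern against the old and new strings. The pattern is a hypothetical example, not one of the real TBPL regexes, and `<cmd...>` is left as the literal placeholder from the note above.

```python
import re

# Hypothetical pattern in the style of a log-matching regex (assumption:
# it is not anchored at the end of the line).
TIMEOUT_PATTERN = re.compile(r"command timed out: \d+ seconds without output")

old_message = "command timed out: 2400 seconds without output"
new_message = old_message + " running <cmd...>"


def matches_timeout(line):
    """Return True if the line contains the generic timeout text."""
    return TIMEOUT_PATTERN.search(line) is not None


for message in (old_message, new_message):
    print(message, "->", matches_timeout(message))
```

A pattern that ended in `$` or otherwise required the line to stop after "without output" would no longer match the new string, so that kind of regex would need adjusting.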
Depends on: 1042597
Whiteboard: [waiting on bug 1009584 for deployment]