retry.py didn't actually kill process tree for a timed-out pushsnip

RESOLVED WONTFIX

Status

Release Engineering
Release Automation: Other
P2
normal
RESOLVED WONTFIX
7 years ago
4 years ago

People

(Reporter: nthomas, Assigned: nthomas)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments)

(Assignee)

Description

7 years ago
Created attachment 557074 [details]
Process list

In bug 683412 we timed out a pushsnip, but on inspection of the process list on aus2-staging there was a sync to Phoenix still running. This was actually advantageous, but I bet it wasn't expected.
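For illustration only, here is a minimal sketch of timing out a command and killing its whole local process group. It is not the actual retry.py code, and the function name run_with_timeout is hypothetical. It also covers only processes on the local machine; a process started on a remote host over ssh is not in this group and can keep running after the local side is killed.

```python
import os
import signal
import subprocess


def run_with_timeout(argv, timeout):
    """Run argv; on timeout, kill the whole local process group.

    Returns the exit code, or None if the command timed out.
    (Hypothetical helper, not taken from retry.py.)
    """
    # start_new_session=True makes the child the leader of a new session and
    # process group, so its process-group id equals its pid.
    proc = subprocess.Popen(argv, start_new_session=True)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Signal every process in the group, not just the direct child.
        os.killpg(proc.pid, signal.SIGKILL)
        proc.wait()  # reap the child so it does not linger as a zombie
        return None
```

Calling run_with_timeout(["sh", "-c", "sleep 30 & sleep 30"], 0.5) returns None once both sleep processes have been signalled.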

Updated

7 years ago
OS: Mac OS X → All
Priority: -- → P3
Hardware: x86 → All
(Assignee)

Comment 1

7 years ago
Hit this again today. Based on http://superuser.com/questions/20679/why-does-my-remote-process-still-run-after-killing-an-ssh-session, we should give the -t argument to ssh.
Assignee: nobody → nrthomas
Priority: P3 → P2
(Assignee)

Comment 2

7 years ago
Created attachment 591341 [details] [diff] [review]
[buildbotcustom] use -t with ssh, bump timeout for pushsnip

For 10.0b6 the pushsnip of the test snippets failed on all five retry attempts, often while pushing to PHX. This should help by not leaving running processes slowing down NFS (ever moar!), and by setting the pushsnip timeout to the same 2 hours that backupsnip gets.
Attachment #591341 - Flags: review?(rail)
Attachment #591341 - Flags: review?(rail) → review+
(Assignee)

Comment 3

7 years ago
Comment on attachment 591341 [details] [diff] [review]
[buildbotcustom] use -t with ssh, bump timeout for pushsnip

http://hg.mozilla.org/build/buildbotcustom/rev/83f17929c032
Attachment #591341 - Flags: checked-in+
This landed in production today.
(Assignee)

Comment 5

7 years ago
May not be working, from the 3.6.26 build2 log:
  Pseudo-terminal will not be allocated because stdin is not a terminal.
which could be fallout from having 
  using PTY: False
when making the call from buildbot.
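The ssh warning above is consistent with the caller's stdin not being a terminal. The sketch below is not buildbot or ssh code; it only illustrates, with Python's standard pty module on a POSIX system, how a child process sees its stdin with and without a pseudo-terminal. The helper name child_sees_tty is made up for this example.

```python
import os
import pty
import subprocess
import sys

# One-liner run in the child: report whether its stdin is a terminal.
CHECK = "import sys; print(sys.stdin.isatty())"


def child_sees_tty(use_pty):
    """Return True if a child process sees a terminal on its stdin."""
    if use_pty:
        # Allocate a pseudo-terminal pair and hand the slave end to the child.
        master_fd, slave_fd = pty.openpty()
        try:
            out = subprocess.check_output(
                [sys.executable, "-c", CHECK], stdin=slave_fd
            )
        finally:
            os.close(master_fd)
            os.close(slave_fd)
    else:
        # No PTY: the child's stdin is /dev/null, which is not a terminal.
        out = subprocess.check_output(
            [sys.executable, "-c", CHECK], stdin=subprocess.DEVNULL
        )
    return out.decode().strip() == "True"
```

With use_pty=False the child is in the same position as a command launched with "using PTY: False", which is the case where a single -t is not enough for ssh to allocate a tty.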
(Assignee)

Comment 6

7 years ago
It does look like the processes are being cleaned up properly though. Needs more investigation.
(In reply to Nick Thomas [:nthomas] from comment #5)
> May not be working, from the 3.6.26 build2 log:
>   Pseudo-terminal will not be allocated because stdin is not a terminal.
> which could be fallout from having 
>   using PTY: False
> when making the call from buildbot.

Likely we need (or want) to add |usePTY=True| to the RetryingShellCommand step specifically.

The default for steps is of course to use whatever the slave is configured with (and we currently configure it as "do not use a PTY"), but each ShellCommand-based step can be set up to override this behavior with that arg.
Mass move of bugs to Release Automation component.
Component: Release Engineering → Release Engineering: Automation (Release Automation)
Flags: checked-in+
No longer blocks: 714371
(Assignee)

Comment 9

5 years ago
Probably needs to be ssh -tt based on the man page:

-t      Force pseudo-tty allocation.  This can be used to execute arbitrary screen-based programs on a remote machine, which can be very useful, e.g. when implementing menu services.  Multiple -t options force tty allocation, even if ssh has no local tty.
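As a rough sketch of what that would look like from Python, the helper below builds an ssh argument list with -tt. The function name build_ssh_argv, the host name, and the remote command arguments are all hypothetical; only the -tt option itself comes from the man page quoted above.

```python
import shlex


def build_ssh_argv(host, remote_argv, force_tty=True):
    """Build an ssh command line as a list of arguments.

    With force_tty=True, -tt is passed so that ssh allocates a remote tty
    even when the local stdin is not a terminal.
    """
    argv = ["ssh"]
    if force_tty:
        argv.append("-tt")
    argv.append(host)
    # ssh passes the remote command as a single string to the remote shell,
    # so quote each argument before joining.
    argv.append(" ".join(shlex.quote(arg) for arg in remote_argv))
    return argv
```

The resulting list can be handed to subprocess.run(argv, timeout=...) or a similar call. Whether -tt actually cleans up the remote processes in this setup was not confirmed in this bug.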
Product: mozilla.org → Release Engineering
pushsnip is going away in the foreseeable future, so we're probably not going to fix this
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → WONTFIX