Closed Bug 1850428 Opened 2 years ago Closed 2 years ago

switch from timeout command to subprocess.run timeout argument

Categories

(Socorro :: Processor, enhancement, P2)

enhancement

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: willkg, Assigned: willkg)

References

Details

Attachments

(1 file)

The Socorro processor was written a million years ago and it's possible that whatever was running the stackwalker as a sub process didn't have a timeout or it was flaky. Because of that, we run the stackwalker using the timeout gnu coreutils command to prevent it from running forever.

Nowadays, we're using subprocess.run which has a timeout argument. Under the hood, it uses threads to keep track of the sub process and timeout. Contrast that with the timeout gnu coreutils command which uses a child process.

Roughly what we have now:

import subprocess
subprocess.run(["timeout", "--signal", "KILL", "100", "sleep", "100"], capture_output=True)
$ python subprocesstest.py &
[1] 253021
$ ps -f
UID          PID    PPID  C STIME TTY          TIME CMD
willkg    244711    6089  0 17:36 pts/3    00:00:00 /bin/bash
willkg    253021  244711  0 17:43 pts/3    00:00:00 /home/willkg/.pyenv/versions/3.10.12/bin/python subprocesstest.py
willkg    253149  253021  0 17:43 pts/3    00:00:00 timeout --signal KILL 100 sleep 100
willkg    253151  253149  0 17:43 pts/3    00:00:00 sleep 100
willkg    253255  244711 99 17:43 pts/3    00:00:00 ps -f

Wildly, if you kill the python subprocesstest.py process, the sleep 100 process continues:

$ fg
python subprocesstest.py
^C
Traceback (most recent call last):
  File "/home/willkg/mozilla/socorro/subprocesstest.py", line 3, in <module>
    subprocess.run(["timeout", "--signal", "KILL", "100", "sleep", "100"], capture_output=True)
  File "/home/willkg/.pyenv/versions/3.10.12/lib/python3.10/subprocess.py", line 505, in run
    stdout, stderr = process.communicate(input, timeout=timeout)
  File "/home/willkg/.pyenv/versions/3.10.12/lib/python3.10/subprocess.py", line 1154, in communicate
    stdout, stderr = self._communicate(input, endtime, timeout)
  File "/home/willkg/.pyenv/versions/3.10.12/lib/python3.10/subprocess.py", line 2021, in _communicate
    ready = selector.select(timeout)
  File "/home/willkg/.pyenv/versions/3.10.12/lib/python3.10/selectors.py", line 416, in select
    fd_event_list = self._selector.poll(timeout)
KeyboardInterrupt
$ ps -f
UID          PID    PPID  C STIME TTY          TIME CMD
willkg    244711    6089  0 17:36 pts/3    00:00:00 /bin/bash
willkg    253151    3125  0 17:43 pts/3    00:00:00 sleep 100
willkg    253765  244711  0 17:44 pts/3    00:00:00 ps -f

What I want to switch to:

import subprocess

subprocess.run(["sleep", "100"], timeout=100, capture_output=True)
$ python subprocesstest.py &
[1] 255047
$ ps -f
UID          PID    PPID  C STIME TTY          TIME CMD
willkg    244711    6089  0 17:36 pts/3    00:00:00 /bin/bash
willkg    255047  244711  1 17:46 pts/3    00:00:00 /home/willkg/.pyenv/versions/3.10.12/bin/python subprocesstest.py
willkg    255198  255047  0 17:46 pts/3    00:00:00 sleep 100
willkg    255275  244711  0 17:46 pts/3    00:00:00 ps -f

If we kill python subprocesstest.py in this case, we don't end up with orphaned processes:

$ fg
python subprocesstest.py
^C
Traceback (most recent call last):
  File "/home/willkg/mozilla/socorro/subprocesstest.py", line 3, in <module>
    subprocess.run(["sleep", "100"], timeout=100, capture_output=True)
  File "/home/willkg/.pyenv/versions/3.10.12/lib/python3.10/subprocess.py", line 505, in run
    stdout, stderr = process.communicate(input, timeout=timeout)
  File "/home/willkg/.pyenv/versions/3.10.12/lib/python3.10/subprocess.py", line 1154, in communicate
    stdout, stderr = self._communicate(input, endtime, timeout)
  File "/home/willkg/.pyenv/versions/3.10.12/lib/python3.10/subprocess.py", line 2021, in _communicate
    ready = selector.select(timeout)
  File "/home/willkg/.pyenv/versions/3.10.12/lib/python3.10/selectors.py", line 416, in select
    fd_event_list = self._selector.poll(timeout)
KeyboardInterrupt
$ ps -f
UID          PID    PPID  C STIME TTY          TIME CMD
willkg    244711    6089  0 17:36 pts/3    00:00:00 /bin/bash
willkg    255597  244711 99 17:46 pts/3    00:00:00 ps -f

This bug covers switching from timeout gnu coreutils command to using subprocess.run with a timeout.

I deployed this to prod just now in bug #1851648. Marking as FIXED.

Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: