Closed Bug 1778052 Opened 3 months ago Closed 2 months ago

All tabs crash when the terminal (emulator) firefox was run from is closed, even if firefox was run with nohup

Categories

(Core :: Security: Process Sandboxing, defect, P2)

Firefox 91
x86_64
Linux
defect

Tracking

()

VERIFIED FIXED
105 Branch
Tracking Status
firefox105 --- verified

People

(Reporter: chrishen, Assigned: jld)

References

Details

Attachments

(1 file)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0

Steps to reproduce:

  1. Open an xterm (most likely works for any terminal emulator, but this is the one I have confirmed it with).
  2. Type nohup firefox & and hit Enter
  3. [firefox opens]
  4. Open any webpage(s) in one or more tabs.
  5. Close the xterm used for step (2).

Actual results:

All open tabs crash.

Expected results:

Nothing; a program run with nohup should ignore SIGHUP, and therefore should not react to the terminal (emulator) that it was run from exiting.

The Bugbug bot thinks this bug should belong to the 'Core::Widget: Gtk' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → Widget: Gtk
Product: Firefox → Core

I'm not sure which of Core: IPC, Core: Dom: Content Processes, or Firefox: Tabbed Browser this belongs in (I suspect the real answer depends on the underlying cause), but I'm fairly confident that it doesn't belong in Core: Widget: Gtk. I ended up picking Core: IPC as much out of a sense that people watching that component are most likely to have insight on why Firefox reacts the way it does to SIGHUP as out of confidence that that's the theoretically correct choice.

I see that my initial report wasn't unambiguous, so I want to be clear: When this happens, neither windows nor tabs close; rather, every tab's content is replaced by the "Gah. Your tab just crashed" dialogue.

I just did a further test, and, after all the steps above were taken and I confirmed that the parent process (the one that just shows up as firefox-esr in ps -eF) had been reparented to pid 1:

  • sending SIGHUP to one of its child processes (e.g., kill -SIGHUP 139930, where 139930 is one of the processes with a ps -eF listing that looks like [...] /usr/lib/firefox-esr/firefox-esr -contentproc -childID [NN] -isForBrowser -prefsLen [NNNN] -prefMapSize [NNNNNN] -jsInit [NNNNNN] -parentBuildID [date, followed by addition digits] -appdir /usr/lib/firefox-esr/browser [parent process id] true tab) will crash one or more tabs (as above) and
  • sending SIGHUP to all of the child process (i.e., killall -SIGHUP /usr/lib/firefox-esr/firefox-esr) will crash all of them, but
  • sending SIGHUP to just the parent process (again) does not (at least, consistently) crash any tabs.
Component: Widget: Gtk → IPC
OS: Unspecified → Linux
Hardware: Unspecified → x86_64

The cause of this is actually at this point in the sandboxing code, which resets all signal handlers. I wasn't thinking about SIG_IGN being set by something external and inherited across exec when I wrote that. What it should probably do is reset them to SIG_DFL only if they weren't previously SIG_IGN, like what the kernel does during exec (and see also this code in nsProfileLock which installs a signal handler with the same exception).

As a workaround, using setsid (and output redirection if needed) in place of nohup should avoid this problem.

Assignee: nobody → jld
Severity: -- → S3
Status: UNCONFIRMED → NEW
Component: IPC → Security: Process Sandboxing
Ever confirmed: true
Priority: -- → P2

We uninstall signal handlers in child processes after clone(), because
they probably won't do the right thing if invoked in that context.
However, the current code also resets signals which were ignored;
if that disposition was set by an outside program like nohup, the
expectation is that it should be inherited. This patch omits those
signals when resetting handlers (similar to what exec does).

Pushed by jedavis@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/1e4c845297d7
Don't reset ignored signals when starting a sandboxed child process on Linux. r=gcp

So the problem here is:

  • Signals 32 and 33 are reserved by glibc (I think?) and attempts to touch them fail with EINVAL
  • That caused log messages using strerror, which is async signal unsafe: bug 1779312
  • The child process would deadlock, which caused the test suite to fail (especially on debug builds, where the parent process waits during shutdown for child processes to exit)

I'd forgotten about the first part of this, and the second one is something I should've fixed a long time ago but the second best time to fix it is now….

Depends on: 1779312
Flags: needinfo?(jld)

There's a r+ patch which didn't land and no activity in this bug for 2 weeks.
:jld, could you have a look please?
If you still have some work to do, you can add an action "Plan Changes" in Phabricator.
For more information, please visit auto_nag documentation.

Flags: needinfo?(jld)
Flags: needinfo?(gpascutto)
Flags: needinfo?(jld)
Flags: needinfo?(gpascutto)
Pushed by jedavis@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8e1a65ad0c4d
Don't reset ignored signals when starting a sandboxed child process on Linux. r=gcp
Status: NEW → RESOLVED
Closed: 2 months ago
Resolution: --- → FIXED
Target Milestone: --- → 105 Branch
Flags: qe-verify+

Reproduced the tab crashes using the steps from comment 0 using an old Nightly from 2022-07-05, verified that using the latest Beta build 105.0b4 (on Ubuntu 18.04) the tabs don't crash anymore when closing xterm.

Status: RESOLVED → VERIFIED
Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.