Closed Bug 1626385 Opened 4 years ago Closed 4 years ago

Capturing a Gecko profile on Linux causes SIGSEGV on Socket process

Categories

(Core :: Gecko Profiler, defect, P1)

Unspecified
Linux
defect

Tracking

()

RESOLVED FIXED
mozilla76
Tracking Status
firefox-esr68 --- unaffected
firefox74 --- unaffected
firefox75 --- unaffected
firefox76 --- fixed

People

(Reporter: brennan.brisad, Assigned: mjf)

References

(Regression)

Details

(Keywords: regression)

Attachments

(1 file)

STR:

(Only works once. I need to restart the browser each time to get a new socket process to crash.)

  • To observe the SIGSEGV, run a tool such as exitsnoop in a separate terminal window. Passing -x tells it to trace only failed exits.
# exitsnoop -x
  • In Firefox, enable the profiler button.
  • Start the profiler by clicking "Start Recording" in the profiler popup.
  • Click "Capture" in the same popup.

Outcome:

The socket process segfaults.

(in the exitsnoop terminal window)
PCOMM            PID    PPID   TID    AGE(s)  EXIT_CODE 
BHMgr Processor  22965  22941  22975  24.53   signal 11 (SEGV)
Socket Process   22965  22941  22965  24.56   signal 11 (SEGV)
Socket Thread    22965  22941  23013  24.38   signal 11 (SEGV)
Chrome_~dThread  22965  22941  22973  24.54   signal 11 (SEGV)
SamplerThread    22965  22941  23158  0.01    signal 11 (SEGV)
BHMgr Monitor    22965  22941  22974  24.53   signal 11 (SEGV)
ProfilerChild    22965  22941  23012  24.39   signal 11 (SEGV)
$ ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
michael  22965 22941  1 21:13 pts/0    00:00:01 [Socket Process] <defunct>

Nightly will output this warning in the console:

[Socket 22965, ProfilerChild] WARNING: failed to open shm: Permission denied: file /builds/worker/checkouts/gecko/ipc/chromium/src/base/shared_memory_posix.cc, line 246

A debug build of Nightly will also emit this message:

Program /home/michael/gecko/objdir-frontend/dist/bin/firefox-bin (pid = 22965) received signal 11.

Investigation

I ran a debug build and attached GDB at the point of the crash. Here's a paste of the relevant stack frames.

#2  0x00007fec26c62087 in ah_crap_handler(int) ()
    at /home/michael/gecko/objdir-frontend/dist/bin/libxul.so
#3  0x00007fec26c62184 in InstallSignalHandlers(char const*) ()
    at /home/michael/gecko/objdir-frontend/dist/bin/libxul.so
#4  0x00007fec2f92a840 in <signal handler called> () at /lib/x86_64-linux-gnu/libc.so.6
#5  0x00007fec2f995256 in __memmove_sse2_unaligned_erms ()
    at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:416
#6  0x00007fec26960fa2 in ChunkedJSONWriteFunc::CopyDataIntoLazilyAllocatedBuffer(std::function<char* (unsigned long)> const&) const () at /home/michael/gecko/objdir-frontend/dist/bin/libxul.so
#7  0x00007fec26971f58 in profiler_get_profile_json_into_lazily_allocated_buffer(std::function<char* (unsigned long)> const&, double, bool) () at /home/michael/gecko/objdir-frontend/dist/bin/libxul.so

When ChunkedJSONWriteFunc::CopyDataIntoLazilyAllocatedBuffer tries to allocate shared memory in the Socket process, the eventual call to shm_open fails with EACCES. The allocation callback then returns a null pointer, and the subsequent memory copy into that pointer (the memmove in frame #5) raises the SIGSEGV that crashes the process.

I bisected the issue with git and found that it started happening after Bug 1608558 landed, so the new sandboxing appears to be the cause of the EACCES error.

Regressed by: 1608558
Has Regression Range: --- → yes

Thank you, Michael, for reporting this issue and for the excellent analysis.

A quick fix to stop crashing would be to test the returned pointer in ChunkedJSONWriteFunc::CopyDataIntoLazilyAllocatedBuffer (it should be done anyway).
Of course we'd also want to get this working again in the sandbox.
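
The null check suggested above could look roughly like the following. This is a minimal, self-contained sketch and not the actual Gecko code: ChunkedText, TotalSize, and CopyInto are hypothetical stand-ins, and only the callback shape (a std::function taking a size and returning char*) is taken from the stack trace. The idea is simply to bail out before copying when the allocator hands back a null pointer.

```cpp
#include <cstddef>
#include <cstring>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the chunked JSON writer: text held in chunks.
struct ChunkedText {
  std::vector<std::string> chunks;

  // Total bytes needed for all chunks plus the trailing NUL.
  std::size_t TotalSize() const {
    std::size_t n = 1;
    for (const auto& c : chunks) n += c.size();
    return n;
  }

  // Asks `allocate` for a buffer and copies every chunk into it.
  // Returns false, without copying, if the allocator returns nullptr
  // (for example because shared memory could not be opened).
  bool CopyInto(const std::function<char*(std::size_t)>& allocate) const {
    char* dest = allocate(TotalSize());
    if (!dest) {
      return false;  // Nothing to copy into; report failure, do not crash.
    }
    char* p = dest;
    for (const auto& c : chunks) {
      std::memcpy(p, c.data(), c.size());
      p += c.size();
    }
    *p = '\0';
    return true;
  }
};
```

In this sketch the caller decides what to do when CopyInto returns false (for instance, send back an empty profile). It only avoids the crash; it does not make shared memory available inside the sandbox.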

Priority: -- → P1

I confirmed that this is sandboxing-related by setting the MOZ_DISABLE_SOCKET_PROCESS_SANDBOX environment variable. I should have a fix ready quickly.

Assignee: nobody → mfroman
Pushed by mfroman@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/50fbb98895f9
allow shmem in linux sandbox for socket process to support profiler. r=gcp
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla76