Capturing a Gecko profile on Linux causes SIGSEGV on Socket process
Categories
(Core :: Gecko Profiler, defect, P1)
Tracking
Version | Tracking | Status |
---|---|---|
firefox-esr68 | --- | unaffected |
firefox74 | --- | unaffected |
firefox75 | --- | unaffected |
firefox76 | --- | fixed |
People
(Reporter: brennan.brisad, Assigned: mjf)
References
(Regression)
Details
(Keywords: regression)
Attachments
(1 file)
STR:
(Only works once. I need to restart the browser each time to get a new socket process to crash.)
- To observe the SIGSEGV, run a tool such as exitsnoop in a separate terminal window. Passing -x tells it to trace only failed exits.
# exitsnoop -x
- In Firefox, enable the profiler button.
- Start the profiler by clicking "Start Recording" in the profiler popup.
- Click "Capture" in the same popup.
Outcome:
The socket process segfaults.
(in the exitsnoop terminal window)
PCOMM PID PPID TID AGE(s) EXIT_CODE
BHMgr Processor 22965 22941 22975 24.53 signal 11 (SEGV)
Socket Process 22965 22941 22965 24.56 signal 11 (SEGV)
Socket Thread 22965 22941 23013 24.38 signal 11 (SEGV)
Chrome_~dThread 22965 22941 22973 24.54 signal 11 (SEGV)
SamplerThread 22965 22941 23158 0.01 signal 11 (SEGV)
BHMgr Monitor 22965 22941 22974 24.53 signal 11 (SEGV)
ProfilerChild 22965 22941 23012 24.39 signal 11 (SEGV)
$ ps -ef
UID PID PPID C STIME TTY TIME CMD
michael 22965 22941 1 21:13 pts/0 00:00:01 [Socket Process] <defunct>
Nightly will output this warning in the console:
[Socket 22965, ProfilerChild] WARNING: failed to open shm: Permission denied: file /builds/worker/checkouts/gecko/ipc/chromium/src/base/shared_memory_posix.cc, line 246
A debug build of Nightly will also emit this message:
Program /home/michael/gecko/objdir-frontend/dist/bin/firefox-bin (pid = 22965) received signal 11.
Investigation
I ran a debug build and attached GDB at the point of the crash. Here's a paste of the relevant stack frames.
#2 0x00007fec26c62087 in ah_crap_handler(int) ()
at /home/michael/gecko/objdir-frontend/dist/bin/libxul.so
#3 0x00007fec26c62184 in InstallSignalHandlers(char const*) ()
at /home/michael/gecko/objdir-frontend/dist/bin/libxul.so
#4 0x00007fec2f92a840 in <signal handler called> () at /lib/x86_64-linux-gnu/libc.so.6
#5 0x00007fec2f995256 in __memmove_sse2_unaligned_erms ()
at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:416
#6 0x00007fec26960fa2 in ChunkedJSONWriteFunc::CopyDataIntoLazilyAllocatedBuffer(std::function<char* (unsigned long)> const&) const () at /home/michael/gecko/objdir-frontend/dist/bin/libxul.so
#7 0x00007fec26971f58 in profiler_get_profile_json_into_lazily_allocated_buffer(std::function<char* (unsigned long)> const&, double, bool) () at /home/michael/gecko/objdir-frontend/dist/bin/libxul.so
When ChunkedJSONWriteFunc::CopyDataIntoLazilyAllocatedBuffer tries to allocate shared memory, the eventual call to shm_open fails with EACCES if the calling process is the Socket process. A null pointer is returned, and the subsequent memcpy into it causes the SIGSEGV and the crash.
I git-bisected the issue and found that it started happening after Bug 1608558 landed. So it seems the new sandboxing is the cause of the EACCES error.
Thank you Michael for reporting this issue, and the excellent analysis.
A quick fix to stop the crash would be to test the returned pointer in ChunkedJSONWriteFunc::CopyDataIntoLazilyAllocatedBuffer (it should be done anyway).
Of course we'd also want to get this working again in the sandbox.
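The null check suggested above could look roughly like the sketch below. This is not the Gecko implementation: the class name ChunkedWriterSketch, the Write helper, the bool return value, and the trailing-NUL convention are all assumptions made for illustration. Only the allocator signature, std::function<char* (unsigned long)>, is taken from the stack trace.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the profiler's chunked JSON writer. It stores the
// output as a list of chunks and copies them into a buffer that the caller
// allocates on demand (for example, from shared memory).
class ChunkedWriterSketch {
 public:
  void Write(const std::string& aChunk) { mChunks.push_back(aChunk); }

  // Assumed contract: returns true on success, and returns false without
  // writing anything when the allocator reports failure via nullptr (as a
  // shared-memory allocator might when shm_open fails with EACCES).
  bool CopyDataIntoLazilyAllocatedBuffer(
      const std::function<char*(std::size_t)>& aAllocator) const {
    std::size_t totalLen = 1;  // room for the trailing '\0'
    for (const std::string& chunk : mChunks) {
      totalLen += chunk.size();
    }

    char* dest = aAllocator(totalLen);
    if (!dest) {
      // Without this check, the memcpy below would write through a null
      // pointer and the process would receive SIGSEGV.
      return false;
    }

    char* cursor = dest;
    for (const std::string& chunk : mChunks) {
      std::memcpy(cursor, chunk.data(), chunk.size());
      cursor += chunk.size();
    }
    *cursor = '\0';
    // Sanity check: we wrote exactly the number of bytes we asked for.
    assert(static_cast<std::size_t>(cursor - dest) + 1 == totalLen);
    return true;
  }

 private:
  std::vector<std::string> mChunks;
};
```

With a check like this, a caller whose allocator fails can skip sending the profile for that process instead of crashing. Getting shared memory to work inside the sandbox is a separate change.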
Comment 2•4 years ago (Assignee)
I confirmed this is sandboxing-related by setting the MOZ_DISABLE_SOCKET_PROCESS_SANDBOX environment variable. I should have a fix ready soon.
Comment 3•4 years ago (Assignee)
Pushed by mfroman@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/50fbb98895f9
allow shmem in linux sandbox for socket process to support profiler. r=gcp
Comment 5•4 years ago (bugherder)