Closed Bug 1477037 Opened Last year Closed 9 months ago

asan-nightly-project: Startup crash with layers.gpu-process.force-enabled;true on Linux

Categories

(Core :: Graphics, defect, P3)

x86_64
Linux
defect

Tracking

()

VERIFIED FIXED
mozilla65
Tracking Status
firefox63 --- disabled
firefox64 --- disabled
firefox65 --- fixed

People

(Reporter: darkspirit, Assigned: darkspirit)

References

(Blocks 1 open bug)

Details

(Keywords: nightly-community)

Attachments

(1 file)

> https://blog.mozilla.org/security/2018/07/19/introducing-the-asan-nightly-project/

Debian Testing, KDE, Xorg, GTX 1060

Since I test WebRender, I have also enabled the GPU process for some more stability.

The GPU process is supported on Windows and Linux, but is enabled by default only on Windows.
So far I was only aware of two GPU process bugs:
* bug 1406230 (GPU Process can't be used together with OOP Webextensions)
* bug 1415020 comment 8 (GPU Process + OpenGL + Multiple Windows)

ASan startup crash:

=================================================================
==13005==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x0000004343c0 bp 0x7ffdf6ac0770 sp 0x7ffdf6abfef0 T0)
==13005==The signal is caused by a READ memory access.
==13005==Hint: address points to the zero page.
    #0 0x4343bf in __interceptor_strcmp /builds/worker/workspace/moz-toolchain/src/llvm/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc
    #1 0x7f1aef022ad2 in mozilla::gfx::GPUProcessImpl::Init(int, char**) /builds/worker/workspace/build/src/gfx/ipc/GPUProcessImpl.cpp:36:9
    #2 0x7f1af7dd9edb in XRE_InitChildProcess(int, char**, XREChildData const*) /builds/worker/workspace/build/src/toolkit/xre/nsEmbedFunctions.cpp:721:21
    #3 0x4f1cf4 in content_process_main /builds/worker/workspace/build/src/browser/app/../../ipc/contentproc/plugin-container.cpp:50:30
    #4 0x4f1cf4 in main /builds/worker/workspace/build/src/browser/app/nsBrowserApp.cpp:287
    #5 0x7f1b0b104a86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21a86)
    #6 0x421128 in _start (/home/darkspirit/firefox/firefox+0x421128)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /builds/worker/workspace/moz-toolchain/src/llvm/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc in __interceptor_strcmp
==13005==ABORTING
Overkill. I shouldn't have ticked "Security", but can't remove it now. Sorry.
Unhiding per comment 1.
Group: core-security
Thanks for filing the bug, I have been seeing this as well in the ASan crash reports, wondering what it was about.

Is this reproducible in any way?
1. Create a new profile. Start it.
2. Set layers.gpu-process.force-enabled;true (and set asanreporter.clientid).
3. Close the window.
4. Click on "Launch profile in new browser".
5. This (new) profile only has a short hiccup, generates a new asan report and shows its window.

My main profile didn't open a window at all until I set layers.gpu-process.force-enabled back to false.
Priority: -- → P3
I think I get basically the same for the RDD process (bug 1471535):

=================================================================
==23598==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x55ab8c08ce90 bp 0x7ffce9090a50 sp 0x7ffce90901d0 T0)
==23598==The signal is caused by a READ memory access.
==23598==Hint: address points to the zero page.
    #0 0x55ab8c08ce8f in __interceptor_strcmp /builds/worker/workspace/moz-toolchain/src/llvm/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc
    #1 0x7f1b875dc9f2 in mozilla::RDDProcessImpl::Init(int, char**) /builds/worker/workspace/build/src/dom/media/ipc/RDDProcessImpl.cpp:29:9
    #2 0x7f1b8cb83bfd in XRE_InitChildProcess(int, char**, XREChildData const*) /builds/worker/workspace/build/src/toolkit/xre/nsEmbedFunctions.cpp:761:21
    #3 0x55ab8c1533d4 in content_process_main /builds/worker/workspace/build/src/browser/app/../../ipc/contentproc/plugin-container.cpp:50:30
    #4 0x55ab8c1533d4 in main /builds/worker/workspace/build/src/browser/app/nsBrowserApp.cpp:287
    #5 0x7f1b9a7d7b16 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x22b16)
    #6 0x55ab8c078aa8 in _start (/home/darkspirit/firefox/firefox+0x29aa8)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /builds/worker/workspace/moz-toolchain/src/llvm/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc in __interceptor_strcmp
==23598==ABORTING
Is this happening by default now, or using a pref?

The crash looks weird because it dies on the strcmp call which looks perfectly valid to me.
Flags: needinfo?(jan)
> Is this happening by default now, or using a pref?
RDD is behind a pref.

I installed ASan Nightly again because I would like to provide better information about the crash
in bug 1509813 comment 1. Now I noticed that ASan Nightly doesn't spawn GPU and RDD processes at all,
even though I'm using the same profile.

$ MOZ_SANDBOX_LOGGING=1 MOZ_DISABLE_GPU_SANDBOX=1 ./firefox
*** You are running in chaos test mode. See ChaosMode.h. ***
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
[13509, Gecko_IOThread] WARNING: pipe error (39): Connection reset by peer: file /builds/worker/workspace/build/src/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 363
[13509, Gecko_IOThread] WARNING: pipe error (3): Connection reset by peer: file /builds/worker/workspace/build/src/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 363

(firefox:13509): Gtk-WARNING **: 19:45:59.913: Theme directory actions32@2x of theme breeze-dark has no size field


(firefox:13509): Gtk-WARNING **: 19:45:59.997: Theme parsing error: <data>:1:34: Expected ')' in color definition

(firefox:13509): Gtk-WARNING **: 19:45:59.997: Theme parsing error: <data>:1:77: Expected ')' in color definition
Crash Annotation GraphicsCriticalError: |[0][GFX1-]: Failed to connect GPU process (t=7.75768) [GFX1-]: Failed to connect GPU process
Flags: needinfo?(jan)
Bug 1494956 looks like the same issue, and a patch was written for it a month ago (for a private/experimental branch):
https://phabricator.services.mozilla.com/D7292

https://searchfox.org/mozilla-central/search?q=for+%28int+i+%3D+1%3B+i+%3C+aArgc%3B+i%2B%2B%29+%7B&case=true&path=
Of all of these, only ContentProcess.cpp has the following check:

    if (!aArgv[i]) {
      continue;
    }

Do you think this should be added to the other files as well?
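
To illustrate why the check matters: passing a null pointer to strcmp is undefined behavior, which matches the read from address 0 in the traces above. The following is only a minimal sketch of an argument loop with the same guard. FindFlagValue and the "-example" flag used with it are hypothetical names for illustration and are not taken from GPUProcessImpl.cpp or RDDProcessImpl.cpp.

```cpp
#include <cstring>
#include <string>

// Hypothetical stand-in for the argument loops in the child process
// Init() functions. Returns the value that follows `aFlag`, or an empty
// string if the flag is not present.
std::string FindFlagValue(int aArgc, char** aArgv, const char* aFlag) {
  for (int i = 1; i < aArgc; i++) {
    if (!aArgv[i]) {
      // Skip null entries instead of handing them to strcmp.
      continue;
    }
    if (std::strcmp(aArgv[i], aFlag) == 0 && i + 1 < aArgc && aArgv[i + 1]) {
      return aArgv[i + 1];
    }
  }
  return std::string();
}
```

With an argument list such as {"prog", nullptr, "-example", "value"}, the loop steps over the null entry and still finds the flag. Without the guard, the first strcmp call would dereference the null pointer.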
I think this is absolutely correct :) We would have to add this NULL check to RDD and GFX most likely. It would be nice to know where the NULL is actually coming from though. I don't know if it is intended, but according to bug 1494956 it seems to be expected, probably when crash reporting is disabled or something like that. In any case, those NULL checks can't hurt.

Do you want to make a patch or shall I?
I'd like to try.
Attached patch bug1477037.patch
Assignee: nobody → jan
Status: NEW → ASSIGNED
Attachment #9029103 - Flags: review?(choller)
I added it for the VR process as well because I've set dom.vr.process.enabled to true.
Comment on attachment 9029103
bug1477037.patch

Looks good to me, forwarding review to Nathan.

Nathan, this adds a NULL check for args to the startup of some processes. The ContentProcess also has this (see https://searchfox.org/mozilla-central/rev/8f0db72fb6e35414fb9a6fc88af19c69f332425f/dom/ipc/ContentProcess.cpp#133) and it was added for the experimental SocketProcess as well in bug 1494956. This should fix crashes that occur in ASan Nightly when you enable the media process or the GPU process on Linux.
Attachment #9029103 - Flags: review?(nfroyd)
Attachment #9029103 - Flags: review?(choller)
Attachment #9029103 - Flags: feedback+
Comment on attachment 9029103
bug1477037.patch

Review of attachment 9029103:
-----------------------------------------------------------------

Can you file a followup bug in Core::IPC about removing these checks?  I don't see any reason that we should be passing nullptr in the arg list...unless we're processing the nullptr at the end of argv, which indicates a logic bug someplace else.
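
For context on the second case: for the array passed to main, C and C++ guarantee that argv[argc] is a null pointer, so a loop bounded by i < aArgc only reaches a null if the count is too large or a null sits in the middle of the list. The helper below is a hypothetical diagnostic sketch (FirstNullIndex is not an existing Gecko function) that could help tell the two cases apart while investigating.

```cpp
// Hypothetical diagnostic helper, not part of the attached patch.
// Returns the index of the first null entry in aArgv[0 .. aArgc-1],
// or -1 if every counted entry is non-null.
int FirstNullIndex(int aArgc, char** aArgv) {
  for (int i = 0; i < aArgc; i++) {
    if (!aArgv[i]) {
      return i;
    }
  }
  return -1;
}
```

A result of -1 means the counted range is clean. A result equal to the number of real arguments suggests the count is one too large and the loop is reading the terminator. A smaller result points at a null placed inside the list by whoever built it.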
Attachment #9029103 - Flags: review?(nfroyd) → review+
See Also: → 1511647
Just in case you see such a report tomorrow:

      Build: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=427accad16ae157eebbe24f95f64812ada65304a&selectedJob=215102681
        STR: bug 1415020 comment 15
Pending fix: bug 1415020 comment 22

AddressSanitizer:DEADLYSIGNAL
=================================================================
==11679==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f255350e009 bp 0x7f2534b31213 sp 0x7f253698e300 T3)
==11679==The signal is caused by a READ memory access.
==11679==Hint: address points to the zero page.
    #0 0x7f255350e008 in XQueryExtension (/usr/lib/x86_64-linux-gnu/libX11.so.6+0x3a008)
    #1 0x7f2553501935 in XInitExtension (/usr/lib/x86_64-linux-gnu/libX11.so.6+0x2d935)
    #2 0x7f2551ffb0de in XextAddDisplay (/usr/lib/x86_64-linux-gnu/libXext.so.6+0xd0de)
    #3 0x7f2534ab3169  (/usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.0+0x83169)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/usr/lib/x86_64-linux-gnu/libX11.so.6+0x3a008) in XQueryExtension
Thread T3 (Compositor) created by T0 (GPU Process) here:
    #0 0x563a6d7d625d in __interceptor_pthread_create /builds/worker/workspace/moz-toolchain/src/llvm/projects/compiler-rt/lib/asan/asan_interceptors.cc:210:3
    #1 0x7f253e63f94c in CreateThread /builds/worker/workspace/build/src/ipc/chromium/src/base/platform_thread_posix.cc:123:14
    #2 0x7f253e63f94c in PlatformThread::Create(unsigned long, PlatformThread::Delegate*, unsigned long*) /builds/worker/workspace/build/src/ipc/chromium/src/base/platform_thread_posix.cc:134
    #3 0x7f253e64c7f3 in base::Thread::StartWithOptions(base::Thread::Options const&) /builds/worker/workspace/build/src/ipc/chromium/src/base/thread.cc:97:8
    #4 0x7f254019775e in CreateCompositorThread /builds/worker/workspace/build/src/gfx/layers/ipc/CompositorThread.cpp:92:26
    #5 0x7f254019775e in mozilla::layers::CompositorThreadHolder::CompositorThreadHolder() /builds/worker/workspace/build/src/gfx/layers/ipc/CompositorThread.cpp:45
    #6 0x7f2540197a52 in mozilla::layers::CompositorThreadHolder::Start() /builds/worker/workspace/build/src/gfx/layers/ipc/CompositorThread.cpp:113:33
    #7 0x7f2540461995 in mozilla::gfx::GPUParent::Init(int, char const*, MessageLoop*, IPC::Channel*) /builds/worker/workspace/build/src/gfx/ipc/GPUParent.cpp:124:3
    #8 0x7f2549940e79 in XRE_InitChildProcess(int, char**, XREChildData const*) /builds/worker/workspace/build/src/toolkit/xre/nsEmbedFunctions.cpp:726:21
    #9 0x563a6d8203c4 in content_process_main /builds/worker/workspace/build/src/browser/app/../../ipc/contentproc/plugin-container.cpp:49:28
    #10 0x563a6d8203c4 in main /builds/worker/workspace/build/src/browser/app/nsBrowserApp.cpp:265
    #11 0x7f255753cb16 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x22b16)

==11679==ABORTING
Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=2.33575) [GFX1-]: Receive IPC close with reason=AbnormalShutdown
https://hg.mozilla.org/mozilla-central/rev/427accad16ae
Status: ASSIGNED → RESOLVED
Closed: 9 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla65
GPU and RDD processes now work with ASan Nightly on Linux.
Status: RESOLVED → VERIFIED
Duplicate of this bug: 1509813
(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #19)
> GPU and RDD processes now work with ASan Nightly on Linux.

Thanks for fixing this!