Crash in [@ mozilla::gmp::GMPChild::RecvPreloadLibs]
Categories
(Core :: Audio/Video: GMP, defect, P3)
People
(Reporter: gsvelto, Unassigned)
Details
(Keywords: crash)
Attachments
(1 file)
Crash report: https://crash-stats.mozilla.org/report/index/8d617ac1-55fc-43df-ad1e-94c9c0210308
MOZ_CRASH Reason: MOZ_CRASH(Couldn't load lib needed by NSS)
Top 10 frames of crashing thread:
0 libxul.so mozilla::gmp::GMPChild::RecvPreloadLibs [clone .cold]
1 libxul.so mozilla::gmp::PGMPChild::OnMessageReceived build-browser/ipc/ipdl/build-browser/ipc/ipdl/PGMPChild.cpp:500
2 libxul.so mozilla::ipc::MessageChannel::DispatchAsyncMessage
3 libxul.so mozilla::ipc::MessageChannel::DispatchMessage
4 libxul.so mozilla::ipc::MessageChannel::MessageTask::Run
5 libxul.so MessageLoop::RunTask build-browser/ipc/chromium/ipc/chromium/src/base/message_loop.cc:465
6 libxul.so MessageLoop::DoWork [clone .part.0] build-browser/ipc/chromium/ipc/chromium/src/base/message_loop.cc:548
7 libxul.so base::MessagePumpDefault::Run
8 libxul.so MessageLoop::Run build-browser/ipc/chromium/ipc/chromium/src/base/message_loop.cc:309
9 libxul.so XRE_InitChildProcess
This is a Linux-specific issue, and it seems that we're crashing here. Don't be fooled by the elevated volume in ESR: that's caused by the fact that we don't throttle crash report processing coming from the ESR channel. Crashes coming from release should be ~10 times higher than the numbers show.
This is the code that runs to load libs before the sandbox goes up. The sender should just be sending through the libs[0] we have on the allow list in the code we're crashing in. I.e. this code is trying to load "libfreeblpriv3.so", "libsoftokn3.so" and is failing to do so, resulting in this crash. Specifically, our dlopen call[1] is not returning a lib. I'm not sure why this would happen. We could add some handling here with dlerror, though since that returns a string I'm not sure of the best way to link it to crash reports (we could search the string for known phrases and use different MOZ_CRASHs to differentiate?).
[0] https://searchfox.org/mozilla-central/rev/526a5089c61db85d4d43eb0e46edaf1f632e853a/dom/media/gmp/GMPParent.cpp#856
[1] https://searchfox.org/mozilla-central/rev/526a5089c61db85d4d43eb0e46edaf1f632e853a/dom/media/gmp/GMPChild.cpp#233
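The "search the string for known phrases" idea can be sketched roughly as follows. This is an illustration only, not the patch that landed: the bucket names are hypothetical, and the phrase list is a small, non-exhaustive set of substrings that glibc's loader is known to put in dlerror messages.

```python
from typing import Optional

# Hypothetical bucket names. The phrases are substrings of messages that
# glibc's dynamic loader emits; the list is deliberately not exhaustive.
KNOWN_PHRASES = [
    ("No such file or directory", "lib_not_found"),
    ("Permission denied", "lib_permission_denied"),
    ("wrong ELF class", "lib_wrong_arch"),
    ("undefined symbol", "lib_undefined_symbol"),
]


def classify_dlerror(message: Optional[str]) -> str:
    """Map a dlerror-style string to a coarse bucket for crash triage."""
    if not message:
        # dlerror can return no string at all if nothing was recorded.
        return "lib_no_error_string"
    for phrase, bucket in KNOWN_PHRASES:
        if phrase in message:
            return bucket
    return "lib_unknown_error"


print(classify_dlerror("libfoo.so: cannot open shared object file: Permission denied"))
```

Each bucket could then map to a distinct crash reason, at the cost of having to maintain the phrase list as loader messages change.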
Reporter | ||
Comment 2•3 years ago
You could add a note to the crash report using AppendAppNotesToCrashReport(). You'd need to inspect the crashes manually afterwards but it usually does the job if you don't want to add a dedicated crash annotation (and if it's something you plan on removing after you've solved the issue).
Neat, I wasn't aware of that. Will cook up a patch.
Add a temporary note to this crash path to help diagnose why lib loads are failing.
Updated•3 years ago
Pushed by bvandyk@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/ac3a26e86df0 Add note to GMP lib load crash to diagnose reason. r=gsvelto
Comment 6•3 years ago
bugherder |
Comment 7•3 years ago
The leave-open keyword is there and there is no activity for 6 months.
:bryce, maybe it's time to close this bug?
libfreeblpriv3.so: cannot open shared object file: No such file or directory
Looks like our error. I assume libfreeblpriv3.so should always be packaged and present, so I don't know why it wouldn't be. Busted install? OS security preventing GMP from seeing the file? Not sure where to start. :jld, any ideas?
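For anyone wanting to see what such a loader message looks like on their own machine, here is a minimal sketch. It relies on Python's ctypes, which on Linux goes through dlopen and reports the loader's error text in the raised OSError. The library name is hypothetical and chosen so that it should not exist; the exact wording of the message depends on the platform.

```python
import ctypes
from typing import Optional


def try_dlopen(name: str) -> Optional[str]:
    """Return None if the library loads, otherwise the loader's error text."""
    try:
        ctypes.CDLL(name)
    except OSError as exc:
        return str(exc)
    return None


# Hypothetical name, picked so that no such library should be installed.
error = try_dlopen("libthis_does_not_exist_example.so")
print(error)
```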
Comment 9•3 years ago
The only thing I can think of is: GMP is the only remaining normal process type that still uses the plugin-container executable instead of running firefox with a special flag. (Bug 1114647 and related; this is where it's currently specified in the source. Originally all child processes used plugin-container, which was confusing.) I wonder if there's some OS-level security policy that's allowing only the firefox executable to see those libraries for some reason, although it would have to have allowed libxul.so and a few others to be loaded before getting to this point, so that's still a little confusing. Relatedly, these are probably Mozilla's builds (no distribution ID in the telemetry environment in the crashes I looked at), so they may be downloaded into somewhere in a user's home directory, which might be part of why a security policy would block things.
I also notice that the crashes are all from Debian or Debian-based distros (and an unusually large number from Kali), but regular Debian doesn't have any problems with this, so I don't know what the connection is there.
(Incidentally, it's not clear if there was ever any technical requirement to continue using plugin-container for plugins on Linux; I vaguely recall that it was needed for Flash on Windows for annoying reasons. So if its existence is causing problems, it's possible we could just get rid of it.)
Reporter | ||
Comment 10•3 years ago
(In reply to Jed Davis [:jld] ⟨⏰|UTC-6⟩ ⟦he/him⟧ from comment #9)
Relatedly, these are probably Mozilla's builds (no distribution ID in the telemetry environment in the crashes I looked at), so they may be downloaded into somewhere in a user's home directory, which might be part of why a security policy would block things.
I think I can find that out for you by looking at the crashes. In the past we found some crazy issues related to that, including a guy who had it installed under /root.
Reporter | ||
Comment 11•3 years ago
FYI I opened a bunch of crashes and Firefox seems to be installed where it's supposed to be installed by the package manager (under /usr).
Comment 12•3 years ago
(In reply to Gabriele Svelto [:gsvelto] from comment #11)
FYI I opened a bunch of crashes and Firefox seems to be installed where it's supposed to be installed by the package manager (under /usr).
If it's a downstream build, then I have another idea: they typically use the distro's packages for dependencies like NSS. And, looking at Debian's libnss3 package, the internal libraries like libfreebl3.so are in a different directory:
/usr/lib/x86_64-linux-gnu/libnss3.so
/usr/lib/x86_64-linux-gnu/libnssutil3.so
/usr/lib/x86_64-linux-gnu/libsmime3.so
/usr/lib/x86_64-linux-gnu/libssl3.so
/usr/lib/x86_64-linux-gnu/nss/libfreebl3.so
/usr/lib/x86_64-linux-gnu/nss/libfreeblpriv3.so
/usr/lib/x86_64-linux-gnu/nss/libnssckbi.so
/usr/lib/x86_64-linux-gnu/nss/libnssdbm3.so
/usr/lib/x86_64-linux-gnu/nss/libsoftokn3.so
And it looks like Debian's builds don't set distributionId (I'd misremembered that they did).
So now I have STR: using Debian's build of Firefox, load https://cpearce.github.io/mse-eme/. Example: bp-12fcc0d1-0464-4268-a2bc-3ab930210917
Also, a workaround: LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu/nss (adjust as needed for arch to avoid the mistake seen in bp-b90fd437-7f52-4406-bccc-9305b0210917)
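To illustrate the "adjust as needed for arch" part, here is a small sketch that builds the workaround value from the machine type. The mapping covers only a few common Debian multiarch directory names and is an assumption for illustration; the helper names are hypothetical, and checking the actual directory on the affected system remains the reliable approach.

```python
import platform

# A few Debian multiarch directory names keyed by `uname -m` style machine
# strings. Illustrative and not exhaustive.
MULTIARCH = {
    "x86_64": "x86_64-linux-gnu",
    "aarch64": "aarch64-linux-gnu",
    "i686": "i386-linux-gnu",
    "armv7l": "arm-linux-gnueabihf",
}


def debian_nss_dir(machine: str) -> str:
    """Directory where Debian's libnss3 package keeps the internal NSS libs."""
    return f"/usr/lib/{MULTIARCH[machine]}/nss"


def workaround_env(machine: str) -> str:
    """Environment assignment to put in front of the firefox command."""
    return f"LD_LIBRARY_PATH={debian_nss_dir(machine)}"


machine = platform.machine()
if machine in MULTIARCH:
    print(workaround_env(machine))
else:
    print(f"no known multiarch directory for {machine!r}")
```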
What I don't know yet is how to fix this properly. If we could start NSS and get it to load the libraries (before starting the sandbox, but after we know the plugin is clearkey), that would hopefully be easier than trying to guess the paths.
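For comparison, "trying to guess the paths" would look something like the sketch below: probe a list of candidate directories and take the first one that contains the library. The function name and the candidate list are hypothetical. The weakness is that every distro layout has to be anticipated in advance, which is why letting NSS load its own libraries looks more attractive.

```python
import os
import tempfile
from typing import Iterable, Optional


def find_library(name: str, candidate_dirs: Iterable[str]) -> Optional[str]:
    """Return the first existing path for `name` among the candidates."""
    for directory in candidate_dirs:
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            return path
    return None


# Demonstration against a throwaway directory tree that mimics the Debian
# layout, where the internal libraries sit in an "nss" subdirectory.
with tempfile.TemporaryDirectory() as root:
    nss_dir = os.path.join(root, "nss")
    os.mkdir(nss_dir)
    open(os.path.join(nss_dir, "libfreeblpriv3.so"), "w").close()

    found = find_library("libfreeblpriv3.so", [root, nss_dir])
    found_in_nss_subdir = found == os.path.join(nss_dir, "libfreeblpriv3.so")
    missing = find_library("libsoftokn3.so", [root, nss_dir])

print(found_in_nss_subdir, missing)
```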
Comment 13•3 years ago
I just had the same crash and I'm using Kali Linux with Firefox ESR installed from the distro's repository.
https://crash-stats.mozilla.org/report/index/932c1f77-58d2-49bc-85a7-ea4d40210923#tab-details
Comment 14•3 years ago
The library directory change appears to be part of this Debian patch to NSS. glandium, as the author of that patch, do you think we should try to change how we preload NSS for the clearkey EME plugin, or would it make more sense for Debian to carry a patch for Firefox to extend the sandbox policy, given that Debian's packaging system should know the exact library paths?
Comment 15•3 years ago
On one hand, if LD_LIBRARY_PATH works around it, there shouldn't be a need to extend the sandbox policy, only to change the preloading in GMP*, which would be a Debian-specific thing.
On the other hand, shouldn't the preloading use NSS's ways, rather than dlopen? (especially in the light of possibly linking NSS entirely statically, which I suppose is still in the domain of possibilities some day)
Comment 16•2 years ago
The leave-open keyword is there and there is no activity for 6 months.
:bryce, maybe it's time to close this bug?
For more information, please visit auto_nag documentation.
Comment 17•2 years ago
Unassigning bugs assigned to Bryce because he no longer works at Mozilla.
Comment 18•4 months ago
Closing because no crashes reported for 12 weeks.
Updated•4 months ago