Build failure with musl libc - undefined reference to `getcontext'
Categories
(Core :: Gecko Profiler, defect, P2)
Tracking
()
Tracking | Status | |
---|---|---|
firefox76 | --- | fixed |
People
(Reporter: mforney, Assigned: mforney)
References
Details
Attachments
(3 files, 1 obsolete file)
User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:73.0) Gecko/20100101 Firefox/73.0
Steps to reproduce:
Attempt to build firefox against musl libc.
Actual results:
There was a linking error, since musl does not provide the getcontext function:
/bin/ld: ../../../tools/profiler/Unified_cpp_tools_profiler0.o: in function `Registers::SyncPopulate()':
/src/pkgsrc/www/firefox/work/firefox-73.0/tools/profiler/core/platform-linux-android.cpp:515: undefined reference to `getcontext'
Expected results:
The build should have succeeded.
One idea for a solution is to add a configure check for the getcontext symbol, and change the HAVE_NATIVE_UNWIND definition for Linux to be conditional on HAVE_GETCONTEXT. What implications would this have? My worry is that this might be too big a hammer for this issue, since the off-thread sampling via SIGPROF does work with musl, only the synchronous SyncPopulate does not.
Another idea is to use breakpad_getcontext as a fallback on systems that don't have getcontext, similar to what is done for Android.
If anyone has ideas about the right way to solve this, please let me know and I can work on a patch.
it seems as if gentoo uses this patch to solve the problem:
From: Jory A. Pratt <anarchy@gentoo.org>
getcontext is only avaliable on glibc systems
--- a/tools/profiler/core/platform-linux-android.cpp
+++ b/tools/profiler/core/platform-linux-android.cpp
@@ -497,8 +497,10 @@ static void PlatformInit(PSLockRef aLock) {}
ucontext_t sSyncUContext;
void Registers::SyncPopulate() {
+#if defined(GLIBC)
if (!getcontext(&sSyncUContext)) {
PopulateRegsFromContext(*this, &sSyncUContext);
}
+#endif
}
#endif
maybe you can contact the author via his gentoo email adress for further ideas on how to fix this?
Comment 2•5 years ago
|
||
The priority flag is not set for this bug.
:gerald, could you have a look please?
For more information, please visit auto_nag documentation.
The breakpad_getcontext
option sounds good to me, as we do use breakpad for a few things already.
Otherwise, maybe a combination of comment 0 and comment 1 would work best, i.e.: Add a configure check defining HAVE_GETCONTEXT
, and use that (instead of GLIBC
) around the getcontext()
call.
(Not a Linux user here, so I may be very wrong!)
Markus, it's more your domain, any thoughts please?
Comment 4•5 years ago
|
||
All these solutions sound fine to me, I don't have an opinion on which one should be taken.
Updated•5 years ago
|
So I've been looking at getting other breakpad musl build failures fixed (https://bugs.chromium.org/p/google-breakpad/issues/detail?id=631), and I discovered that the code in breakpad_getcontext
to save the FP registers is not correct for non-Android platforms.
The profiler does not use the FP registers (see PopulateRegsFromContext
), so it doesn't matter for this purpose, but to fix the issue in general for the crash reporter and the profiler, it is probably best to find a fix for breakpad, and then adapt the solution here.
So maybe when getcontext
is not available, we should just disable the profiler or HAVE_NATIVE_UNWIND
, or make SyncPopulate
a no-op as in the Gentoo patch. Does anyone know what the implications are of making SyncPopulate
a no-op? Does the profiler still mostly function with some missing feature, or would it just be broken?
Right now, the main issue is that the profiler can't be built on musl libc, and it can't be disabled by a configure switch. I don't think most musl distributions care about firefox missing profiler functionality, so one of these other options might be the right fix in the mean time. When the issue is solved in breakpad, we can revisit this.
Updated•5 years ago
|
Mike, can you comment about the effects of disabling SyncPopulate, as in to the Gentoo patch above?
I've made some changes to upstream breakpad to use breakpad_getcontext on other Linux systems that lack getcontext:
https://chromium.googlesource.com/breakpad/breakpad/+/e780d58fd7b6eda76065327a9921d6157e4a7964%5E%21/
So I believe all we need to do is to update breakpad to the latest master version.
Assignee | ||
Comment 10•5 years ago
|
||
This includes several fixes required to build against musl libc.
Conflicts were resolved in 00-arm-exidx-rollup.patch and
10-json-upload.patch. The others applied cleanly.
breakpad_getcontext.S is now built conditionally based upon the
available of getcontext() from libc, rather than only on Android.
The profiler was updated to reflect this change.
Comment 11•5 years ago
|
||
Thanks for tackling this Michael. I've been postponing breakpad updates because of bug 1588742 but since that's going to take a few months I'm more than happy to take this patch now. It will take me at least a couple of days to review it because I have to test it across all our platforms to ensure that the crash reports we get are what we expect. In the past we had some breakage with new breakpad versions whereas the crashes were either missing modules or caused some of our automation machinery to fail and the only way to be sure is slow, manual testing.
Oh sorry, I didn't notice the NI. Clearing it, as I see mforney updated breakpad instead.
But please ping me again if you still have questions (though :mstange would know better about your question in comment 7.)
Comment 13•5 years ago
|
||
This rolls up all the relevant client changes in the patches in this update and manually applies them to the breakpad-client directory. It needs to be included in the patch.
Updated•5 years ago
|
Updated•5 years ago
|
Comment 14•5 years ago
|
||
This patch fixes the breakage in the uploader on Windows. It applies on top of the existing patch and adjusts 09-json-upload.patch to accordingly.
Comment 15•5 years ago
|
||
Comment 16•5 years ago
|
||
bugherder |
Description
•