Closed Bug 1615713 Opened 4 years ago Closed 4 years ago

Build failure with musl libc - undefined reference to `getcontext'

Categories

(Core :: Gecko Profiler, defect, P2)

73 Branch
defect

Tracking

()

RESOLVED FIXED
mozilla76
Tracking Status
firefox76 --- fixed

People

(Reporter: mforney, Assigned: mforney)

References

Details

Attachments

(3 files, 1 obsolete file)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:73.0) Gecko/20100101 Firefox/73.0

Steps to reproduce:

Attempt to build firefox against musl libc.

Actual results:

There was a linking error, since musl does not provide the getcontext function:

/bin/ld: ../../../tools/profiler/Unified_cpp_tools_profiler0.o: in function `Registers::SyncPopulate()':
/src/pkgsrc/www/firefox/work/firefox-73.0/tools/profiler/core/platform-linux-android.cpp:515: undefined reference to `getcontext'

Expected results:

The build should have succeeded.

One idea for a solution is to add a configure check for the getcontext symbol, and change the HAVE_NATIVE_UNWIND definition for Linux to be conditional on HAVE_GETCONTEXT. What implications would this have? My worry is that this might be too big a hammer for this issue, since the off-thread sampling via SIGPROF does work with musl, only the synchronous SyncPopulate does not.

Another idea is to use breakpad_getcontext as a fallback on systems that don't have getcontext, similar to what is done for Android.

If anyone has ideas about the right way to solve this, please let me know and I can work on a patch.

it seems as if gentoo uses this patch to solve the problem:

From: Jory A. Pratt <anarchy@gentoo.org>

getcontext is only avaliable on glibc systems

--- a/tools/profiler/core/platform-linux-android.cpp
+++ b/tools/profiler/core/platform-linux-android.cpp
@@ -497,8 +497,10 @@ static void PlatformInit(PSLockRef aLock) {}
ucontext_t sSyncUContext;

void Registers::SyncPopulate() {
+#if defined(GLIBC)
if (!getcontext(&sSyncUContext)) {
PopulateRegsFromContext(*this, &sSyncUContext);
}
+#endif
}
#endif

maybe you can contact the author via his gentoo email adress for further ideas on how to fix this?

The priority flag is not set for this bug.
:gerald, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(gsquelart)

The breakpad_getcontext option sounds good to me, as we do use breakpad for a few things already.
Otherwise, maybe a combination of comment 0 and comment 1 would work best, i.e.: Add a configure check defining HAVE_GETCONTEXT, and use that (instead of GLIBC) around the getcontext() call.
(Not a Linux user here, so I may be very wrong!)

Markus, it's more your domain, any thoughts please?

Flags: needinfo?(gsquelart) → needinfo?(mstange)
Priority: -- → P2

All these solutions sound fine to me, I don't have an opinion on which one should be taken.

Flags: needinfo?(mstange)
Assignee: nobody → mforney

So I've been looking at getting other breakpad musl build failures fixed (https://bugs.chromium.org/p/google-breakpad/issues/detail?id=631), and I discovered that the code in breakpad_getcontext to save the FP registers is not correct for non-Android platforms.

The profiler does not use the FP registers (see PopulateRegsFromContext), so it doesn't matter for this purpose, but to fix the issue in general for the crash reporter and the profiler, it is probably best to find a fix for breakpad, and then adapt the solution here.

So maybe when getcontext is not available, we should just disable the profiler or HAVE_NATIVE_UNWIND, or make SyncPopulate a no-op as in the Gentoo patch. Does anyone know what the implications are of making SyncPopulate a no-op? Does the profiler still mostly function with some missing feature, or would it just be broken?

Right now, the main issue is that the profiler can't be built on musl libc, and it can't be disabled by a configure switch. I don't think most musl distributions care about firefox missing profiler functionality, so one of these other options might be the right fix in the mean time. When the issue is solved in breakpad, we can revisit this.

Attachment #9131406 - Attachment is obsolete: true

Mike, can you comment about the effects of disabling SyncPopulate, as in to the Gentoo patch above?

Flags: needinfo?(mh+mozilla)

Redirecting to Gerald.

Flags: needinfo?(mh+mozilla) → needinfo?(gsquelart)

I've made some changes to upstream breakpad to use breakpad_getcontext on other Linux systems that lack getcontext:
https://chromium.googlesource.com/breakpad/breakpad/+/e780d58fd7b6eda76065327a9921d6157e4a7964%5E%21/

So I believe all we need to do is to update breakpad to the latest master version.

This includes several fixes required to build against musl libc.

Conflicts were resolved in 00-arm-exidx-rollup.patch and
10-json-upload.patch. The others applied cleanly.

breakpad_getcontext.S is now built conditionally based upon the
available of getcontext() from libc, rather than only on Android.
The profiler was updated to reflect this change.

Thanks for tackling this Michael. I've been postponing breakpad updates because of bug 1588742 but since that's going to take a few months I'm more than happy to take this patch now. It will take me at least a couple of days to review it because I have to test it across all our platforms to ensure that the crash reports we get are what we expect. In the past we had some breakage with new breakpad versions whereas the crashes were either missing modules or caused some of our automation machinery to fail and the only way to be sure is slow, manual testing.

Oh sorry, I didn't notice the NI. Clearing it, as I see mforney updated breakpad instead.
But please ping me again if you still have questions (though :mstange would know better about your question in comment 7.)

Flags: needinfo?(gsquelart)

This rolls up all the relevant client changes in the patches in this update and manually applies them to the breakpad-client directory. It needs to be included in the patch.

Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Attachment #9133793 - Attachment description: Bug 1615713 - Update breakpad to upstream revision 23e6fbf571a6638ce9642fc1f50baf1af0f54268 → Bug 1615713 - Update breakpad to upstream revision 5bba75bfd6ec386b8e3af0b91332388a378135bf

This patch fixes the breakage in the uploader on Windows. It applies on top of the existing patch and adjusts 09-json-upload.patch to accordingly.

Pushed by cbrindusan@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/5b5137d9329c
Update breakpad to upstream revision 5bba75bfd6ec386b8e3af0b91332388a378135bf r=gsvelto
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla76
Regressions: 1623889
No longer regressions: 1623889
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: