Closed Bug 1937916 Opened 1 year ago Closed 1 year ago

libmegazord symbols missing in simpleperf profiles due to missing ELF build ID

Categories

(Application Services :: General, enhancement)

Tracking

(firefox136 fixed)

RESOLVED FIXED
136 Branch

People

(Reporter: mstange, Assigned: bdk)

References

Details

(Whiteboard: [disco-])

Example: https://share.firefox.dev/3DilUe5

Thanks to bug 1889982 and bug 1921532 we now have full symbols for libmegazord.so on the Mozilla symbol server.

However, when profiling Fenix with simpleperf, libmegazord symbols are still missing, for example in this profile: https://share.firefox.dev/3DilUe5
The yellow boxes with hex addresses should have function names.

This is because the libmegazord.so file does not contain an ELF build ID. Without the build ID, we do not know where to look for the uploaded symbols.

The breakpad IDs which are used at the moment, for example 683823E165AC73382E64E0498E1266E80 for the 133.0.1 aarch64 build (aar file with binaries, aarch64 sym file), are generated by fallback code inside dump_syms based on the file contents. The crash reporter has similar fallback code, which is why our crash reports can successfully resolve libmegazord symbols.
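For context, here is a rough sketch of how a Breakpad-style ID is conventionally derived when a build ID does exist. It assumes the usual Breakpad convention for little-endian targets: take the first 16 bytes of the build ID, byte-swap the first three GUID fields, and append an age of 0. The function name is hypothetical, and the zero-padding of short build IDs is an assumption rather than something confirmed in this bug.

```python
def breakpad_id_from_build_id(build_id_hex):
    """Convert an ELF build ID (hex string) to a Breakpad-style debug ID.

    Assumption: Breakpad GUID convention for little-endian targets.
    """
    raw = bytes.fromhex(build_id_hex)
    # Use the first 16 bytes; pad with zeros if the build ID is shorter
    # (assumed behavior).
    guid = raw[:16].ljust(16, b"\0")
    # Swap the 4-byte, 2-byte and 2-byte GUID fields; keep the last 8 bytes.
    swapped = guid[3::-1] + guid[5:3:-1] + guid[7:5:-1] + guid[8:16]
    # The trailing "0" is the age field, which is always 0 for ELF files.
    return swapped.hex().upper() + "0"
```

The result has the same 33-character shape as the ID quoted above (32 hex digits plus a trailing 0), which is the key used to look up the sym file on the symbol server.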

However, simpleperf has no equivalent fallback code.
(And samply import can't hash the file itself, because it runs on the host rather than on the Android device and so has no access to the binary. Simpleperf might copy the binary into its binary cache, but samply wouldn't know which file in the cache to look at, again because there is no build ID to match it against.)

So we really want build IDs in the libmegazord.so files.

To verify whether an .so file contains an ELF build ID, you can run the following:

~/.mozbuild/clang/bin/llvm-readelf --notes libmegazord.so

If a build ID is present, there will be an NT_GNU_BUILD_ID entry.
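If llvm-readelf is not at hand, the same check can be approximated with a small script. The following is a minimal sketch, not a replacement for readelf: it walks the PT_NOTE program headers and looks for a note named "GNU" with type NT_GNU_BUILD_ID (3). The function name find_build_id is made up for illustration, and the parser assumes 4-byte note alignment, which is the usual case for the build ID note.

```python
import struct

PT_NOTE = 4
NT_GNU_BUILD_ID = 3


def find_build_id(data):
    """Return the GNU build ID of an ELF image as a hex string, or None."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = data[4] == 2                    # EI_CLASS: 1 = ELF32, 2 = ELF64
    endian = "<" if data[5] == 1 else ">"   # EI_DATA: 1 = little-endian
    if is_64:
        (e_phoff,) = struct.unpack_from(endian + "Q", data, 0x20)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 0x36)
    else:
        (e_phoff,) = struct.unpack_from(endian + "I", data, 0x1C)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 0x2A)

    for i in range(e_phnum):
        ph = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from(endian + "I", data, ph)
        if p_type != PT_NOTE:
            continue
        # p_offset and p_filesz sit at different offsets in ELF32 and ELF64.
        if is_64:
            (p_offset,) = struct.unpack_from(endian + "Q", data, ph + 8)
            (p_filesz,) = struct.unpack_from(endian + "Q", data, ph + 32)
        else:
            (p_offset,) = struct.unpack_from(endian + "I", data, ph + 4)
            (p_filesz,) = struct.unpack_from(endian + "I", data, ph + 16)
        segment = data[p_offset:p_offset + p_filesz]

        # Each note is: namesz, descsz, type, then name and desc,
        # each padded to a 4-byte boundary (assumed alignment).
        pos = 0
        while pos + 12 <= len(segment):
            namesz, descsz, ntype = struct.unpack_from(endian + "III", segment, pos)
            pos += 12
            name = segment[pos:pos + namesz]
            pos += (namesz + 3) & ~3
            desc = segment[pos:pos + descsz]
            pos += (descsz + 3) & ~3
            if ntype == NT_GNU_BUILD_ID and name == b"GNU\0":
                return desc.hex()
    return None
```

For example, find_build_id(open("libmegazord.so", "rb").read()) should return None for a library linked without --build-id and a hex string otherwise.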


I believe we just need to include --build-id in the linker flags.

I'm not 100% sure where these flags are set, but I think it's here in the rust-android-gradle plugin:

https://github.com/mozilla/rust-android-gradle/blob/c24dfbda14fe534527f6d2b93c7bfbda45e17032/plugin/src/main/kotlin/com/nishtahir/CargoBuildTask.kt#L231

                    environment("RUST_ANDROID_GRADLE_CC_LINK_ARG", "-Wl,-soname,lib${cargoExtension.libname!!}.so")

I think this just needs to be changed to:

                    environment("RUST_ANDROID_GRADLE_CC_LINK_ARG", "-Wl,--build-id,-soname,lib${cargoExtension.libname!!}.so")

To check whether the fix is successful, generate a new full-megazord-<version>.aar, unpack it, run llvm-readelf --notes on the file at jni/arm64-v8a/libmegazord.so, and check that there is an entry for NT_GNU_BUILD_ID.
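The unpacking step can be scripted, since an .aar is a zip archive. This is a minimal sketch that pulls libmegazord.so for every ABI out of the archive so that llvm-readelf --notes can be run on each one. The function name and the default library name are illustrative assumptions; the jni/<abi>/ layout follows the path mentioned above.

```python
import zipfile
from pathlib import Path


def extract_megazord_libs(aar_path, dest_dir, libname="libmegazord.so"):
    """Extract jni/<abi>/<libname> for every ABI in the .aar.

    Returns the extracted file paths, sorted.
    """
    extracted = []
    with zipfile.ZipFile(aar_path) as aar:
        for member in aar.namelist():
            parts = member.split("/")
            # Match exactly jni/<abi>/<libname>, skipping other libraries.
            if len(parts) == 3 and parts[0] == "jni" and parts[2] == libname:
                extracted.append(Path(aar.extract(member, dest_dir)))
    return sorted(extracted)
```

Checking every ABI rather than only arm64-v8a helps confirm the linker flag took effect for all targets.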

Assignee: nobody → bdeankawamura
Whiteboard: [disco-]
Status: NEW → RESOLVED
Closed: 1 year ago
Flags: qe-verify+
Resolution: --- → FIXED
Target Milestone: --- → 136 Branch

This worked!

Here's a simpleperf profile from a recent Nightly: https://share.firefox.dev/3E4QNTW
The autofill::db frames on the left side were previously not symbolicated, and now they have symbols.

Thanks!

Wow, that's really awesome to see. Glad everything worked and thanks for putting this together.
