profiles of local builds linked with --enable-linker=mold don't get symbolicated
Categories: Core :: Gecko Profiler, defect, P1
People: Reporter: emilio, Assigned: mstange
Attachments: 1 file (175.93 KB, text/plain)
https://share.firefox.dev/3uU3f12 is an example.
Reporter | Comment 1 • 2 years ago
https://crisal.io/tmp/mold-opt-libxul.so is the libxul.so for this.
Assignee | Comment 2 • 2 years ago
I think the bug might be in shared-libraries-linux.cc.
The addresses shown in the profiler for libxul.so frames are too big. For example 0x7fa3c7d97987 and 0x7fa3c7dc73be don't fit into 32 bits. These addresses are supposed to be "relative" addresses, i.e. they should be relative to the library's load address / "base address".
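To make the "relative address" idea concrete, here is a minimal Python sketch. The absolute address is one of the two quoted above; the base address is an assumed value chosen for illustration, not one taken from this profile, and the helper names are hypothetical.

```python
# Illustrative sketch: turn an absolute code address into a library-relative one.
def relative_address(absolute: int, base: int) -> int:
    """Return the offset of `absolute` from the library's base address."""
    if absolute < base:
        raise ValueError("address lies before the library's base address")
    return absolute - base


def fits_in_32_bits(value: int) -> bool:
    """True if the value can be stored in an unsigned 32-bit integer."""
    return 0 <= value < 2**32


absolute = 0x7FA3C7DC73BE      # absolute address quoted above
assumed_base = 0x7FA3BD000000  # ASSUMPTION: made-up base address for illustration

rel = relative_address(absolute, assumed_base)
print(hex(rel), fits_in_32_bits(absolute), fits_in_32_bits(rel))
```

The absolute value is far too large for 32 bits, while the offset from a plausible base fits easily, which is why oversized values look suspicious at first glance.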
Emilio, can you run the following in the Firefox-linked-with-mold build?
- On the browser console:
  Services.profiler.sharedLibraries.find(l => l.name == 'libxul.so')
- In the terminal:
  cat /proc/<firefox-parent-process-pid>/maps
and then paste both here? sharedLibrary.start should be equal to the address of the first libxul.so mapping.
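The comparison requested here can also be scripted. The following is a rough Python sketch that finds the lowest start address of a library in text copied from /proc/<pid>/maps. The sample excerpt, its path, and the function name are made up for illustration; only the general line format is taken from the kernel's maps output.

```python
from typing import Optional


def first_mapping_start(maps_text: str, library_name: str) -> Optional[int]:
    """Return the lowest start address among mappings whose path ends with library_name."""
    starts = []
    for line in maps_text.splitlines():
        fields = line.split(maxsplit=5)
        # A file-backed mapping has six fields; the last one is the path.
        if len(fields) < 6 or not fields[5].endswith("/" + library_name):
            continue
        start_hex, _, _end_hex = fields[0].partition("-")
        starts.append(int(start_hex, 16))
    return min(starts) if starts else None


# ASSUMPTION: made-up excerpt in the /proc/<pid>/maps format (placeholder values).
sample_maps = """\
7f0000000000-7f0000400000 r--p 00000000 fd:01 1234 /opt/example/libxul.so
7f0000400000-7f0000900000 r-xp 00400000 fd:01 1234 /opt/example/libxul.so
7f0000a00000-7f0000a21000 rw-p 00000000 00:00 0
"""

shared_library_start = 0x7F0000000000  # stand-in for sharedLibrary.start
print(first_mapping_start(sample_maps, "libxul.so") == shared_library_start)
```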
Reporter | Comment 3 • 2 years ago
(In reply to Markus Stange [:mstange] from comment #2)
- On the browser console:
Services.profiler.sharedLibraries.find(l => l.name == 'libxul.so')
{
  "start": 140689736204288,
  "end": 140689932096320,
  "offset": 0,
  "name": "libxul.so",
  "path": "/home/emilio/src/moz/gecko/obj-opt/dist/bin/libxul.so",
  "debugName": "libxul.so",
  "debugPath": "/home/emilio/src/moz/gecko/obj-opt/dist/bin/libxul.so",
  "breakpadId": "7A1C91BBA8ACD92CB580C1077E2631EF0",
  "arch": ""
}
- In the terminal:
cat /proc/<firefox-parent-process-pid>/maps
Will attach because not doing it makes me go over the comment character limit :)
sharedLibrary.start should be equal to the address of the first libxul.so mapping
So... 0x7ff4e1c00000 == 140689736204288, which means the start is right?
I took a profile on this same run just in case it helps: https://share.firefox.dev/3WorOyF
The addresses from libxul shown in the profiler are from before the first libxul.so mapping, so they seem off somehow? If you tell me where to look, I'm happy to try to debug this.
Reporter | Comment 4 • 2 years ago
Assignee | Comment 5 • 2 years ago
Can you attach a profile that you capture with await Services.profiler.dumpProfileToFileAsync("/home/emilio/Desktop/raw-profile.json")? (Start the profiler the usual way first.)
Reporter | Comment 6 • 2 years ago
Ah, the end of the library seems off tho, 140689932096320 is 0x7ff4ed6d1340, while the last libxul mapping is 7ff4ed59a000-7ff4ed5e8000, which is smaller than that. Not sure if that is a problem?
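The arithmetic behind this observation can be checked with a few lines of Python. The values are the ones quoted in this bug (the start and end reported by the profiler and the end of the last libxul mapping); the variable names are only for illustration.

```python
library_start = 140689736204288     # "start" reported by the profiler
library_end = 140689932096320       # "end" reported by the profiler
last_mapping_end = 0x7FF4ED5E8000   # end of the last libxul mapping quoted above

print(hex(library_start))  # 0x7ff4e1c00000
print(hex(library_end))    # 0x7ff4ed6d1340

# How far the reported end lies past the end of the last mapping, in bytes.
overshoot = library_end - last_mapping_end
print(overshoot, hex(overshoot))
```

This only quantifies the gap; whether it matters is a separate question.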
Reporter | Comment 7 • 2 years ago
Too big for bugzilla, so https://crisal.io/tmp/raw-profile-mold.json
Assignee | Comment 8 • 2 years ago
False alarm about the mappings - the ranges look fine. I forgot that unsymbolicated functions in the profiler still display the absolute addresses, not the relative addresses. Their internal relative address values look reasonable. For example, one of the hottest addresses is 0xadc73be, which symbolicates fine:
query_api % cargo run -- ~/Downloads/ /symbolicate/v5 '{"jobs":[{"memoryMap":[["mold-opt-libxul.so","36D208789F8DF932F8E647CB2BE1FAE60"]],"stacks":[[[0,182219710]]]}]}' | jq
{
  "results": [
    {
      "stacks": [
        [
          {
            "frame": 0,
            "module_offset": "0xadc73be",
            "module": "mold-opt-libxul.so",
            "function": "style::dom_apis::query_selector",
            "function_offset": "0xebe",
            "function_size": "0x171a",
            "file": "/home/emilio/src/moz/gecko/servo/components/style/dom_apis.rs",
            "line": 642,
            "inlines": [
              {
                "function": "<style::gecko::wrapper::GeckoNode as style::dom::TNode>::as_element",
                "file": "/home/emilio/src/moz/gecko/servo/components/style/gecko/wrapper.rs",
                "line": 479
              },
              {
                "function": "style::dom_apis::collect_all_elements",
                "file": "/home/emilio/src/moz/gecko/servo/components/style/dom_apis.rs",
                "line": 230
              },
              {
                "function": "style::dom_apis::query_selector_single_query",
                "file": "/home/emilio/src/moz/gecko/servo/components/style/dom_apis.rs",
                "line": 397
              },
              {
                "function": "style::dom_apis::query_selector_fast",
                "file": "/home/emilio/src/moz/gecko/servo/components/style/dom_apis.rs",
                "line": 448
              }
            ]
          }
        ]
      ],
      "found_modules": {
        "mold-opt-libxul.so/36D208789F8DF932F8E647CB2BE1FAE60": true
      }
    }
  ]
}
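For reference, the request body passed on the command line above can be assembled programmatically. This Python sketch mirrors the JSON shape shown in that command; the function name is hypothetical, and the module name, debug ID, and address are the ones from this comment.

```python
import json


def build_symbolication_request(module_name, debug_id, relative_addresses):
    """Build a request body with one module and one stack of frames."""
    module_index = 0  # index of the module within memoryMap
    return {
        "jobs": [
            {
                "memoryMap": [[module_name, debug_id]],
                "stacks": [[[module_index, addr] for addr in relative_addresses]],
            }
        ]
    }


request = build_symbolication_request(
    "mold-opt-libxul.so", "36D208789F8DF932F8E647CB2BE1FAE60", [0xADC73BE]
)
# Compact separators reproduce the single-line form used on the command line.
print(json.dumps(request, separators=(",", ":")))
```

Note that the frame address is sent as a decimal integer: 0xadc73be is 182219710.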
Assignee | Comment 9 • 2 years ago
The actual problem is a debug ID mismatch. The profile contains the value "B95396F750B7D947AAF1DC62F1771AF90", which was computed here, whereas the symbolication code computes a debug ID of "36D208789F8DF932F8E647CB2BE1FAE60".
One of the reasons for that mismatch is that the library doesn't contain an ELF build ID: llvm-readelf --notes mold-opt-libxul.so doesn't show any output.
I'm a bit surprised by this. The mold documentation specifically describes how to speed up build ID computation, in the Details section.
Anyway, we have fallback code to compute a debug ID even when no ELF build ID is present, by hashing the first 4096 bytes of the .text section. But the two implementations of this fallback behave differently: the one in mozilla-central specifically looks for the .text section, whereas the one in the profiler symbolication code takes the first section of "kind" "text", which, in mold-opt-libxul.so, happens to be the .plt section.
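The following Python sketch illustrates how the two selection strategies can diverge. It is a toy model only: the section contents are made up, the Section type and function names are hypothetical, and SHA-1 is used as a stand-in hash rather than the actual debug ID algorithm used by either implementation.

```python
import hashlib
from typing import List, NamedTuple


class Section(NamedTuple):
    name: str
    kind: str
    data: bytes


def stand_in_id(data: bytes) -> str:
    # ASSUMPTION: stand-in hash over the first 4096 bytes, NOT the real algorithm.
    return hashlib.sha1(data[:4096]).hexdigest()[:32].upper()


def id_from_named_text(sections: List[Section]) -> str:
    """Pick the section that is literally named .text."""
    text = next(s for s in sections if s.name == ".text")
    return stand_in_id(text.data)


def id_from_first_text_kind(sections: List[Section]) -> str:
    """Pick the first section whose kind is "text", whatever its name."""
    first = next(s for s in sections if s.kind == "text")
    return stand_in_id(first.data)


# Made-up layout in which .plt precedes .text, as described for mold-opt-libxul.so.
sections = [
    Section(".plt", "text", b"\xff\x25" * 64),
    Section(".text", "text", b"\x55\x48\x89\xe5" * 2048),
]
print(id_from_named_text(sections) == id_from_first_text_kind(sections))
```

When .text happens to be the first section of kind "text", both strategies agree, which would explain why the discrepancy could go unnoticed with other section layouts.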
It looks like I introduced this bug in May 2020.
Assignee | Comment 10 • 2 years ago
Fixed in https://github.com/mstange/samply/commit/b600f99e398aaa953abcaa2357068cf508a4a9a9 .
I'll leave this bug open until the wasm blob in Firefox is updated.
Assignee | Comment 11 • 2 years ago
(In reply to Markus Stange [:mstange] from comment #9)
One of the reasons for that mismatch is the fact that the library doesn't contain an ELF build ID: llvm-readelf --notes mold-opt-libxul.so doesn't show any output.
I'm a bit surprised by this. The mold documentation specifically describes how to speed up build ID computation, in the Details section.
I've filed https://github.com/rui314/mold/issues/919 on this.
Comment 12 • 2 years ago
Here is the list of command line options given to mold for building libxul.so. As you can see, no --build-id option is passed to the linker. I believe somewhere in your build system, -Wl,--build-id is appended to the linker's command line, and that code isn't executed if the linker is mold. So please check your build system.
--sysroot=/home/ruiu/.mozbuild/sysroot-x86_64-linux-gnu
-z relro
--hash-style=gnu
--eh-frame-hdr
-m elf_x86_64
-shared
-o libxul.so
/home/ruiu/.mozbuild/sysroot-x86_64-linux-gnu/usr/lib/x86_64-linux-gnu/crti.o
/home/ruiu/.mozbuild/sysroot-x86_64-linux-gnu/usr/lib/gcc/x86_64-linux-gnu/7.5.0/crtbeginS.o
-L/home/ruiu/.mozbuild/sysroot-x86_64-linux-gnu/usr/lib/x86_64-linux-gnu
-L/home/ruiu/.mozbuild/sysroot-x86_64-linux-gnu/usr/lib/gcc/x86_64-linux-gnu/7.5.0
-L/home/ruiu/.mozbuild/sysroot-x86_64-linux-gnu/lib/x86_64-linux-gnu
-L/home/ruiu/.mozbuild/sysroot-x86_64-linux-gnu/usr/lib/x86_64-linux-gnu
-L/home/ruiu/.mozbuild/sysroot-x86_64-linux-gnu/lib
-L/home/ruiu/.mozbuild/sysroot-x86_64-linux-gnu/usr/lib
-z defs
-h libxul.so
/home/ruiu/mozilla-unified/obj-x86_64-pc-linux-gnu/toolkit/library/build/libxul_so.list
-lpthread
-rpath-link
/home/ruiu/.mozbuild/sysroot-x86_64-linux-gnu/lib/x86_64-linux-gnu
-rpath-link
/home/ruiu/.mozbuild/sysroot-x86_64-linux-gnu/usr/lib/x86_64-linux-gnu
-z noexecstack
-z text
-z relro
-z nocopyreloc
-Bsymbolic-functions
-rpath-link
/home/ruiu/mozilla-unified/obj-x86_64-pc-linux-gnu/dist/bin
-rpath-link
/usr/local/lib
../../../security/nss/lib/crmf/crmf_crmf/libcrmf.a
../../../js/src/build/libjs_static.a
/home/ruiu/mozilla-unified/obj-x86_64-pc-linux-gnu/x86_64-unknown-linux-gnu/release/libgkrust.a
../../../security/sandbox/linux/libmozsandbox.so
../../../config/external/nspr/pr/libnspr4.so
../../../config/external/nspr/libc/libplc4.so
../../../config/external/nspr/ds/libplds4.so
../../../config/external/lgpllibs/liblgpllibs.so
../../../security/nss/lib/nss/nss_nss3/libnss3.so
../../../security/nss/lib/util/util_nssutil3/libnssutil3.so
../../../security/nss/lib/smime/smime_smime3/libsmime3.so
../../../config/external/sqlite/libmozsqlite3.so
../../../security/nss/lib/ssl/ssl_ssl3/libssl3.so
../../../widget/gtk/mozgtk/libmozgtk.so
../../../widget/gtk/mozwayland/libmozwayland.so
--version-script
symverscript
-ldl
-lasound
-lrt
-lm
-ldl
-lX11
-lXcomposite
-lXdamage
-lXext
-lXfixes
-lXrandr
-lXrender
-lXtst
-lpthread
-lc
-lfreetype
-lfontconfig
-lgtk-3
-lgdk-3
-lpangocairo-1.0
-lpango-1.0
-latk-1.0
-lcairo-gobject
-lcairo
-lgdk_pixbuf-2.0
-lgio-2.0
-lgobject-2.0
-lglib-2.0
-ldbus-glib-1
-ldbus-1
-lxcb-shm
-lX11-xcb
-lxcb
-lXcursor
-lXi
-lstdc++
-lm
-lgcc_s
-lgcc
-lpthread
-lc
-lgcc_s
-lgcc
/home/ruiu/.mozbuild/sysroot-x86_64-linux-gnu/usr/lib/gcc/x86_64-linux-gnu/7.5.0/crtendS.o
/home/ruiu/.mozbuild/sysroot-x86_64-linux-gnu/usr/lib/x86_64-linux-gnu/crtn.o
Reporter | Comment 13 • 2 years ago
It seems that since bug 1796518, --build-id=sha1 is no longer added to local builds.
Comment 14 • 2 years ago
--build-id=sha1 isn't that slow with mold, so I recommend always appending that option if the linker is mold.
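As a rough illustration of that recommendation, a build script could make sure the option is present before invoking the linker. This Python sketch is hypothetical and is not the Firefox build system's actual code; the function name and the shortened argument list are assumptions.

```python
from typing import List


def ensure_build_id(linker_args: List[str], style: str = "sha1") -> List[str]:
    """Return a copy of linker_args with --build-id=<style> appended if absent."""
    has_build_id = any(
        arg == "--build-id" or arg.startswith("--build-id=") for arg in linker_args
    )
    if has_build_id:
        return list(linker_args)
    return list(linker_args) + [f"--build-id={style}"]


# Shortened, illustrative subset of the linker arguments listed in comment 12.
args = ["-z", "relro", "--hash-style=gnu", "--eh-frame-hdr", "-shared", "-o", "libxul.so"]
print(ensure_build_id(args))
```

Leaving an existing --build-id option untouched avoids overriding a style that was chosen deliberately elsewhere in the build.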
Assignee | Comment 15 • 2 years ago
Thanks for checking! I've filed bug 1806470 on this.
Comment 16
Hey Markus, this bug is fixed after bug 1808982, right? I guess we can close this bug as fixed as well.
Assignee | Comment 17 • 2 years ago
Oh, yes, thanks. It's even double-fixed, with bug 1806470 being fixed as well.