Closed Bug 1615066 Opened 4 years ago Closed 3 years ago

Symbolication of local Android and Linux builds does not work due to out-of-memory errors

Categories

(Core :: Gecko Profiler, defect, P2)

RESOLVED FIXED
92 Branch
Tracking Status
firefox-esr78 --- wontfix
firefox90 --- wontfix
firefox91 --- wontfix
firefox92 --- fixed

People

(Reporter: mstange, Assigned: mstange)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: regression)

Attachments

(1 file, 1 obsolete file)

When profiling local builds for Android or Linux, symbolication of libxul.so often fails. That's because the file is very large and we extract symbols from it using a piece of WebAssembly.
My libxul.so for Android is 1.85GB, and a libxul.so for Linux is 2.2GB. The WebAssembly heap is limited to 2GB, so we run out of wasm memory during symbolication.

90% of this size is due to debug information. Stripping out the debug information works around the problem, but it needs to be done manually, for example with the following command:

~/.mozbuild/clang/bin/llvm-strip --no-strip-all --strip-debug /Users/mstange/code/obj-m-android-opt/toolkit/library/build/libxul.so

Bug 1392234 might raise this limit to 4GB soon. I'll wait a bit longer before I decide what to do here.

Blocks: 1616887

Julian Seward convinced me that I really want to implement a different approach, where we only read the parts of the files that we actually need. That should both be faster and use less memory.
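
To illustrate the idea of reading only the parts of the file that are needed, here is a minimal sketch using only the Rust standard library. It is not the actual profiler-get-symbols code: `read_range` and `is_elf` are hypothetical helper names, and the in-memory `Cursor` stands in for a real `std::fs::File` pointing at libxul.so.

```rust
use std::io::{self, Cursor, Read, Seek, SeekFrom};

/// Hypothetical helper: reads `len` bytes starting at `offset`,
/// without loading the rest of the source into memory.
fn read_range<R: Read + Seek>(src: &mut R, offset: u64, len: usize) -> io::Result<Vec<u8>> {
    src.seek(SeekFrom::Start(offset))?;
    let mut buf = vec![0u8; len];
    src.read_exact(&mut buf)?;
    Ok(buf)
}

/// Hypothetical helper: checks for the 4-byte ELF magic number
/// by reading only the first four bytes.
fn is_elf<R: Read + Seek>(src: &mut R) -> io::Result<bool> {
    let magic = read_range(src, 0, 4)?;
    Ok(magic == [0x7f, b'E', b'L', b'F'])
}

fn main() -> io::Result<()> {
    // Stand-in for a large binary: ELF magic, padding, then a small payload.
    let mut data = vec![0x7f, b'E', b'L', b'F'];
    data.resize(1024, 0);
    data.extend_from_slice(b"symbols");
    let total = data.len();
    let mut file = Cursor::new(data);

    let elf = is_elf(&mut file)?;
    let payload = read_range(&mut file, 1024, 7)?;
    println!("is ELF: {}", elf);
    println!("payload: {}", String::from_utf8_lossy(&payload));
    println!("read {} of {} bytes", 4 + payload.len(), total);
    Ok(())
}
```

With this shape, memory use scales with the size of the ranges that are requested rather than with the size of the whole file.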

A potential mozconfig workaround that was suggested by Mike Hommey:

export LDFLAGS=-Wl,--compress-debug-sections=zlib

(In reply to Markus Stange [:mstange] from comment #2)

where we only read the parts of the files that we actually need.

Implementing this doesn't seem straightforward because the object crate wants a slice to the entire file. This is in contrast to the DWARF information reader which has a Reader abstraction.
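
The difference between the two API shapes can be sketched roughly as follows. This is only an illustration under stated assumptions: `RangeSource`, `InMemory`, `OnDemand` and `read_u32_le` are hypothetical names, not the actual API of the object crate or of the DWARF reader. The point is that a parser written against a small trait works both with a fully loaded buffer and with a source that fetches bytes on demand.

```rust
use std::cell::RefCell;
use std::io::{self, Cursor, Read, Seek, SeekFrom};

/// Hypothetical abstraction: anything that can hand out a byte range on request.
trait RangeSource {
    fn read_at(&self, offset: u64, len: usize) -> io::Result<Vec<u8>>;
}

/// The whole file held in memory (what a slice-based API forces on callers).
struct InMemory(Vec<u8>);

impl RangeSource for InMemory {
    fn read_at(&self, offset: u64, len: usize) -> io::Result<Vec<u8>> {
        // Bounds-check in u64 so the casts below cannot go out of range.
        let end = offset
            .checked_add(len as u64)
            .filter(|&end| end <= self.0.len() as u64)
            .ok_or_else(|| io::Error::new(io::ErrorKind::UnexpectedEof, "range out of bounds"))?;
        Ok(self.0[offset as usize..end as usize].to_vec())
    }
}

/// Bytes fetched lazily from any seekable reader, such as a file.
struct OnDemand<R>(RefCell<R>);

impl<R: Read + Seek> RangeSource for OnDemand<R> {
    fn read_at(&self, offset: u64, len: usize) -> io::Result<Vec<u8>> {
        let mut inner = self.0.borrow_mut();
        inner.seek(SeekFrom::Start(offset))?;
        let mut buf = vec![0u8; len];
        inner.read_exact(&mut buf)?;
        Ok(buf)
    }
}

/// A tiny "parser" that only depends on the trait, not on how bytes are stored.
fn read_u32_le<S: RangeSource>(src: &S, offset: u64) -> io::Result<u32> {
    let b = src.read_at(offset, 4)?;
    Ok(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

fn main() -> io::Result<()> {
    let bytes = vec![0, 0, 0, 0, 0x78, 0x56, 0x34, 0x12];
    let in_memory = InMemory(bytes.clone());
    let on_demand = OnDemand(RefCell::new(Cursor::new(bytes)));
    // Both sources yield the same value for the same offset.
    assert_eq!(read_u32_le(&in_memory, 4)?, read_u32_le(&on_demand, 4)?);
    println!("value at offset 4: {:#x}", read_u32_le(&on_demand, 4)?);
    Ok(())
}
```

A parser that takes a `&[u8]` covering the entire file cannot be used this way, because the caller has to materialize every byte before parsing starts.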

I'm not advocating using lots of RAM, but for what it's worth: heaps larger than 2GB are coming, though realistically they will be operational in 2021Q2 at the earliest.

(In reply to Markus Stange [:mstange] from comment #4)

(In reply to Markus Stange [:mstange] from comment #2)

where we only read the parts of the files that we actually need.

Implementing this doesn't seem straightforward because the object crate wants a slice to the entire file. This is in contrast to the DWARF information reader which has a Reader abstraction.

I've filed gimli-rs/object#269 to discuss changing object's API.

Update: Philip Craig has started implementing the proposed API in the object crate. I have a prototype which makes use of the new API and it looks really promising.

Assignee: nobody → mstange.moz
Status: NEW → ASSIGNED
See Also: → 1704946

Bug 1711843 shows that this issue may also produce completely wrong, garbage results.

Update on our side: 4GB heaps are now a thing in Wasm on 64-bit systems (FF89), so the bug should probably be retitled, but see comment 5.

So yeah, people are still seeing this despite the 4GB heap size, and I'm not sure why. Anyway, I have the partial-read approach working locally and I'm planning to finish this work soon.

Summary: Symbolication of local Android and Linux builds does not work due to 2GB WebAssembly heap size limitation → Symbolication of local Android and Linux builds does not work due to out-of-memory errors

(In reply to Markus Stange [:mstange] from comment #11)

So yeah, people are still seeing this despite the 4GB heap size, and I'm not sure why. Anyway, I have the partial-read approach working locally and I'm planning to finish this work soon.

Last time we looked, the unstripped libxul is much bigger than 4GB actually :/

(In reply to Julien Wajsberg [:julienw] from comment #12)

(In reply to Markus Stange [:mstange] from comment #11)

So yeah, people are still seeing this despite the 4GB heap size, and I'm not sure why. Anyway, I have the partial-read approach working locally and I'm planning to finish this work soon.

Last time we looked, the unstripped libxul is much bigger than 4GB actually :/

On my local Linux build it's 2.1GB unstripped, and 227MB stripped. We might be doing multiple string copies.

Depends on: 1721109
Attachment #9227267 - Attachment description: WIP: Bug 1615066 - Add support for new profiler-get-symbols version which supports partial file reading. → Bug 1615066 - Add support for new profiler-get-symbols version which supports partial file reading. r=canaltinova
Attachment #9227266 - Attachment is obsolete: true
Blocks: 1704946
Pushed by mstange@themasta.com:
https://hg.mozilla.org/integration/autoland/rev/3c2ca00ff665
Add support for new profiler-get-symbols version which supports partial file reading. r=canaltinova
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 92 Branch

🥳🎉

Has Regression Range: --- → yes