[meta] LUL initialization takes too long, slows down startup profiling and content process startup
Categories
(Core :: Gecko Profiler, defect, P2)
People
(Reporter: mstange, Unassigned)
References
(Depends on 2 open bugs, Blocks 2 open bugs)
Details
(Keywords: meta, Whiteboard: [fxp])
On Linux and Android arm64, we use LUL for stackwalking. LUL's initialization takes a very long time. This manifests in the following ways:
- Manually starting the profiler for the first time causes the browser to be unresponsive for a short time.
- Content processes that start up during profiling are delayed. This is more serious with Fission.
- It's most serious when profiling Firefox startup: During startup, many processes are launched, and each of them hits the initialization overhead: parent, GPU, network, add-ons, content. With multiple processes interacting, this delay distorts the profiles; we're now profiling an unrealistic scenario.
Let's use this bug as a meta bug and file individual bugs for specific mitigations of this problem.
Reporter | Updated•4 years ago
Based on a Pernosco trace, in bug 1653473 comment 17 I believe I've shown that LUL initialization can take tens of seconds in some tests, causing them to fail intermittently. This could affect almost any test on Linux that relies on profiles from web content processes.
Reporter | Comment 2•3 years ago
Hi Julian, it looks like we're running into a fairly fundamental assumption of LUL that's turning out to be problematic: the assumption that the initial conversion time into the optimized LUL format doesn't matter, and that we only need to optimize the time it takes to walk the stack during sampling. I think there are a number of ways forward, and I'd love to hear your input on this!
I can imagine the following solutions:
- Accept that we need to do the conversion once, but try to really do it only once per library, for example by caching the LUL representation on disk (bug 1635811).
- Find ways to optimize the initial conversion.
- Switch to a model at the other end of the spectrum: do no work upfront and parse DWARF on demand during unwinding.
- Switch to a hybrid model where we gradually build up an optimized representation of the DWARF data as we encounter new functions during sampling. This would do memory allocation during sampling (but after the sampled thread has been resumed; we're already copying the sampled thread's stack into a buffer anyway).
Thoughts?
Reporter | Comment 3•3 years ago
Profile of LUL initialization on a local Firefox build: https://share.firefox.dev/3If1ekq
Comment 4•3 years ago
Work to improve this is in progress. See in particular bug 1754932.