Closed Bug 1777431 Opened 2 years ago Closed 2 years ago

Some profiles miss the beginning of the parent process profile when child processes are not running out of buffer space

Categories

(Core :: Gecko Profiler, defect, P2)

defect

Tracking

()

RESOLVED FIXED
105 Branch
Tracking Status
firefox-esr91 --- unaffected
firefox-esr102 --- wontfix
firefox103 --- wontfix
firefox104 --- wontfix
firefox105 --- fixed

People

(Reporter: florian, Assigned: mozbugz)

References

(Regression)

Details

(Keywords: regression)

Attachments

(1 file)

One potential culprit (thanks for the idea Florian) is that we're using chunk-release TimeStamps from each process to sort them in time order, but these TimeStamps may be relative to their own process, and therefore would make the parent process look older, which would destroy the parent process data first once the overall memory usage hits the limit.

To be debugged...

Severity: -- → S2
Priority: -- → P2

After some debugging:
The timestamps are all absolute, so there is no problem there.

I did find that the logic that keeps the total profile buffer memory in check was not even started! So chunk updates from child processes were ignored, and these processes were allowed to grow to their local limit, which could explain recent "big profile" issues like bug 1758643.
I'm not sure yet if this missing logic is the cause of the shorter-parent problem. So I'll split the missing logic issue to its own bug, because it needs to be fixed anyway, and I'll keep debugging the issue here...

Since bug 1777890 landed (which properly releases old chunks in all processes), I haven't been able to reproduce this issue.
Florian, when possible could you please use your superpowers to see if you still experience this issue?

Flags: needinfo?(florian)

I captured https://share.firefox.dev/3nUq0hS today on a local build based off of a mozilla central from Fri Jul 08 12:13:39 2022 +0300, so I think it should contain your patch already. Unfortunately that profile reproduced the bug :-/.

Flags: needinfo?(florian)

Here is a profile of tests running on try, the profiler is started with MOZ_PROFILER_STARTUP in the environment. https://share.firefox.dev/3O37TRe The first 260s of the profile have data for the child processes but not the parent.

I believe I've found the cause:
When using MOZ_PROFILER_STARTUP=1 without an explicit buffer size limit, the base profiler uses its own default, which is lower than the Gecko Profiler's! (32MiB instead of 512MiB)
And since bug 1753192, the Gecko Profiler takes over the Base Profiler's chunk manager if it was active, and that chunk manager carries the local buffer size limit, so the Gecko Profiler ignores whatever limit it would have used (like its higher default).

Child processes were not affected, because they are always given an explicit buffer limit, which is based on what the Gecko Profiler would have used.
That's why we ended up with a shorter parent process profile.

The fix is trivial: Use the same defaults everywhere!

Assignee: nobody → gsquelart
Regressed by: 1753192

Set release status flags based on info from the regressing bug 1753192

The BASE_PROFILER_DEFAULT_...ENTRIES constants in BaseProfiler.h were smaller
than those in ProfilerControl.h, leading to a shorter profiling range in the
parent process!
Now these constants and some other shared ones are only defined in
BaseProfiler.h, and reused in ProfilerControl.h.

Pushed by gsquelart@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0da07ca1b2db
Use the same default constants in both profilers - r=florian
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 105 Branch

The patch landed in nightly and beta is affected.
:gerald, is this bug important enough to require an uplift?

  • If yes, please nominate the patch for beta approval.
  • If no, please set status-firefox104 to wontfix.

For more information, please visit auto_nag documentation.

Flags: needinfo?(gsquelart)

I don't think it's important enough to uplift.

Flags: needinfo?(gsquelart)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: