Closed Bug 1648507 Opened 4 years ago Closed 4 years ago

nsIProfiler.PauseSampling should not prevent markers from being recorded

Categories

(Core :: Gecko Profiler, task, P2)

task

Tracking

()

RESOLVED FIXED
mozilla80
Tracking Status
firefox80 --- fixed

People

(Reporter: florian, Assigned: mozbugz)

References

Details

Attachments

(1 file, 1 obsolete file)

Currently the PauseSampling/ResumeSampling APIs pause/resume the profiler entirely, but given their name I would expect only the sampling to be paused.

This is causing issues in Talos profiles when the tests pause the sampling between the things being measured.

Example TART profile: https://share.firefox.dev/31jCnK2
This is problematic especially for markers that have a begin/end markers where only one half of the pair gets recorded. In the above profile, see the DOMEvent 'transitionend' markers that start but don't finish, and the requestAnimationFrame callback markers that finish but don't have a start time.

Currently the profiler_can_accept_marker function at https://searchfox.org/mozilla-central/rev/cfaa250d14e344834932de4c2eed0061701654da/tools/profiler/public/GeckoProfiler.h#525 returns IsActiveAndUnpaused. If we change this to IsActive, the profile contains markers during the areas of the profile where sampling was paused.
Example TART profile with this change applied: https://share.firefox.dev/3dzYKNK

This change seems straightforward, but I don't know if there'll be unintended side effects.

Severity: -- → S3
Priority: -- → P2

This is funny, bug 1578329 implemented the inverse.

Seeing the errors of our ways!

But seriously, we may actually need to be more explicit in our APIs, and have separate ways to pause sampling only, and to pause all profiling:

  • We want to pause sampling in some cases, to address comment 0.
  • We want to pause all profiling when capturing the profile, to avoid spurious extra data, and reduce overhead while capturing.

Since we already have profiler_pause() and nsIProfiler::PauseSampling(), we could complement them with profiler_pause_sampling() and nsIProfiler::Pause() and couple them correctly.

(In reply to Gerald Squelart [:gerald] (he/him) from comment #3)

  • We want to pause all profiling when capturing the profile, to avoid spurious extra data, and reduce overhead while capturing.

Alternatively we could stop profiling completely, when a profile is captured. That's how the popup UI works anyway. Also, I think the issue of capturing spurious extra data no longer exists.

(In reply to Markus Stange [:mstange] from comment #4)

Alternatively we could stop profiling completely, when a profile is captured. That's how the popup UI works anyway. Also, I think the issue of capturing spurious extra data no longer exists.

Yes, I've been thinking about that for a while, but I'm not sure whether existing code or people rely on the Gecko Profiler not stopping after capture?
(This actually helped me once, because something went wrong when capturing, and I was able to then re-capture the whole profile, including the bit that went wrong! -- But that's probably not helpful assuming the profiler can't go wrong 😛)

And I'm a bit scared of stopping the profiler before capturing, I remember seeing some subtle things that happen in profiler_stop(), that would harm the capture. I'm sure it's possible to do it, but this bug here may not be the place for it. I've filed bug 1648675 to look into it.

In the meantime, I think separating pause-sampling and pause-all would be useful, and it would clarify the mismatched APIs.

For sure the Java system will remove all data if the profiler is stopped. And I'm pretty sure we'll get a NullPointerException too when capturing after stopping.
Also take care to apply whatever behavior changes you want to do to the java code too.

Attachment #9159365 - Attachment is obsolete: true
Assignee: florian → gsquelart

The profiler can be "paused", which stops sampling, and since bug 1578329 stops markers as well.

Some test suites use pausing between tests (to better differentiate the tests, to keep the profiler ready to run, and to lower the amount of recorded data). But this causes problems with some tracing markers, as their matching ends have not been recorded (e.g., an end marker is missing), which show up as very loooong markers.
To solve this, we need to be able to pause sampling only, but keep recording markers.
But we still need to be able to pause the whole profiler, in particular before capturing, to avoid recording anything around that time.

This big patch is mostly mechanical changes: Wherever there are "Pause" and "Unpause/Resume" profiler functions, we add matching "PauseSampling" and "UnpauseSampling/ResumeSampling" functions that only impact the periodic sampling loop; And existing "Pause/Unpause/Resume" imply pausing sampling as well.
Exceptions and extra work:

  • nsIProfiler (the JS API) already had Pause/ResumeSampling(), which misleadingly paused everything! Now they do the right thing, and we have Pause/Resume() as well.
  • All tests using Pause/ResumeSampling() now use Pause/Resume(), except for Talos tests that only pause sampling between tests; Added some extra Pause() calls to pause everything before capturing profiles.
  • GeckoJavaSampler doesn't handle pausing/resuming everything, this should be done in a follow-up bug.
  • Sampling-only pauses are not streamed into JSON. If needed, we should follow-up, with potential work on the front-end to deal with these.

Depends on D81491

I also wanted to add, that anytime I've added PauseSampling calls in code I've added, I've assumed that no markers would be generated. I did a quick skim of Gerald's patch, and I think it is handling that, but I wanted to call it out. I would say that any existing calls should be use the older behavior by default, and only opt in to true "PauseSampling" behavior if it's needed.

Pushed by gsquelart@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/8fbd31f301c7 Distinguish pausing sampling only from pausing the whole profiler - r=canaltinova,perftest-reviewers,geckoview-reviewers,agi
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla80
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: