Closed Bug 1823730 Opened 2 years ago Closed 1 year ago

Implement new methodology for profiling benchmark tests in CI

Categories

(Testing :: Raptor, task, P1)

Default
task

Tracking

(firefox114 fixed)

RESOLVED FIXED
114 Branch
Tracking Status
firefox114 --- fixed

People

(Reporter: denispal, Assigned: kshampur)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [fxp][sp3])

Attachments

(1 file)

The current profiling methodology in CI expects a page load and is not well suited for benchmark tests. This often causes useless profiles for benchmark tests such as speedometer. Instead, I think we need a new methodology in which the test itself needs to start and stop the profiler during the benchmark measurement.

The general idea would be something like this:

  1. Implement a start/stop profiler command in browsertime.
  2. Pass a boolean to the benchmark test script to indicate if profiling or not. It can be passed on the command line with --browsertime.profiling true (see https://www.sitespeed.io/documentation/sitespeed.io/scripting/#pass-your-own-options-to-your-script)
  3. Modify the test script to start and stop the profiler manually. In speedometer, for example, we should probably switch to the interactive runner and profile each subtest. This would create a lot of profiles, but ensure we don't run out of profiler buffer space.
Severity: -- → S3
Priority: -- → P2
Whiteboard: [fxp]
Component: Performance → Raptor
Whiteboard: [fxp] → [fxp][sp3]

Something like this sort of seems to be working locally https://github.com/92kns/browsertime/commit/0c50a6c5a5bb79b9581a78e7e92638ac94f934e8
e.g. was able to profile an entire benchmark run depending where I placed my start/stop in our benchmark scripts.

example try https://treeherder.mozilla.org/jobs?repo=try&revision=58ddfb1cdcaa2687ba7d07a728f7cbcce113d7eb&selectedTaskRun=RD6A5cMVTOCBGJ_i7_AH9w.0

here is a permalink of the sp3 linux profile in the browsertime-results.tgz artifact. https://share.firefox.dev/3nDGJc1

The profiler-*.zip isnt in the artifacts itself so I need to double check why that isnt uploading

Ok here is a new Try run, profiler links show up in the artifacts
https://treeherder.mozilla.org/jobs?repo=try&revision=ebe9ebce4b10da3408f66fca8fc13e49460ff8d7

hi Denis, is this close to what you envisioned?

for something like this in browsertime you could put browserProfiler.start() here and stop() after here (or wherever, and similarly for S3)

the flag that we could pass would be something like this geckoProfilerCustom into our browsertime options variable here

Flags: needinfo?(dpalmeiro)

(In reply to Kash Shampur [:kshampur] ⌚EST from comment #3)

hi Denis, is this close to what you envisioned?

for something like this in browsertime you could put browserProfiler.start() here and stop() after here (or wherever, and similarly for S3)

the flag that we could pass would be something like this geckoProfilerCustom into our browsertime options variable here

This looks great, thanks Kash! A couple of minor notes:

  • I don't think you need firefoxConfig.geckoProfilerCustom. For the profiling flag, instead of adding a new option, you can just use --browsertime.profiling or something like you do for --browsertime.url and consume it here. I would assume it's undefined if not passed in.
  • I can also see from your posted profile above that we run out of buffer space in the profile. The default buffer size seems to be 100MB. You may try increasing the buffer size to 500MB or higher to see if we can capture the entire benchmark run.
  • Maybe throw an error if the user specifies --firefox.geckoProfiler and then tries to turn it on manually in the script while it's running.
  • Maybe consider renaming browserProfiler in the browsertime patch to geckoProfiler unless you're thinking of also adding in chrome support which would be cool. In that case, maybe something like commands.profiler.start()/commands.profiler.stop() makes more sense?
  • You'll also probably have to take care of the index, url parameters under the hood as that doesn't make much sense to me as a user calling profiler start/stop.
Flags: needinfo?(dpalmeiro)

Thanks for the feedback!

actually I made a mistake in my previous comment, I do pass something similar to --browsertime.profiling (well actually I call it exposed_gp as a placeholder, but not the important point) boolean just as you described, and geckoProfilerCustom is set here instead of geckoProfiler, when the test is a benchmark.

If I understood your point about the index, url , I did have an Initial prototype where I automatically get the index/url within the browsertime geckoprofiler stop function so that way the user calls stop() rather than stop(index, url) - I will go back to that implementation

Maybe consider renaming browserProfiler in the browsertime patch to geckoProfiler unless you're thinking of also adding in chrome support which would be cool. In that case, maybe something like commands.profiler.start()/commands.profiler.stop() makes more sense

Yeah, the intention was to keep it generic as Peter sort of alluded to having this for Chrome trace. And I imagine it will be useful when we get around to running Trace in our CI e.g. https://mozilla-hub.atlassian.net/browse/FXP-2438 (but this would be a future browsertime patch)
anyway browserProfiler -> profiler is a reasonable change, I can add that

Maybe throw an error if the user specifies --firefox.geckoProfiler and then tries to turn it on manually in the script while it's running.

Yea I was unsure as of yet if this should be handled on our script side or browsertime side

Anyway I will apply some/all of these changes and report back (buffer, paramaters, naming convention, etc)

Added some of the changes on the github side here

and here is a corresponding Try
This uses commands.profiler.start()/stop() (without having to pass index/url anymore), and the buffer we can set here ourselves. In this try I set it to 5 times the default from your link so it should be about 500mb

Maybe throw an error if the user specifies --firefox.geckoProfiler and then tries to turn it on manually in the script while it's running.

as mentioned in my previous comment unsure if we should handle on our side or in btime. Going to reach out to Peter for his thoughts on both this and the patch so far

Main thing I wanted to ask: Denis, does the profile look as you expect with the now increased buffer size?

Flags: needinfo?(dpalmeiro)

Yes, that profile does look a lot better!

Flags: needinfo?(dpalmeiro)

https://github.com/sitespeedio/browsertime/pull/1934 has been merged so we can begin working on the m-c side of this (though technically it is already been worked on for the Trys above)

Assignee: nobody → kshampur
Status: NEW → ASSIGNED
Priority: P2 → P1

Previously, the logic for profling raptor tests was intended for browsertime pageload tests. The profiles for benchmark tests were not that useful.
This patch uses a new command in browsertime which makes use of the exposed geckoprofiler start/stop commands to manually choose when to start and stop browsertime through our own custom scripts.

Attachment #9329109 - Attachment description: WIP: Bug 1823730 - Improve profling for raptor-browsertime benchmark tests. r?#perftest → Bug 1823730 - Improve profling for raptor-browsertime benchmark tests. r?#perftest
Blocks: 1829809
Blocks: 1829810
Attachment #9329109 - Attachment description: Bug 1823730 - Improve profling for raptor-browsertime benchmark tests. r?#perftest → WIP: Bug 1823730 - Improve profling for raptor-browsertime benchmark tests. r?#perftest
Attachment #9329109 - Attachment description: WIP: Bug 1823730 - Improve profling for raptor-browsertime benchmark tests. r?#perftest → Bug 1823730 - Improve profling for raptor-browsertime benchmark tests. r?#perftest
Blocks: 1831361
See Also: → 1831361
See Also: 1831361
Pushed by kshampur@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/7bdbd5815ea7 Improve profling for raptor-browsertime benchmark tests. r=perftest-reviewers,sparky
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 114 Branch
Regressions: 1832246
See Also: → 1835071
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: