Open Bug 1595687 Opened 5 years ago Updated 4 years ago

Running a second profiler browser tests is much slower

Categories

(Core :: Gecko Profiler, defect, P2)

defect

Tracking

()

People

(Reporter: mozbugz, Unassigned)

References

Details

Follow-up to bug 1586105 comment 22, Julien said:

I have the same problem locally.
In browser_test_profile_multi_frame_page_info it takes 120ms while in browser_test_profile_single_frame_page_info 1.7sec.
I don't have the problem if I only run browser_test_profile_single_frame_page_info, only when they're run in sequence.
I couldn't make mach test to run them in the other order to see if this was producing a different result.

I see the tests aren't pausing the profiler before capturing, should we do that to make things faster?

Gerald replied:

I can't think why a 2nd profiler run should be slower: When it is stopped, the buffer and sampler should be destroyed, so there "should" be no effect on the next run.
Yes, using an external profiler can be a good way to profile the profiler!
Or as a first step we could add some logging during the capture, to verify that the slow-down is actually there and measure it.

Note that this is in a fission context. Is it possible that it takes more time because we have more threads running, being the second test, and thus more data to transfer?

Priority: P1 → P2

Recent comments with some data in bug 1586105 comment 45:

I [Julien] looked at the measurements again from the latest error log linked above:

For the multi_frame test:

[task 2020-03-29T06:26:45.198Z] 06:26:45     INFO - Capture the profile data.
[task 2020-03-29T06:28:00.724Z] 06:28:00     INFO - Check if the captured pages are the ones with correct values we created.

=> 1m15 to capture

For the single_frame test:

[task 2020-03-29T06:28:45.526Z] 06:28:45     INFO - Capture the profile data.
[task 2020-03-29T06:31:37.333Z] 06:31:37     INFO - Check if the captured page is the one with correct values we created.

=> 2m52 to capture

From a recent log (https://treeherder.mozilla.org/logviewer.html#?job_id=302715554&repo=mozilla-central):
for multi_frame:

[task 2020-05-18T10:23:18.139Z] 10:23:18     INFO - Capture the profile data.
[task 2020-05-18T10:23:54.725Z] 10:23:54     INFO - Check if the captured pages are the ones with correct values we created.

=> 36s

for single frame:

[task 2020-05-18T10:24:17.125Z] 10:24:17     INFO - Capture the profile data.
[task 2020-05-18T10:25:20.236Z] 10:25:20     INFO - Check if the captured page is the one with correct values we created.

=> 1m03

So there's still a big difference, and I think we should look at it more.
Maybe we should 1/ look at the data, see if we have more data in the second case, and 2/ understand the difference with both cases.

Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.