Closed Bug 1838824 Opened 1 year ago Closed 1 year ago

Investigate why we don't collect useful profiles for perf_reftest_singletons test

Categories

(Testing :: Talos, defect, P1)

Default
defect

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 1784248

People

(Reporter: canova, Assigned: canova)

References

(Blocks 1 open bug)

Details

(Whiteboard: [fxp])

Attachments

(1 obsolete file)

Currently when we capture a profile from the CI using the profiling tasks, we get a profile like this: https://share.firefox.dev/3paMVtp
If you look at this profile, there is only one sample per thread and then it's just blank.

Also if you look at the running times of this test from perfherder, it says this test takes around ~1300ms to run. But this profile only takes 2.1ms. Which is very short compared to the real test.

There can be a few reasons for this:

  1. We pause the profiler too eagerly.
  2. We don't wait for the profiler initialization.
  3. We don't use the profiler start/end APIs in the correct places.

Ideally it would be great to get some help from people who know talos more than me to investigate this.

Whiteboard: [fxp]

Ok, I know what's wrong now.

perf_reftest_singletons test is using special "test start" and "test end" handlings it reports the test values differently. That's why we don't go throught the regular Services.profiler.ResumeSampling() and Services.profiler.PauseSampling() calls.

We need to make sure that we call these functions in these functions:
https://searchfox.org/mozilla-central/rev/a5da23c8c9c1151dcdf0ca34b3cfd0a4d0fc3b48/testing/talos/talos/tests/perf-reftest-singletons/util.js#53
https://searchfox.org/mozilla-central/rev/a5da23c8c9c1151dcdf0ca34b3cfd0a4d0fc3b48/testing/talos/talos/tests/perf-reftest-singletons/util.js#61

I'm looking into it more. I will share some patches soon.

Normally, during talos tests, we always resume the profiler before test starts
and pause back again right after it finishes.
Here's where we resume:
https://searchfox.org/mozilla-central/rev/4e8f62a231e71dc53eb50b6d74afca21d6b254e9/testing/talos/talos/pageloader/chrome/pageloader.js#384-388
And here's where we pause again:
https://searchfox.org/mozilla-central/rev/4e8f62a231e71dc53eb50b6d74afca21d6b254e9/testing/talos/talos/pageloader/chrome/pageloader.js#730

But these only apply for tests where they don't do their own timing.
(See the if branch with !(plPageFlags() & TEST_DOES_OWN_TIMING).)

For perf_reftest_singletons tests, they do their own timing, which means that
they need to handle the profiler pause and resume calls themselves since they
have their own perf_start and perf_finish calls.

This patch adds TalosContentProfiler.resume and TalosContentProfiler.pause
calls to these functions so we can properly record the samples during the test
run. They shouldn't do anything if the profiler is not active.

Also, since these resume and pause functions are async, we need to await them.
I converted the functions to async to be able to await them, that's why
you'll see quite a lot changes. But they are all the same.

Assignee: nobody → canaltinova
Status: NEW → ASSIGNED
Severity: -- → S3
Priority: -- → P1
Attachment #9340321 - Attachment is obsolete: true

Instead of fixing this issue individually, I decided to fix the underlying issue that causes this bug instead. So with the patches from bug 1784248, this underlying issue is fixed and we don't need this patch anymore.

Here's a profile captured with the patches applied from bug 1784248: https://share.firefox.dev/46TFDdM You can see that the profile is not empty anymore.

Closing this as a duplicate of bug 1784248

Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Duplicate of bug: 1784248
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: