Closed Bug 1821783 Opened 2 years ago Closed 1 year ago

Speedometer 3 profiles in CI appear to be garbage

Categories

(Testing :: Raptor, defect, P2)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: alexical, Assigned: kshampur)

Details

(Whiteboard: [sp3])

See this profile. It seems we stop profiling before the benchmark even runs. We should find out why, so that we can better diagnose Speedometer 3 performance anomalies in CI.

Whiteboard: [sp3]

This might be true for all benchmark tests. :dthayer, do you see this issue in other tasks like grandprix, or speedometer2?

Flags: needinfo?(dothayer)

(In reply to Greg Mierzwinski [:sparky] from comment #1)

> This might be true for all benchmark tests. :dthayer, do you see this issue in other tasks like grandprix, or speedometer2?

Yeah, it appears so, though the issue seems less pronounced on Speedometer 2 because we start the benchmark earlier there, so you do get to see a bit of the execution in the profile.

Flags: needinfo?(dothayer)

:kshampur, could you look into this?

Flags: needinfo?(kshampur)

This does look like it is happening on nearly all benchmarks (I didn't check all of them, just a few more).

I think afterPageCompleteCheck is being triggered prematurely.

I will keep looking into it.

Severity: -- → S3
Priority: -- → P2

I tried playing around with --pageCompleteWaitTime, and that increased the duration of the geckoprofiler run for some benchmark tests (e.g. ares6 and speedometer2), but not for speedometer3.

Here is an example Try run with the extended pageCompleteWaitTime (increased from 5000 -> 30000).
As you can see from the profiler runs of ares6 and sp, they went on much longer than what currently exists in-tree, but sp3 is still the same: the profiler cuts off right before the benchmark test starts.

I am not sure exactly how S3 is configured, but it seems to NOT start until after the geckoprofiler is stopped. So some of the same event(s) that trigger afterPageCompleteCheck seem to be a prerequisite for starting S3.
(If I understood this section of the browsertime sequence of events correctly, loadEventEnd needs to happen first -> maybe some other stuff -> and THEN Speedometer 3 can start the benchmarking portion?)
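To make the hypothesis above concrete, here is a minimal Python model of the timing. It is only a sketch: the `Timeline` class, its field names, and the millisecond values are hypothetical, and it assumes the profiler stops as soon as the page complete check passes. It is not how Raptor or browsertime are actually implemented.

```python
from dataclasses import dataclass


@dataclass
class Timeline:
    """Hypothetical timeline of one profiler run, in milliseconds."""

    load_event_end: int       # when loadEventEnd fires
    page_complete_wait: int   # models the --pageCompleteWaitTime value
    benchmark_duration: int   # how long the benchmark itself runs

    def profiler_stop(self) -> int:
        # Assumption: the profiler stops once the page complete check
        # passes, i.e. loadEventEnd plus the configured wait time.
        return self.load_event_end + self.page_complete_wait

    def benchmark_start(self, gated_on_page_complete: bool) -> int:
        # A gated benchmark cannot start before the page complete check
        # has finished; an ungated one starts right at loadEventEnd.
        if gated_on_page_complete:
            return self.profiler_stop()
        return self.load_event_end

    def profiled_benchmark_ms(self, gated_on_page_complete: bool) -> int:
        # Overlap between the profiler window [0, profiler_stop] and the
        # benchmark window [start, start + benchmark_duration].
        start = self.benchmark_start(gated_on_page_complete)
        end = start + self.benchmark_duration
        return max(0, min(self.profiler_stop(), end) - start)


if __name__ == "__main__":
    for wait in (5000, 30000):
        t = Timeline(load_event_end=1000, page_complete_wait=wait,
                     benchmark_duration=60000)
        print(wait,
              t.profiled_benchmark_ms(gated_on_page_complete=False),
              t.profiled_benchmark_ms(gated_on_page_complete=True))
```

In this model, raising the wait time lengthens the profiled portion of an ungated benchmark (as seen for ares6 and speedometer2), while a gated benchmark never overlaps with the profiler no matter how long the wait is, which would match what sp3 shows.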

:dthayer, :sparky, would you know how exactly S3 is set up such that it only seems to start after the whole page complete check sequence?
It isn't immediately obvious from the speedometer3.js script why that's happening. I could try looking at the upstream repo tomorrow.

Flags: needinfo?(kshampur)
Flags: needinfo?(gmierz2)
Flags: needinfo?(dothayer)

:kshampur, something to note is that this is coming from the extra-profiler-run. https://searchfox.org/mozilla-central/source/testing/raptor/raptor/browsertime/base.py#820-827

I suggest looking into that code because we never implemented it properly for benchmarks afair. It was only meant to work for pageload tests. It's possible we're missing some test options to let the test start. This is where the command is composed for the profiler run: https://searchfox.org/mozilla-central/source/testing/raptor/raptor/browsertime/base.py#651

S3 starts on this line: https://searchfox.org/mozilla-central/rev/e6a03adbf7930ae0cf131cc3274c80b2aae74e51/testing/raptor/browsertime/speedometer3.js#51

Then we use this loop to wait for it to complete: https://searchfox.org/mozilla-central/rev/e6a03adbf7930ae0cf131cc3274c80b2aae74e51/testing/raptor/browsertime/speedometer3.js#56,64
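For readers unfamiliar with the pattern, the "start, then loop until complete" structure described above can be sketched roughly as follows. The in-tree script is JavaScript and its details are not reproduced here; this is a Python illustration only, and `wait_for_benchmark`, `is_done`, and the default interval and timeout are hypothetical names and values.

```python
import time


def wait_for_benchmark(is_done, poll_interval_s=1.0, timeout_s=60.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll is_done() until it returns True or the timeout expires.

    is_done is a hypothetical callable standing in for whatever check
    the real script performs to see whether the benchmark has finished.
    Returns True if the benchmark completed, False on timeout.
    """
    deadline = clock() + timeout_s
    while clock() < deadline:
        if is_done():
            return True
        # Back off between checks so we do not busy-wait.
        sleep(poll_interval_s)
    return False
```

The point relevant to this bug is that anything measuring the run (such as the profiler) has to stay active until a loop like this returns, not just until the page load completes.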

Flags: needinfo?(gmierz2)

I think Sparky is more equipped to answer here. I don't have much familiarity with how this all fits together in CI.

Flags: needinfo?(dothayer)
See Also: → 1822697
See Also: → 1823730

Closing this as I suspect this is fine now thanks to Bug 1823730
Feel free to reopen/ni? me if that is not the case

Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Assignee: nobody → kshampur
See Also: → 1892834