Baseline profile runs have stopped detecting startup functions entirely
Categories
(Firefox for Android :: Performance, defect)
Tracking
| | Tracking | Status |
|---|---|---|
| firefox138 | --- | unaffected |
| firefox139 | --- | unaffected |
| firefox140 | --- | fixed |
People
(Reporter: mstange, Assigned: npoon)
References
(Blocks 2 open bugs, Regression)
Details
(Keywords: regression, Whiteboard: [fxdroid] [group4])
Attachments
(3 files)
Steps to reproduce:
- Look at the baseline-prof.txt artifact of a generate-baseline-profile-firebase-fenix run, for example from this range of autoland pushes.
Expected results:
Many lines should start with "HSP". The "S" is for "Startup".
Actual results:
After the landing of bug 1961852, none of the baseline-prof.txt artifacts have lines starting with "HSP" any more.
Except one: The generate-baseline-profile-firebase-fenix job that ran on the landing of bug 1961852 itself has HSP lines! I don't know why. My only guess is that it was a fluke - it's not just intermittent, literally none of the later jobs have HSP in them, as far as I can tell.
Without the "S" flag, the baseline profiles are useless for startup performance.
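A quick way to check an artifact for this is sketched below. It assumes the human-readable baseline profile layout, where a method rule is a run of H/S/P flag characters followed by a class descriptor and "->"; the function names and the sample rules are hypothetical and not taken from the Fenix tree.

```python
def method_flags(line):
    """Return the H/S/P flags that prefix a method rule, or an empty set."""
    line = line.strip()
    # Class rules (no "->") carry no flags in this format.
    if "->" not in line:
        return set()
    flags = set()
    for ch in line:
        if ch in "HSP":
            flags.add(ch)
        else:
            break
    return flags


def count_startup_rules(profile_text):
    """Count method rules that carry the "S" (startup) flag."""
    return sum(1 for line in profile_text.splitlines() if "S" in method_flags(line))


# Hypothetical sample, not real Fenix output.
sample = """\
HSPLorg/example/App;->onCreate()V
HPLorg/example/Toolbar;->bind()V
Lorg/example/App;
"""
print(count_startup_rules(sample))
```

A count of zero on a real baseline-prof.txt would correspond to the situation described here.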
Comment 3•10 months ago
Set release status flags based on info from the regressing bug 1961852
:royang, since you are the author of the regressor, bug 1961852, could you take a look? Also, could you set the severity field?
For more information, please visit BugBot documentation.
Comment 4•10 months ago (Reporter)
I noticed this bug by seeing that the simpleperf profiles collected in the various startup perf tests (e.g. perftest-android-hw-a55-aarch64-shippable-startup-fenix-homeview-startup-simpleperf, example job, example simpleperf profile) showed that all of Fenix's Java / Kotlin functions were running in the ART interpreter instead of being ahead-of-time compiled ("ART OAT"). Digging deeper, I saw that the adb shell dumpsys package dexopt output in these perf tests no longer contained [status=speed-profile] for the Fenix package. So these tests could have actually detected this bug. I've filed bug 1966234 so that we can catch this issue in the future, by failing startup perf CI tests if we don't see [status=speed-profile].
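The check proposed for the perf tests could look roughly like the sketch below. It only parses text that has already been captured from adb shell dumpsys package dexopt. The assumed layout (a "[package]" header line followed by lines containing "[status=...]"), the package names, and the function name are assumptions for illustration, not the actual CI code.

```python
import re


def dexopt_statuses(dexopt_output, package):
    """Collect the status=... values listed under the [package] header."""
    statuses = []
    in_package = False
    for line in dexopt_output.splitlines():
        stripped = line.strip()
        # A line that is only "[some.package.name]" starts a new package section.
        header = re.fullmatch(r"\[([\w.]+)\]", stripped)
        if header:
            in_package = header.group(1) == package
            continue
        if in_package:
            statuses.extend(re.findall(r"\[status=([\w-]+)\]", stripped))
    return statuses


# Hypothetical dump; real output may differ between Android versions.
sample_dump = """\
Dexopt state:
  [org.mozilla.fenix]
    path: /data/app/example/base.apk
      arm64: [status=verify] [reason=install]
  [org.example.other]
    path: /data/app/other/base.apk
      arm64: [status=speed-profile] [reason=bg-dexopt]
"""

statuses = dexopt_statuses(sample_dump, "org.mozilla.fenix")
if "speed-profile" not in statuses:
    print("baseline profile was not applied:", statuses)
```

Scoping the search to the package section matters here, because another package in the same dump can legitimately report speed-profile.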
Comment 5•10 months ago (Assignee)
I think the regressor is actually the landing of the different CUJs in Bug 1887820, but the problem doesn't lie with that patch itself. Titouan and I paired recently and realized that the baseline profile returned after generation is not the combination of the per-CUJ profiles, as it should be. It is just the profile of the last CUJ that runs. By this, I mean that taskcluster retrieves only the last baseline profile instead of combining them and returning one merged profile of all the CUJs. In many cases, the last CUJ is no longer the startup profile or launch intent CUJ, which is why the returned baseline-prof.txt doesn't have the HSP lines. For the run on Roger's patch, I think the profile contained HSP lines because either the launch intent CUJ or the startup profile ran last.
The toolbar patch in Bug 1961852 switched the CUJ tests order around by ignoring some of them, so it would make sense as to why we got the impression that the toolbar patch caused this regression.
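To illustrate the difference between "last profile wins" and a real merge, here is a minimal sketch. It assumes the human-readable rule format (optional H/S/P flags, then the class descriptor and method) and merges by taking the union of flags per method. The helper names and sample rules are hypothetical; this is not how the taskcluster job is implemented.

```python
def split_rule(line):
    """Split a rule into (flags, target); class rules have no flags."""
    line = line.strip()
    if "->" not in line:
        return set(), line
    i = 0
    while i < len(line) and line[i] in "HSP":
        i += 1
    return set(line[:i]), line[i:]


def merge_profiles(profiles):
    """Union the flags of every rule across all per-CUJ profile texts."""
    merged = {}  # target -> set of flags, in first-seen order
    for text in profiles:
        for line in text.splitlines():
            if not line.strip():
                continue
            flags, target = split_rule(line)
            merged.setdefault(target, set()).update(flags)
    rules = []
    for target, flags in merged.items():
        prefix = "".join(f for f in "HSP" if f in flags)
        rules.append(prefix + target)
    return "\n".join(rules)


# Hypothetical per-CUJ outputs: the startup CUJ ran first, a toolbar CUJ ran last.
startup_cuj = "HSPLorg/example/App;->onCreate()V\n"
toolbar_cuj = "HPLorg/example/App;->onCreate()V\nHPLorg/example/Toolbar;->bind()V\n"

last_wins = toolbar_cuj                          # current behavior: the "S" flag is lost
merged = merge_profiles([startup_cuj, toolbar_cuj])
print(merged)
```

With the union, a method keeps its "S" flag as long as any CUJ observed it during startup, regardless of the order in which the CUJs run.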
Comment 6•10 months ago (Assignee)
As a result, Markus and I have discussed temporarily disabling CUJ generation until we can figure this out. I have filed a follow-up bug for re-enabling CUJ generation in Bug 1966496.
Comment 7•10 months ago (Assignee)
Comment 8•10 months ago (Reporter)
(In reply to Nicholas Poon [:Nick] from comment #5)
The toolbar patch in Bug 1961852 switched the CUJ tests order around by ignoring some of them so it would make sense as to why we got the impression that the toolbar patch caused this regression
Well, the specific problem of "no more HSP in the baseline-prof.txt artifacts" started once the CUJ test order was switched, so marking this as a regression from bug 1961852 is still correct. I'll put the annotation back. It doesn't mean bug 1961852 is at fault for the problem, it just means that circumstances came together in such a way that the specific problem from comment 0 started happening with the landing of bug 1961852.
Thanks for digging into this!
Comment 10•10 months ago (bugherder)
Comment 11•10 months ago (Reporter)
This seems to have worked! Here's a task from a recent mozilla-central push and its baseline-prof.txt artifact starts with "HSP". Thanks!
Comment 12•10 months ago (Reporter)
And here's an imported simpleperf profile from the homeview-startup test from Nick's try push: https://share.firefox.dev/3GVFgri
A lot less red than before.