Raptor applink tests for Android Gecko vehicles
Categories
(Testing :: Raptor, enhancement, P3)
Tracking
(Not tracked)
People
(Reporter: nalexander, Assigned: nalexander)
References
Details
Attachments
(7 obsolete files)
We'd like to run "applink" tests for Android Gecko vehicles. Such tests are described in this report.
Technically, we need to:
- launch a vehicle with a URL (i.e.,
am start ... -d URL
) - capture measurements from that load, including
navigationStart
- determine the vehicle main process start time
- calculate the "applink X time" as
navigationStart - process start + X
This is a sibling to Bug 1522972, which added coldload tests to Raptor.
Updated•5 years ago
|
Assignee | ||
Comment 1•5 years ago
|
||
Assignee | ||
Comment 2•5 years ago
|
||
verify_android_device
requires a real _mach_context
in order to
install.
Depends on D37025
Assignee | ||
Comment 4•5 years ago
|
||
This is an unusual method. We find, ideally at the same instant: the
system uptime and a dummy process uptime. From these we determine the
jiffies per second of the system, which is almost always 100 but not
guaranteed to be that value.
From the current time, the system uptime, and the target process
uptime we determine the target process start time.
See the linked posts.
This is inherently racy, so we do it a few times and average. Under
the x86 emulator, it appears that the shell_output()
invocations are
also fragile, so we retry a few extra times to accommodate.
Depends on D37027
Assignee | ||
Comment 5•5 years ago
|
||
At its heart, this commit reports navigationStart
from the Web
Performance API in on-device milliseconds since the epoch and then
substracts the process start time of the main process.
The way it does this is opaque due to the existing Raptor
architecture. First, it's work to add a new test type, so I overload
coldload, since we're basically doing a single, special cold load.
Second, it's work to add an additional measurement checking function,
so I overload the existing "loadtime" flag and return two values. I
can't tell the harness the process start time because we don't have it
early enough for any of the existing communication mechanisms (we
write configuration JSON before we launch the process, and in any case
that races; and there's no convenient control server endpoint that
actually reveals that data dynamically). We push the process start
time calculation into the control server because it handles killing
the application process, and this needs to happen before that process
is killed. So we store up process start times and then subtract them
when processing the results. Again, it's work to put that filtering
logic anywhere else but in the existing results handler.
There are cleaner ways to achieve this goal, but all of them involve
tidying the division of responsibilities in the harness to an extent
that isn't worth the effort (at this time).
Depends on D37028
Assignee | ||
Comment 6•5 years ago
|
||
This is just a first approach. It works locally with ./mach raptor-test --app=geckoview --binary=org.mozilla.geckoview_example --test=raptor-tp6m-applink-1 --browser-cycles 3
, say. I don't really know how to add a new applink
"suite", or how to make sure that the firstNavigationStart
measurement is filtered, alerted, whatever, but others can pick that up.
rwood: if you could counter with any additional patches for verifying the perf herder output, grouping into a test set, etc, that would be helpful.
Comment 7•5 years ago
|
||
As per our discussion earlier today, we're going to put these patches on hold for now, and wait to see how the browsertime MVP work goes. Ideally, if the browsertime MVP goes well and we go ahead with the move to browsertime, it makes better sense to add the applink work on top of browsertime instead of raptor proper. If need be though we can revisit these patches and add this functionality to raptor instead.
Updated•5 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Assignee | ||
Comment 9•5 years ago
|
||
Barret: just throwing out my WIP. This depends on https://github.com/sitespeedio/browsertime/pull/1118, now landed upstream, but not yet in place for m-c. That's blocked on Bug 1607851, which is next in my queue.
Assignee | ||
Comment 10•5 years ago
|
||
Rebased on top of WIP for Node 10 + browsertime 8, some small tweaks, throwing things at the wall: https://treeherder.mozilla.org/#/jobs?repo=try&revision=342c4e83fcf3341bea60f3b1aeeef07e2df6b2e6.
Assignee | ||
Comment 11•5 years ago
|
||
OK, I basically have this working as I expect, but I'm running into the Window Recorder OOM issues that we see elsewhere. I'll update Bug 1585036 with more notes.
Assignee | ||
Comment 12•5 years ago
|
||
This adds a --applink
flag sibling to --cold
. I would much
prefer to generalize to --pageload-type
, but the precedent was set
and "cold" is difficult to update. History also required passing
through the --intent
flag, since the per-App intent defaulting
system is not particularly flexible.
To achieve the goal, --applink
tests:
-
configure browsertime/geckodriver to launch using
adb am start -a android.intent.action.VIEW -d URL
-
run a browsertime measurement driver script that knows not to
navigate to a URL -
patches the
navigationTiming
struct to include a "process shift" so
that they measure from process start time rather than the usual
navigation start time.
This doesn't quite work because the browsertime result processing
doesn't use navigationTiming
directly. It should be easy to
complete the patching using the template given here -- or to expose
different, applink-specific, metrics to perfherder.
This doesn't yet handle Chrome/chromedriver (that seems to require
patching chromedriver to launch with a URL).
For convenience this requires a very recent sitespeedio/browsertime.
There's nothing fundamental here, but many fixes (including to
--processStartTime
) have landed and make this all much easier.
Depends on D59235
Updated•5 years ago
|
Updated•5 years ago
|
Assignee | ||
Comment 13•5 years ago
|
||
https://phabricator.services.mozilla.com/D61383 is basically ready for review and for somebody else to push across the line. Tarek, can you work out who that might be?
Try push is at https://treeherder.mozilla.org/#/jobs?repo=try&tier=1%2C2%2C3&revision=feabc1feeb61e5c9aaa2468f3c303b57178d625c.
Assignee | ||
Updated•5 years ago
|
Updated•5 years ago
|
Comment 15•5 years ago
|
||
I've added a comment in Phabricator and added #perftest-reviewers for review. If possible, I'd like to avoid commandeering this patch if it's so close to landing so we can avoid additional delays in getting these tests running in CI.
Updated•5 years ago
|
Comment 16•5 years ago
|
||
While using this patch I noticed something that I didn't expect in browsertime:
a 2-second (android-only) delay after launch
https://github.com/sitespeedio/browsertime/blob/b16e8fbaa1ba8c1ee6a1c24604a546046db60290/lib/core/engine/iteration.js#L167
This is in the code-path of the applink startup time measurements so I'm expecting that it will affect the measurements.
Assignee | ||
Comment 17•5 years ago
|
||
(In reply to Andrew Creskey [:acreskey] [he/him] from comment #16)
While using this patch I noticed something that I didn't expect in browsertime:
a 2-second (android-only) delay after launch
https://github.com/sitespeedio/browsertime/blob/b16e8fbaa1ba8c1ee6a1c24604a546046db60290/lib/core/engine/iteration.js#L167This is in the code-path of the applink startup time measurements so I'm expecting that it will affect the measurements.
Good catch. We should guard that with processStartTime
, like https://github.com/sitespeedio/browsertime/blob/b16e8fbaa1ba8c1ee6a1c24604a546046db60290/lib/core/engine/iteration.js#L158.
Comment 18•5 years ago
|
||
Adding some notes since I tried this cold start applink test.
I haven't been able to cleanly capture visual metrics through this route because of the racing of browsertime vs the browser starting with the given intent.
Even with the delays mentioned in Comment 17 removed from browsertime, I found that the web sites were mostly or entirely loaded by the time browsertime was ready to start measuring.
What I ended up doing was to modify browsertime to create a "warm browser, cold page load applink scenario".
That is, the browser is started and conditioned as in traditional browsertime tests.
And then, instead of making the window.location
write, we send the view action intent via adb (piped through browsertime as an option).
https://paste.rs/sny
This works quite well, although it's capturing the scenario where the browser hadn't been free'ed by the OS.
Assignee | ||
Comment 19•5 years ago
|
||
(In reply to Andrew Creskey [:acreskey] [he/him] from comment #18)
Adding some notes since I tried this cold start applink test.
I haven't been able to cleanly capture visual metrics through this route because of the racing of browsertime vs the browser starting with the given intent.
Even with the delays mentioned in Comment 17 removed from browsertime, I found that the web sites were mostly or entirely loaded by the time browsertime was ready to start measuring.
My model for how browsertime works means that this is fine. browsertime really doesn't care how the page load gets initiated, it just needs to be able to read stuff from the DOM after the load has completed. (That's why it's okay to poll/sleep/whatever for pageload to fire, and then wait N seconds.) So "ready to start measuring" is fine: we're just going to be querying the DOM performance timers anyway, which are constant after the various events have completed.
Is my model wrong? Is there something else at play here?
Comment 20•5 years ago
|
||
(In reply to Nick Alexander :nalexander [he/him] from comment #19)
...
My model for how browsertime works means that this is fine. browsertime really doesn't care how the page load gets initiated, it just needs to be able to read stuff from the DOM after the load has completed. (That's why it's okay to poll/sleep/whatever for pageload to fire, and then wait N seconds.) So "ready to start measuring" is fine: we're just going to be querying the DOM performance timers anyway, which are constant after the various events have completed.
Is my model wrong? Is there something else at play here?
I should have been more specific: the problem is visual metrics. It was the orange frame insertion that you suspected could be a problem.
Browsertime doesn't have a chance to insert it before the pageload has begun.
Theoretically, I think I could capture visual metrics of the entire cold start: from home screen to app launch to pageload, an see what value could be extracted.
I will be trying out this patch again to collect navigation metrics on cold start app links.
Comment 21•5 years ago
•
|
||
FYI,
I noticed that in mozilla-browsertime
the default insertion of the Browsertime webextension adds ~1000ms to Fenix startup results (but curiously doesn't impact fennec).
AFAICT, this isn't a problem in raptor-browsertime because the extension is disabled and there is also a pref to disable extension blocklists (extensions.blocklist.enabled
)
https://searchfox.org/mozilla-central/rev/91f6c02fcf4c16f78fdc4417f61f192688294066/testing/raptor/raptor/browsertime/base.py#194
Comment 22•4 years ago
|
||
:acreskey, :sparky is this bug still relevant, or have we resolved this via mozperftest?
Comment 23•4 years ago
|
||
:davehunt, we can definitely close this -- the mozperftest view
tests are exactly applink tests.
They cover geckoview_example and Fenix. (I removed Reference Browser as it was not as useful as hoped).
e.g.
https://searchfox.org/mozilla-central/rev/35245411b9e8a911fe3f5adb0632c3394f8b4ccb/taskcluster/ci/perftest/android.yml#35
Updated•4 years ago
|
Updated•4 years ago
|
Updated•2 years ago
|
Description
•