Closed Bug 1231626 Opened 4 years ago Closed Last month

[meta] Rewrite pageloader to be more modern

Categories

(Testing :: Talos, defect)

defect
Not set

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jmaher, Unassigned)

References

Details

(Keywords: meta, Whiteboard: [releng][talos_wishlist])

the title says it all, some basic needs are:
* restartless
* no commandline needs
* a simpler interface for tests
* remove all the old cruft
* a real e10s interface

I need to assess all the tests and the code to come up with a list of features we use and identify which we should ensure remain :)
hey Joel, can you come up with a specification for pageloader?
Flags: needinfo?(jmaher)
lets start with the cli args which are defined:
      args.numCycles = cmdLine.handleFlagWithParam("tpcycles", false);
      args.numPageCycles = cmdLine.handleFlagWithParam("tppagecycles", false);
      args.startIndex = cmdLine.handleFlagWithParam("tpstart", false);
      args.endIndex = cmdLine.handleFlagWithParam("tpend", false);
      args.filter = cmdLine.handleFlagWithParam("tpfilter", false);
      args.useBrowserChrome = cmdLine.handleFlag("tpchrome", false);
      args.doRender = cmdLine.handleFlag("tprender", false);
      args.width = cmdLine.handleFlagWithParam("tpwidth", false);
      args.height = cmdLine.handleFlagWithParam("tpheight", false);
      args.profilinginfo = cmdLine.handleFlagWithParam("tpprofilinginfo", false);
      args.offline = cmdLine.handleFlag("tpoffline", false);
      args.noisy = cmdLine.handleFlag("tpnoisy", false);
      args.timeout = cmdLine.handleFlagWithParam("tptimeout", false);
      args.delay = cmdLine.handleFlagWithParam("tpdelay", false);
      args.noForceCC = cmdLine.handleFlag("tpnoforcecc", false);
      args.mozafterpaint = cmdLine.handleFlag("tpmozafterpaint", false);
      args.loadnocache = cmdLine.handleFlag("tploadnocache", false);
      args.scrolltest = cmdLine.handleFlag("tpscrolltest", false);
      args.disableE10s = cmdLine.handleFlag("tpdisable_e10s", false);
      args.rss = cmdLine.handleFlag("rss", false);


of these lets look at each one:
numCycles: 1 except tcanvasmark is 5, and we could switch to tppagecycles - not needed
numPageCycles: many values- should keep this
startIndex: not used
endIndex: not used
filter: not used
useBrowserChrome: only used for android (we still use pageloader on autophone)
doRender: not used
width: not used
height: not used
profilinginfo: used if we are running the profiler
offline: not used
noisy: not used
timeout: used for tp5 with a value of 5000; possibly default to that for all tests except custom benchmarks?
delay: not used
noForceCC: not used
mozafterpaint: used, half use it, half don't!
loadnocache: used on 5 tests
scrolltest: used on 1 test (tp5o_scroll)
disable_e10s: used by dromaeo_dom (no idea why)
rss: used by a few tests

in summary we use these features:
mozafterpaint, loadnocache, scrolltest, disable_e10s, rss, timeout, profilinginfo, numPageCycles

that is a much smaller set of features to support for configuring from the harness.  Now to figure out:
* interface for recording data
* interface for reporting data
* other features that pageloader does
other features that pageloader does:
* collects in process (main) memory usage: https://dxr.mozilla.org/mozilla-central/source/testing/talos/talos/pageloader/chrome/memory.js
* allows the browser to quit: https://dxr.mozilla.org/mozilla-central/source/testing/talos/talos/pageloader/chrome/quit.js
* logging to console/file (we need to support file for xperf runs): https://dxr.mozilla.org/mozilla-central/source/testing/talos/talos/pageloader/chrome/MozillaFileLogger.js

inside pageloader.js:
* read a manifest file (https://dxr.mozilla.org/mozilla-central/source/testing/talos/talos/pageloader/README#11).  This is a definition of URLs to load and modifiers to determine if pageloader measures the 'load' time or the test does it's own timing (could be json!)
** keep in mind we modify these .manifest files dynamically for a test run to specify the webserver- of course we could pass in the webserver to the pageloader addon and stop modifying these manifest files
* force garbage collecting/cycle collection before testing and between each page load
* instantiate and start/stop/add_marker to profiler
* create/clear timeout event (as defined by tptimeout- so pages can timeout if they take too long to load)
* we create a small window that drives the test, then we spawn a normal browser with the default toolbars and other *chrome* related pieces (except for android, we don't have the ability for multi windows).  this shouldn't be required to support this external window to drive the app.
* for mozafterpaint, we add listeners for 'mozafterpaint' in addition to 'load'.  both need to be received before we take the time used for the pageload.
* we loop with tppagecycles, this means:
   while cycle < tppagecycles:
     for url in urls:
       loadandmeasure(url)

* support for scrolling a page (tps, tp5o_scroll).  This is in the scroll.js file, but we drive the functions from scroll.js inside of pageloader.js
* support for recording data internally to pageloader:
** pageload: measure the time from before loadURI() to when 'load'&'mozafterpaint' events are recieved (we do not want to modify all pages)
** test measured: we inject a tprecordtime() call into the js context of the page in question and when the page is done, it reports data to the inserted api.
** we should have a default timeout for ALL pages which is sane and works well- we could measure pageload time for all, and if we have data coming from the test itself, we use that instead.
* exceptions for invalid data loading, failed tests to load in time, invalid data recorded, other exceptions
* reporting of data via report.js

Currently inside of pageloader.js there are many functions to support handling page load, these end up with a hacky chain of events (so hacky that perl5 source code looks sane) to support 'load' as well as 'load'&'mozafterpaint', then corresponding functions for e10s- then add extra functions for self reporting or pageload timing!  We can do e10s only and simplify our timing a bit.  In addition, we can clean up the model for tracking if we have load and mozafterpaint events.
let me tackle recording data.  We have the two models:
* pageload:
** date.now() - starttime

* test does own timing:
** currently supports a single value coming in, as well as an array of values including subtest names

reporting needs to handle:
* subtests
* values

Right now it also takes starttime, I don't think that is needed, it is hacked in as an implementation detail.

I propose modifying tests to ensure they have the proper format for reporting tests in json format:
{'testname1': 314.15,
 'testname2': 217.28,
 'testname3': ...}

testname would need to be identical to what it is today, that should be easy.  As for timing tests, we determine the testname here:
https://dxr.mozilla.org/mozilla-central/source/testing/talos/talos/pageloader/chrome/pageloader.js#526

The idea is that pageloader will load the urls and record a list of tests and values for each test.  This is a single value.  We can use tppagecycles to collect more than one data point for each test.  This means that internally, when we store the values, we will need to build an array and have:
{'testname1': [314.15, ...],
 'testname2': [217.28, ...],
 'testname3': [...]}

since we still report to graph server, we need to retain the existing format, but this can be done in our reporting output.

upon further thought, we don't need to restrict the incoming json to be single values, it can be a pagename and a set of values.

Lastly, this doesn't have to be inserted as an API, it could be a message that we pass data as a parameter.
the last part here is reporting.  Currently we build a little report that we print out to the console (and log file if provided).  Effectively we care about dump(report); which goes to the console.  Here is an example of the output from a11y:
__start_tp_report
_x_x_mozilla_page_load
_x_x_mozilla_page_load_details
|i|pagename|runs| |0;dhtml.html;1297;1353;1477;1364;1450;1470;1472;1472;1481;1489;1381;1483;1374;1378;1490;1377;1488;1482;1375;1378;1490;1383;1516;1483;1387 |1;tablemutation.html;330;330;331;331;330;331;339;329;329;329;328;330;330;332;331;331;332;330;331;332;331;331;329;330;329
__end_tp_report
__start_cc_report
_x_x_mozilla_cycle_collect,6867
__end_cc_report
__startTimestamp1450390034210__endTimestamp

This report gets parsed by the talos harness, then reported to graphserver/perfherder.  In fact we do some fancier stuff inside of report.js to make the logs show human friendly data include stdev, average, median.  That is optional, but really nice to have.

In general, we could output a json object containing the test name and the subtests and their respective values.  If we did this, we could parse this in the python talos harness and generate the appropriate data for graphserver/perfherder- including any name hacking.

There is one oddity- we do parse names and if there is a common prefix, we strip it.  That is why .TART is at the end of a subtest, not at the beginning.  This would need to be done for graphserver, and for continuity, we should retain the same names in perfherder.  This code is here:
https://dxr.mozilla.org/mozilla-central/source/testing/talos/talos/pageloader/chrome/report.js#6

Overall, we don't need to do much for reporting, and I would prefer a json formatted output as a result which we can parse with the harness.

I know this is a lot of randomness, if this isn't clear, please needinfo me to summarize this even more.
Flags: needinfo?(jmaher)
Blocks: 1220593
Whiteboard: [releng]
Blocks: 1233200
:mconley, you had expressed interest in writing this last month, we should touch base and see if we understand what is required and who has time when to do it.
Flags: needinfo?(mconley)
What's the timeline here? When does add-on signing totally burn us?
Flags: needinfo?(mconley) → needinfo?(jmaher)
I don't know the exact timeline- but I think having something in place by the end of Q1 is what we need.  :mconley, I think you have a vision in mind- if you get a shell, we could extend it in parallel :)
Flags: needinfo?(jmaher)
Reading through this, it supports my claim that pageloader has probably been mostly unloved and had stuff just bolted onto it willy-nilly for years.

If we were to rewrite it, I'd want to simplify it greatly, and to properly use the framescript model for communicating with content.

I'd also want to extract the things that tests besides pageloader need (like the ability to quit the browser, which we've already put into talos-powers), and make sure they don't accidentally leak in again. I would propose that common things should go into talos-powers, and pageloader be strictly interested in supporting the pageloader tests and nothing else.

One more thing - can I assume that XUL is right out? Like, this needs to be a 100% JS-based restartless add-on or something?
Flags: needinfo?(jmaher)
++ on talos-powers vs pageloader

My understanding is that XUL is out- lets design for the future and make this a true restartless add-on.
Flags: needinfo?(jmaher)
Blocks: 1088251
Whiteboard: [releng] → [releng][talos_wishlist]
:jmaher does this still need to block bug 1283638 (releng and ateam work for addon signing).  My understanding is that talos is working with addon signing enabled in beta
Flags: needinfo?(jmaher)
thanks for the ping- I removed the blocking association!
No longer blocks: 1233200
Flags: needinfo?(jmaher)
Plan to work on it in Q1
Assignee: nobody → atsai
Here's the plan for it. Since we are moving from add-on to webExtension, I tend to solve both of them. We'll separate the bug into three parts. There will be three sub-bugs for following up progress.

Part I:
* Add a webExtension manager in mozbase/mozprofile/mozprofile for add/remove webExtension for testing purpose.
* We'll also need to modify talos/talos/ffsetup.py a little bit to allow webExtension installation for testing

Part II:
* Modify talos/talos/pageloader to webExtension format
** This work will only make sure all required features work in webExtension pageloader

Part III:
* Remove redundant codes from base

I'll work on part II first to make sure moving pageloader to webExtension won't break any current testing.
thanks Al, this sounds good, looking forward to seeing this happen!
Blocks: 1353390
Note: If there's no use for '--no-chrome' anymore, we should remove this when we write the new pageloader web extension.
:atsai, is this still something you're interested in working on and have time for? I'd like to start work on the new pageloader web extension, however if you've already started it or wish to do it then I won't assign it to myself...
Flags: needinfo?(atsai)
Update: Had a great talk with :bsilverberg regarding web extensions. I had a high-level look at the existing APIs that are currently available to web extensions, and compared that with what pageloader currently does.

Doing a simple direct conversion of the current legacy pageloader add-on to a web extension is not really possible, at least not in its current form, and without alot of refactoring.

Discussed with :jmaher. At this point we decided the best approach will be first to clean-up and reduce the existing pageloader add-on. It's a pretty big beast with a bunch of legacy code and some features which are obsolete. We'll start there, with the goal of reducing it down, and ideally moving some tasks out of the add-on that can possibly be elsewhere. That will be a good starting point - I'm going to file some dependent bugs to begin that work.
Depends on: 1410206
Depends on: 1410210
Depends on: 1410211
Depends on: 1411550
Assignee: atsai → nobody
Summary: Rewrite pageloader to be more modern → [meta] Rewrite pageloader to be more modern
Flags: needinfo?(atsai)
Status: NEW → RESOLVED
Closed: Last month
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.