Closed Bug 1449195 Opened 6 years ago Closed 6 years ago

Process raptor test results and format for output

Categories

(Testing :: Raptor, enhancement)

Tracking

Status: RESOLVED FIXED
Target Milestone: mozilla62
firefox62 --- fixed

People

(Reporter: rwood, Assigned: rwood)

Attachments

(1 file, 1 obsolete file)

Aggregate/process results according to test definition, as we do in talos now. Format the final test result and dump to stdout as a PERFHERDER_DATA json blob so it can be ingested in production.
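The output step described above could be sketched like this (a minimal illustration, not the actual raptor code; the function name and suite structure are assumptions based on the example PERFHERDER_DATA output shown later in this bug):

```python
import json

def dump_perfherder_data(suites):
    # Build the blob in the shape Perfherder ingests and emit it on a
    # single stdout line prefixed with PERFHERDER_DATA.
    blob = {"framework": {"name": "raptor"}, "suites": suites}
    line = "PERFHERDER_DATA: " + json.dumps(blob)
    print(line)
    return line
```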
Assignee: nobody → rwood
Status: NEW → ASSIGNED
Comment on attachment 8972675 [details]
Bug 1449195 - Process raptor test results and format for output (WIP)

https://reviewboard.mozilla.org/r/241214/#review247104


Code analysis found 2 defects in this patch:
 - 2 defects found by mozlint

You can run this analysis locally with:
 - `./mach lint path/to/file` (JS/Python)


If you see a problem in this automated review, please report it here: http://bit.ly/2y9N9Vx


::: testing/raptor/raptor/control_server.py:23
(Diff revision 1)
>  
>  here = os.path.abspath(os.path.dirname(__file__))
>  
> +results = {}
>  
>  class MyHandler(BaseHTTPServer.BaseHTTPRequestHandler):

Error: Expected 2 blank lines, found 1 [flake8: E302]

::: testing/raptor/raptor/control_server.py:58
(Diff revision 1)
>          post_body = self.rfile.read(content_len)
>          # could have received a status update or test results
>          data = json.loads(post_body)
>          LOG.info("received " + data['type'] + ": " + str(data['data']))
> +        if data['type'] == 'webext-results':
> +            results = data

Error: Local variable 'results' is assigned to but never used [flake8: F841]
Comment on attachment 8972675 [details]
Bug 1449195 - Process raptor test results and format for output (WIP)

https://reviewboard.mozilla.org/r/241214/#review247288


Code analysis found 3 defects in this patch:
 - 3 defects found by mozlint

You can run this analysis locally with:
 - `./mach lint path/to/file` (JS/Python)


If you see a problem in this automated review, please report it here: http://bit.ly/2y9N9Vx


::: testing/raptor/raptor/control_server.py:27
(Diff revision 2)
> -class MyHandler(BaseHTTPServer.BaseHTTPRequestHandler):
> +def MakeCustomHandlerClass(results_handler):
> +
> +    class MyHandler(BaseHTTPServer.BaseHTTPRequestHandler, object):
> +
> +        def __init__(self, *args, **kwargs):
> +             self.results_handler = results_handler

Error: Indentation is not a multiple of four [flake8: E111]

::: testing/raptor/raptor/control_server.py:28
(Diff revision 2)
> +
> +    class MyHandler(BaseHTTPServer.BaseHTTPRequestHandler, object):
> +
> +        def __init__(self, *args, **kwargs):
> +             self.results_handler = results_handler
> +             super(MyHandler, self).__init__(*args, **kwargs)

Error: Indentation is not a multiple of four [flake8: E111]
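The factory pattern under review here (binding a results handler into the request-handler class) looks roughly like the following once the indentation is a multiple of four. This is a Python 3 sketch using http.server rather than the Python 2 BaseHTTPServer in the patch:

```python
import http.server

def make_custom_handler_class(results_handler):
    # Close over results_handler so each request-handler instance can
    # forward received results to it; the method body is indented by
    # four spaces, which satisfies flake8 E111.
    class MyHandler(http.server.BaseHTTPRequestHandler):
        def __init__(self, *args, **kwargs):
            # Set the attribute before super().__init__, which handles
            # the request immediately.
            self.results_handler = results_handler
            super().__init__(*args, **kwargs)
    return MyHandler
```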

::: testing/raptor/webext/raptor/runner.js:290
(Diff revision 2)
>    postToControlServer("results", results);
>  }
>  
>  function postToControlServer(msgType, msgData) {
>    // requires 'control server' running at port 8000 to receive results
> -  var url = "http://127.0.0.1:8000/";
> +  var url = "http://127.0.0.1:" + cs_port +"/";

Error: Infix operators must be spaced. [eslint: space-infix-ops]
Attachment #8972675 - Attachment is obsolete: true
Comment on attachment 8972914 [details]
Bug 1449195 - Process raptor test results and format for output;

:jmaher, would you mind having a look at this so far and letting me know what you think? I'm trying to reduce the complexity of the code that summarizes/processes results, while at the same time re-using a fair amount from talos so as not to completely reinvent the wheel.

Example of the PERFHERDER_DATA output from a local run:

12:23:28     INFO -  output PERFHERDER_DATA: {"framework": {"name": "raptor"}, "suites": [{"subtests": [{"name": "fnbpaint", "value": 315.5, "replicates": [730, 303, 337, 304, 279, 297, 329, 316, 329, 305, 331, 315, 473, 441, 333, 313, 327, 309, 353, 302, 303, 284, 318, 327, 288], "lower_is_better": true, "unit": "ms", "alert_threshold": 2.0}], "extraOptions": ["e10s", "stylo"], "name": "raptor-firefox-tp6"}]}

A couple of questions:

- I just drop the first value, and use the median. Since raptor will be running two types of tests - pageload and benchmarks - do we need the test config to specify this? i.e. will there be pageload tests that won't use the median? The same goes for how many replicates to drop - do we need that in the test INI too, or can we just apply the same algorithm to all pageload tests?

- I'm not sure how to take this further for benchmarks like speedometer. For benchmarks we don't drop any initial replicates, correct? I find the talos results-processing code very complicated... but I think I'm on the right track. Speedometer raptor hasn't landed on try yet anyway, so I could land this for pageload-type tests and then update for benchmarks when I land speedometer (the next priority).
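The summarization described in the first question (drop the first replicate, then take the median of the rest) can be sketched as follows; the helper names are illustrative, not necessarily the actual talos/raptor filter names:

```python
from statistics import median

def ignore_first(values, count=1):
    # Drop the first `count` replicates (warm-up pageloads).
    return values[count:]

def summarize_pageload(replicates):
    # Median of the remaining replicates, as described above.
    return median(ignore_first(replicates))
```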

Thanks!
Attachment #8972914 - Flags: feedback?(jmaher)
Comment on attachment 8972914 [details]
Bug 1449195 - Process raptor test results and format for output;

https://reviewboard.mozilla.org/r/241468/#review249326

f+; a few nits/questions.  I am fine with this for tp6; let's get that running.  If the code for results/filters is too confusing (and I agree it is), let's plan to fix it; the path of least resistance right now is to copy from talos since that has all the metadata we need.  I view redoing the results handling as a July thing.

::: testing/raptor/raptor/filter.py:5
(Diff revision 7)
> +# This Source Code Form is subject to the terms of the Mozilla Public
> +# License, v. 2.0. If a copy of the MPL was not distributed with this
> +# file, You can obtain one at http://mozilla.org/MPL/2.0/.
> +
> +# originally taken from talos

can you link to the source?  Or maybe add a moz.build rule to use the same source file instead?

::: testing/raptor/raptor/filter.py:18
(Diff revision 7)
> +
> +Each filter is a simple function, but it also have attached a special
> +`prepare` method that create a tuple with one instance of a
> +:class:`Filter`; this allow to write stuff like::
> +
> +  from talos import filter

there is a mention of talos here; not sure that is valid for raptor

::: testing/raptor/raptor/filter.py:169
(Diff revision 7)
> +
> +@register_filter
> +@define_filter
> +def dromaeo_chunks(series, size):
> +    for i in range(0, len(series), size):
> +        yield series[i:i+size]

I don't see us running dromaeo here, but let's leave that for a followup; ideally we have a single file that holds all our filtering.

::: testing/raptor/raptor/manifest.py:60
(Diff revision 7)
> +    if test_details.get("unit", None) is not None:
> +        test_settings['raptor-options']['unit'] = test_details['unit']
> +    if test_details.get("lower_is_better", None) is not None:
> +        test_settings['raptor-options']['lower_is_better'] = bool(test_details['lower_is_better'])
> +    if test_details.get("alert_threshold", None) is not None:
> +        test_settings['raptor-options']['alert_threshold'] = float(test_details['alert_threshold'])

could we set a default value here and use that instead? So something like:
test_settings['raptor-options']['unit'] = test_details.get('unit', 'ms')

That would allow us to avoid the check each time and fall back to the default value. That would be useful unless we have code to handle the cases where this data is missing?
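Applied to all three options, the reviewer's suggestion might look like this (the defaults for lower_is_better and alert_threshold are assumptions for illustration, not values confirmed by the patch):

```python
def apply_option_defaults(test_details, test_settings):
    # Use dict.get with a default instead of an explicit None check
    # per key, as suggested in the review comment above.
    opts = test_settings.setdefault('raptor-options', {})
    opts['unit'] = test_details.get('unit', 'ms')
    opts['lower_is_better'] = bool(test_details.get('lower_is_better', True))
    opts['alert_threshold'] = float(test_details.get('alert_threshold', 2.0))
    return test_settings
```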

::: testing/raptor/raptor/output.py:6
(Diff revision 7)
> +
> +# This Source Code Form is subject to the terms of the Mozilla Public
> +# License, v. 2.0. If a copy of the MPL was not distributed with this
> +# file, You can obtain one at http://mozilla.org/MPL/2.0/.
> +
> +# originally taken from talos

either add a link here, or use a moz.build rule to copy this file around.

::: testing/raptor/raptor/raptor.py:163
(Diff revision 7)
>  
>      for next_test in raptor_test_list:
>          raptor.run_test(next_test)
>  
>      raptor.process_results()
> +

nit: extra blank line here.

::: testing/raptor/raptor/results.py:24
(Diff revision 7)
> +    def __init__(self):
> +        self.results = []
> +
> +    def add(self, new_result_json):
> +        # add to results
> +        LOG.info("received results")

add more context here, something like "Received results in RaptorResultsHandler.add"

::: testing/raptor/raptor/results.py:52
(Diff revision 7)
> +        self.measurements = test_result_json['measurements']
> +        self.lower_is_better = test_result_json['lower_is_better']
> +        self.alert_threshold = test_result_json['alert_threshold']
> +        self.unit = test_result_json['unit']
> +        # legacy required for perfherder
> +        self.extra_options = ["e10s", "stylo"]

we might want to consider avoiding these since we are e10s/stylo by default.  If we do that, we run in parallel for a week (or two) and then turn off the talos jobs.

Also, why are we copying so many items from test_result_json here? Can we not do self.test_result = test_result_json?

What else is in test_result_json that we don't need in self.*?  Where are we using self.*?
Attachment #8972914 - Flags: review+
Attachment #8972914 - Flags: feedback?(jmaher)
Comment on attachment 8972914 [details]
Bug 1449195 - Process raptor test results and format for output;

https://reviewboard.mozilla.org/r/241468/#review249352

::: testing/raptor/raptor/manifest.py:60
(Diff revision 7)
> +    if test_details.get("unit", None) is not None:
> +        test_settings['raptor-options']['unit'] = test_details['unit']
> +    if test_details.get("lower_is_better", None) is not None:
> +        test_settings['raptor-options']['lower_is_better'] = bool(test_details['lower_is_better'])
> +    if test_details.get("alert_threshold", None) is not None:
> +        test_settings['raptor-options']['alert_threshold'] = float(test_details['alert_threshold'])

Good point thanks!

::: testing/raptor/raptor/results.py:52
(Diff revision 7)
> +        self.measurements = test_result_json['measurements']
> +        self.lower_is_better = test_result_json['lower_is_better']
> +        self.alert_threshold = test_result_json['alert_threshold']
> +        self.unit = test_result_json['unit']
> +        # legacy required for perfherder
> +        self.extra_options = ["e10s", "stylo"]

Ah yes, thanks - it converts the json/dict received from the control server into an actual RaptorTestResult object to make it easier to handle/summarize, etc.

I updated the code to create the object's attributes directly from the dict.
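Building the result object's attributes from the dict, as described, can be done in one loop; a minimal sketch (not the exact landed code):

```python
class RaptorTestResult:
    # Convert the json/dict received from the control server into an
    # object, so each field is available as an attribute.
    def __init__(self, test_result_json):
        for key, value in test_result_json.items():
            setattr(self, key, value)
```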
Comment on attachment 8972914 [details]
Bug 1449195 - Process raptor test results and format for output;

https://reviewboard.mozilla.org/r/241468/#review249590


Code analysis found 1 defect in this patch:
 - 1 defect found by mozlint

You can run this analysis locally with:
 - `./mach lint path/to/file` (JS/Python)


If you see a problem in this automated review, please report it here: http://bit.ly/2y9N9Vx


::: testing/raptor/raptor/manifest.py:56
(Diff revision 8)
>          if "hero" in test_details['measure']:
>              test_settings['raptor-options']['measure']['hero'] = test_details['hero'].split()
>      if test_details.get("page_timeout", None) is not None:
>          test_settings['raptor-options']['page_timeout'] = int(test_details['page_timeout'])
> +    test_settings['raptor-options']['unit'] = test_details.get("unit", "ms")
> +    test_settings['raptor-options']['lower_is_better'] = bool(test_details.get("lower_is_better", True))

Error: Line too long (104 > 99 characters) [flake8: E501]
Attachment #8972914 - Flags: review?(jmaher)
Comment on attachment 8972914 [details]
Bug 1449195 - Process raptor test results and format for output;

https://reviewboard.mozilla.org/r/241468/#review249662


Code analysis found 2 defects in this patch:
 - 2 defects found by mozlint

You can run this analysis locally with:
 - `./mach lint path/to/file` (JS/Python)


If you see a problem in this automated review, please report it here: http://bit.ly/2y9N9Vx


::: testing/raptor/raptor/raptor.py:183
(Diff revision 10)
>  
> +    if not succes:
> +        # didn't get test results; test timed out or crashed, etc. we want job to fail
> +        os.sys.exit(1)
>  
>  if __name__ == "__main__":

Error: Expected 2 blank lines after class or function definition, found 1 [flake8: E305]

::: testing/raptor/raptor/results.py:36
(Diff revision 10)
> +        LOG.info("summarizing raptor test results")
> +        output = Output(self.results)
> +        output.summarize()
> +        return output.output()
> +
> +class RaptorTestResult():

Error: Expected 2 blank lines, found 1 [flake8: E302]
Comment on attachment 8972914 [details]
Bug 1449195 - Process raptor test results and format for output;

https://reviewboard.mozilla.org/r/241468/#review249660

I would prefer to copy output.py and filter.py from talos using moz.build so we don't duplicate code.  The custom benchmark summarization will be 100% in raptor when we move the benchmarks there, so talos will not need it; then I could see a great reduction in duplicated code.

Please fix this or file a followup bug.
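A moz.build rule along the lines the reviewer suggests might look like the following fragment. This is only a sketch: the variable (TEST_HARNESS_FILES) and the destination/source paths are assumptions that would need verifying against the build-system documentation, and moz.build files are evaluated in a restricted sandbox rather than run as ordinary Python:

```python
# moz.build fragment (sketch, unverified): install talos' filter.py
# into the raptor harness instead of keeping a duplicate copy.
TEST_HARNESS_FILES.raptor += [
    '/testing/talos/talos/filter.py',
]
```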

::: testing/raptor/raptor/manifest.py:59
(Diff revisions 7 - 10)
> -    if test_details.get("unit", None) is not None:
> -        test_settings['raptor-options']['unit'] = test_details['unit']
> -    if test_details.get("lower_is_better", None) is not None:
> +    test_settings['raptor-options']['unit'] = test_details.get("unit", "ms")
> +    test_settings['raptor-options']['lower_is_better'] = \
> +        bool(test_details.get("lower_is_better", True))
> -        test_settings['raptor-options']['lower_is_better'] = bool(test_details['lower_is_better'])
>      if test_details.get("alert_threshold", None) is not None:
>          test_settings['raptor-options']['alert_threshold'] = float(test_details['alert_threshold'])

I noticed you still check for values here - do we not require alert_threshold for PERFHERDER_DATA?  Same for page_timeout a few lines up?

::: testing/raptor/raptor/raptor.py:176
(Diff revisions 7 - 10)
>      raptor.start_control_server()
>  
>      for next_test in raptor_test_list:
>          raptor.run_test(next_test)
>  
> -    raptor.process_results()
> +    succes = raptor.process_results()

why is this not success (the extra 's' is how it is normally spelled)

::: testing/raptor/raptor/raptor.py:181
(Diff revisions 7 - 10)
> -
>      raptor.clean_up()
>  
> +    if not succes:
> +        # didn't get test results; test timed out or crashed, etc. we want job to fail
> +        os.sys.exit(1)

we should really print a failure here

::: testing/raptor/webext/raptor/runner.js:222
(Diff revisions 7 - 10)
>  function setTimeoutAlarm(timeoutName, timeoutMS) {
> -  var timeout_when = window.performance.now() + timeoutMS;
> +  // webext alarms require date.now NOT performance.now
> +  var now = Date.now(); // eslint-disable-line mozilla/avoid-Date-timing
> +  var timeout_when = now + timeoutMS;
>    ext.alarms.create(timeoutName, { when: timeout_when });
> -  console.log("set " + timeoutName);
> +  console.log("now is " + now + ", set " + timeoutName + " to expire at " + timeout_when);

can we indicate in the log message that this is a timeout alarm from raptor?
Attachment #8972914 - Flags: review?(jmaher) → review+
Comment on attachment 8972914 [details]
Bug 1449195 - Process raptor test results and format for output;

https://reviewboard.mozilla.org/r/241468/#review249660

Good idea, thanks - filed Bug 1461424. output.py is mostly custom raptor code except for the benchmarks stuff.

> I noticed you still check for values here- do we not require alert_threshold for perfherder_data?  Same for a few lines up on page_timeout?

Right, alert_threshold is required in the test INI; I'll add code to error out if it's not there. page_timeout defaults to 5000 ms if it's not specified in the test INI.

> we shoul really print a failure here

Ah yes thanks
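That fix might look like the following sketch (the function name, log format, and failure message are illustrative assumptions, not the landed code):

```python
import sys

def exit_with_result(success, log=print):
    # Per review feedback: emit an explicit failure line before exiting
    # non-zero, so the reason for the red job is visible in the log.
    if not success:
        log("TEST-UNEXPECTED-FAIL: no raptor test results were found")
        sys.exit(1)
```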

> we we indicate in the log message this is a timeout alarm from raptor?

Yes, the timeoutName is 'raptor_page_timeout', but I'll add more context, thanks
Pushed by rwood@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a9c20695f962
Process raptor test results and format for output; r=jmaher
https://hg.mozilla.org/mozilla-central/rev/a9c20695f962
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla62
Version: Version 3 → Trunk