Closed Bug 755527 — Opened 12 years ago, Closed 12 years ago

results processing should be more coherent end-to-end

Categories: Testing :: Talos, defect
Tracking: (Not tracked)
Status: RESOLVED FIXED
People: (Reporter: k0scist, Unassigned)
Attachments: (1 file, 9 obsolete files)
  86.34 KB, patch | jmaher: review+ | Details | Diff | Splinter Review
Our results chain now is somewhat Kafka-esque:

- ttest.py gets a reduced results string from BrowserLogResults: http://hg.mozilla.org/build/talos/file/d00eba6055f5/talos/ttest.py#l369
- various pieces from this are returned to run_tests.py: http://hg.mozilla.org/build/talos/file/d00eba6055f5/talos/run_tests.py#l561
- we collect these pieces and send them as a dict, amongst other things[*], to send_to_graph: http://hg.mozilla.org/build/talos/file/d00eba6055f5/talos/run_tests.py#l585

send_to_graph does a lot of things. It:

- formats and filters the results as appropriate to ts, amo, or tp type tests: http://hg.mozilla.org/build/talos/file/d00eba6055f5/talos/run_tests.py#l174
- generates raw results, which is a hand-constructed JSON string (see also https://bugzilla.mozilla.org/show_bug.cgi?id=744405), e.g. http://hg.mozilla.org/build/talos/file/d00eba6055f5/talos/run_tests.py#l183
- uploads the raw results to a hard-coded IP: http://hg.mozilla.org/build/talos/file/d00eba6055f5/talos/run_tests.py#l246
- uploads the formatted results to the old graphserver and harvests links: http://hg.mozilla.org/build/talos/file/d00eba6055f5/talos/run_tests.py#l253
- we do some other things too, e.g.: http://hg.mozilla.org/build/talos/file/d00eba6055f5/talos/run_tests.py#l586

I would prefer to move more results logic to results.py and the classes that handle it. Starting with the (questionable) assumption that you want results to be constructed in ttest.py:

- what is now BrowserLogResults is made into a factory to create PageloaderResults or a new class, e.g. TsResults. These are accumulated in an array in ttest.py and returned as currently
- (a new class may also be needed for AMOResults)
- (we will also need to do something about counter results, perhaps beyond the scope of this bug)
- PageloaderResults and TsResults (and AMOResults) will have methods to:
  * generate raw results (``def raw_results``)
  * generate results in graphserver format (``def graphserver_results``)
- send_to_graph will then just loop over the results list, call result.graphserver_results(), and post that to the graphserver
- likewise for raw results

This should make it easier to add multiple results URLs, to test results, and otherwise make results more extensible. As a huge bonus, but probably beyond the scope of the bug here, it would be nice to make results output pluggable. Doing these refactors certainly won't hurt this.

See also:
- bug 745907
- bug 744403

----
[*] browser_config is a beast in its own right and is the subject of bug 755374
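A minimal sketch of the factory idea above. Only the PageloaderResults and TsResults names come from the proposal (and the factory is shown as a plain function rather than the real BrowserLogResults class); the log format, the dispatch rule, and the method bodies are invented for illustration:

```python
# Hypothetical sketch of the proposed factory arrangement; not the
# actual Talos implementation.
import json


class PageloaderResults:
    def __init__(self, values):
        self.values = values

    def graphserver_results(self):
        """results in a (line-oriented) graphserver-style format"""
        return '\n'.join('%d,%s' % (i, v) for i, v in enumerate(self.values))

    def raw_results(self):
        """results as a raw (datazilla-style) JSON string"""
        return json.dumps({'values': self.values})


class TsResults(PageloaderResults):
    pass


def results_from_browser_log(log_line):
    """factory: choose a results class based on the log contents"""
    name, _, data = log_line.partition(';')   # assumed log shape
    values = [float(x) for x in data.split(',')]
    if name.startswith('ts'):                 # assumed dispatch rule
        return TsResults(values)
    return PageloaderResults(values)
```

With this shape, send_to_graph only has to loop over a list of such objects and post whatever graphserver_results()/raw_results() return.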
Reporter | Comment 1•12 years ago
:jmaher, I'd love to know your thoughts on this. I'm being somewhat hand-wavy here; long as the description is, it is mostly a napkin sketch.
Comment 2•12 years ago
Love it.

I would like to see, when we finish a test, that we call an output parser and it generates a results object. There are a couple of scenarios here:

1) running >1 test
2) aux data (shutdown, counters, responsiveness, xperf, etc...)

If you think about the above two scenarios, the results object should be related to the test. Maybe it should be test.results = results(). Ok, that would be too simple.

Anyway, when running multiple tests, we have an array of test objects; these test objects can store their respective results, and send_to_graph can just take a list of test objects. Then we have the test name, options, and the results all in one place.

How do you take data from various sources and create a uniform object? I believe we would add results.addStdout(...), results.addCounterData(...), results.addXperfData(...), results.addResponsivenessData(...). Basically we could add data to the results object, and when we serialize it as raw or graph server data it should know all about it.

The original proposal sounds pretty logical. Hopefully my comment will help shape some of the lower level details.
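The accumulate-then-serialize shape described in this comment might look something like the following sketch. The add* method names come from the comment itself; everything else (fields, serialization format) is invented for illustration:

```python
# Sketch only: method names follow the comment; internals are guesses.
import json


class Results:
    """uniform results object that data from various sources is added to"""

    def __init__(self):
        self.stdout = []
        self.counters = {}

    def addStdout(self, line):
        self.stdout.append(line)

    def addCounterData(self, name, value):
        self.counters.setdefault(name, []).append(value)

    def raw(self):
        # when serialized, the object "knows all about" what was added
        return json.dumps({'stdout': self.stdout, 'counters': self.counters})


class Test:
    """test objects store their respective results: test.results = Results()"""

    def __init__(self, name):
        self.name = name
        self.results = Results()


def send_to_graph(tests):
    # takes a list of test objects; name and results are in one place
    return [(test.name, test.results.raw()) for test in tests]
```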
Reporter | Comment 3•12 years ago
This is also somewhat addressed in https://bugzilla.mozilla.org/show_bug.cgi?id=744409
See Also: → 744409
Reporter | Comment 5•12 years ago
(In reply to Joel Maher (:jmaher) from comment #2)
> love it.
>
> I would like to see when we finish a test that we call an output parser and
> it generates a results object. There are a couple scenarios here:
> 1) running >1 test

So it may make sense to have a Results class which contains TestResults objects, the latter being tied to an individual test. I think diving in is the only way to really figure this out, but this is my inclination so far. Otherwise, if it is not cumbersome, we can have a list of results objects, but if there is a bunch we need to do across objects then a container object makes more sense.

> 2) aux data (shutdown, counters, responsiveness, xperf, etc...)

So, IIRC, counters and xperf (not sure about responsiveness) are not test-specific and are, instead, time-specific (though maybe I'm wrong here? or there is a little of both?). If this is the case, a master Results container class probably makes sense.

> If you think about the above two scenarios, the results object should be
> related to the test. Maybe it should be test.results = results(). Ok, that
> would be too simple.

In both scenarios, the e.g. PageloaderResults class should have the test dictionary available.

> anyway, when running multiple tests, we have an array of test objects, these
> test objects can store their respective results and send_to_graph can just
> take a list of test objects. Then we have the test name, options and the
> results all in one place.
>
> How do you take data from various sources and create a uniform object? I
> believe we would add results.addStdout(...), results.addCounterData(...),
> results.addXperfData(...), results.addResponsivenessData(...). Basically we
> could add data to the results object and when we serialize it as raw or
> graph server data it should know all about it.

I'm not sure what you mean by .addStdout()... if you mean the browser log, my plan is something like instantiating an individual (e.g.) PageloaderResults from parsing the browser log, passing the parsed pieces to PageloaderResults.__init__, which will then make the instance. The other methods definitely sound like something we will need.

We will also need (for now) two output methods:

def graphserver_output():
    """returns a string of the PageloaderResults in graphserver format"""

def raw_output():
    """return a string of the PageloaderResults in raw (datazilla) format"""

(I'm using PageloaderResults as a shim example since this class already exists. Obviously TsResults, AMOResults, whatever will have these methods too.)

> The original proposal sounds pretty logical. Hopefully my comment will help
> shape some of the lower level details.

Coolz. I can't look at this this week or next, but it will be the next large straw I'm going to chew on unless something comes up.
Comment 6•12 years ago
Counters and such are not strictly time-based: pageloader RSS collection is done per page, not over a generic time interval, and likewise with the xperf information. The addStdout was the PageloaderResults stuff.
Reporter | Comment 8•12 years ago
This has taken much longer than I had hoped and it's still not finished. In short:

- I've moved out most (hopefully more by the time of review) of the results logic to results.py
- made --results_url and --raw_results_url lists instead of single items
- added --cycles so that I can test without going insane
- killed a bunch of silliness
Reporter | Comment 9•12 years ago
I've also noticed a bug in how we accumulate ts results:

http://hg.mozilla.org/build/talos/file/170c100911b6/talos/run_tests.py#l185

We overwrite raw_values per browser dump (of which, for ts, there are usually 20). Instead, we should append and deindent here. I will fix this.
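The shape of the bug, reduced to a toy example (this is not the run_tests.py code itself, just an illustration of overwrite vs. append):

```python
# Toy illustration: with ~20 browser dumps per ts run, assignment
# inside the loop keeps only the last dump's values.
dumps = [[10.0, 12.0], [11.0, 13.0], [9.0, 12.0]]  # stand-in browser dumps

# buggy shape: raw_values is overwritten on each iteration
raw_values = []
for dump in dumps:
    raw_values = dump
assert raw_values == [9.0, 12.0]  # only the last dump survived

# fixed shape: accumulate instead (and deindent any post-processing
# so it runs once, after the loop)
raw_values = []
for dump in dumps:
    raw_values.extend(dump)
```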
Reporter | Comment 10•12 years ago
(In reply to Jeff Hammel [:jhammel] from comment #9)
> I've also noticed a bug in how we accumulate ts results:
>
> http://hg.mozilla.org/build/talos/file/170c100911b6/talos/run_tests.py#l185
>
> we overwrite raw_values per browser dump (which, for ts, there are usually
> 20 of them). Instead, we should append and deindent here. I will fix this

This also would affect tp (see further in the file), though since we run tp for only one cycle we manage to squeak by.
Reporter | Comment 11•12 years ago
This patch restructures how results reporting is done. It should be backwards compatible. The new architecture is as follows:

- a TestResult object is instantiated per test
- for each cycle, BrowserLogResults reads the browser log and returns a TsResults or PageloaderResults, which is added to the TestResult object
- each TestResult is added to the TalosResults object at the end of the test
- calling TalosResults.output() with appropriate results URLs (and raw results URLs) gathers the appropriate output from each TestResult and sends it to the corresponding URL
- (I also added --cycle for ease of testing)
- (this architecture also made it easy to implement multiple --results_urls and --raw_results_urls as talked about in other bugs)

In short, it is far from perfect, but I think it is a good step forward.
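In outline, the flow described above might look like the following sketch. The class and method names follow the comment; the bodies are placeholders, not the patch itself:

```python
# Sketch of the described architecture; `post` stands in for the real upload.

class TestResult:
    """one per test; accumulates the per-cycle results objects"""

    def __init__(self, name):
        self.name = name
        self.results = []

    def add(self, cycle_results):
        # a TsResults or PageloaderResults instance per cycle
        self.results.append(cycle_results)


class TalosResults:
    """container for all TestResults in a run"""

    def __init__(self):
        self.tests = []

    def add(self, test_result):
        self.tests.append(test_result)

    def output(self, results_urls, post):
        """gather output from each TestResult and send it to each URL"""
        for test in self.tests:
            for url in results_urls:
                post(url, (test.name, test.results))
```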
Attachment #631941 -
Attachment is obsolete: true
Attachment #632455 -
Flags: review?(jmaher)
Reporter | Comment 12•12 years ago
Hadn't realized I had forgotten to save before uploading.
Attachment #632455 -
Attachment is obsolete: true
Attachment #632455 -
Flags: review?(jmaher)
Attachment #632463 -
Flags: review?(jmaher)
Reporter | Comment 13•12 years ago
fixed some variable names and extended __all__
Attachment #632463 -
Attachment is obsolete: true
Attachment #632463 -
Flags: review?(jmaher)
Attachment #632506 -
Flags: review?(jmaher)
Reporter | Comment 14•12 years ago
Note that the way raw_results_urls works for file:// URLs, the file will be overwritten if you have more than one activeTest. This is a bug, but IMHO not one worth delaying the patch for, as the feature is still useful in the case where you have only one test. Possible ways to fix this:

- combine the data for multiple tests into one JSON string. This is not what datazilla expects currently, and I'm guessing there are ramifications (as in, the bad kind) to this. This is also, IMHO, better taken after bug 744405, as trying to hand-serialize JSON is an exercise fraught with errors
- provide a way of serializing output with a different filename (or, conceivably, URL) per test. This will require a bit of thought. For instance, if you had --output-directory, this could be the equivalent, for -a ts:tsvg, of writing:

  ts.txt, tsvg.txt    # graphserver output
  ts.json, tsvg.json  # datazilla output

A way I don't want to fix this (but will if really needed for the short term):

- do what --results_url does and just append to the file instead of overwriting. The reason I don't want to do this is that the resulting structure will not be JSON. We could take this as an interim fix, but IMO it is replacing one bug with another
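For the --output-directory idea, the naming rule could be as simple as this sketch. The flag and the helper are hypothetical; nothing here exists in the patch:

```python
# Hypothetical helper: derive one output file pair per active test so
# file:// results are not overwritten when more than one test runs.
import os


def per_test_output_paths(output_directory, test_names):
    """map each test to its graphserver (.txt) and datazilla (.json) file"""
    return {name: (os.path.join(output_directory, name + '.txt'),
                   os.path.join(output_directory, name + '.json'))
            for name in test_names}
```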
Reporter | Comment 15•12 years ago
Fixed a few more bugs and fixed the test. I removed the test for per-test filters as it's pretty hacky and was mostly for my own verification that things worked. Longer term, we should reintroduce the test when there is less stubbing.
Attachment #632506 -
Attachment is obsolete: true
Attachment #632506 -
Flags: review?(jmaher)
Attachment #632770 -
Flags: review?(jmaher)
Reporter | Comment 16•12 years ago
Pushed to try: https://tbpl.mozilla.org/?tree=Try&rev=724f7c61d184

Getting some graphserver unreachable errors... not really sure what is going on there, but will look as best I can from this vantage point.
Reporter | Comment 17•12 years ago
Also a few of these:

RETURN: cycle time: 00:13:50<br>
talos-r3-leopard-030: Stopped Wed, 13 Jun 2012 14:09:27
Generating results file: tp5r: Started Wed, 13 Jun 2012 14:09:27
Generating results file: tp5r: Stopped Wed, 13 Jun 2012 14:09:27
Generating results file: tp5r_pbytes_paint: Started Wed, 13 Jun 2012 14:09:27
Generating results file: tp5r_pbytes_paint: Stopped Wed, 13 Jun 2012 14:09:27
No results collected for: tp5r_main_rss: Error Wed, 13 Jun 2012 14:09:27
No results collected for: tp5r_content_rss: Error Wed, 13 Jun 2012 14:09:27
Generating results file: tp5r_responsiveness_paint: Started Wed, 13 Jun 2012 14:09:27
Generating results file: tp5r_responsiveness_paint: Stopped Wed, 13 Jun 2012 14:09:27
Generating results file: tp5r_shutdown_paint: Started Wed, 13 Jun 2012 14:09:27
Generating results file: tp5r_shutdown_paint: Stopped Wed, 13 Jun 2012 14:09:27
command timed out: 3600 seconds without output, attempting to kill

I'll add some more diagnostics. Overall, this is somewhat difficult to diagnose from this end.
Reporter | Comment 18•12 years ago
(In reply to Jeff Hammel [:jhammel] from comment #16)
> pushed to try: https://tbpl.mozilla.org/?tree=Try&rev=724f7c61d184
>
> Getting some graphserver unreachable errors....not really sure what is going
> on there, but will look as best i can from this vantage point

The possible interesting bits:

Error in collecting counter: Private Bytes
Error in collecting counter: Main_RSS
Error in collecting counter: Content_RSS
Completed test tp5row: Stopped Wed, 13 Jun 2012 14:37:49
RETURN: cycle time: 00:26:54<br>
talos-r3-leopard-021: Stopped Wed, 13 Jun 2012 14:37:49
Generating results file: tp5row: Started Wed, 13 Jun 2012 14:37:49
Generating results file: tp5row: Stopped Wed, 13 Jun 2012 14:37:49
Generating results file: tp5row_pbytes_paint: Started Wed, 13 Jun 2012 14:37:49
Generating results file: tp5row_pbytes_paint: Stopped Wed, 13 Jun 2012 14:37:49
No results collected for: tp5row_main_rss: Error Wed, 13 Jun 2012 14:37:49
No results collected for: tp5row_content_rss: Error Wed, 13 Jun 2012 14:37:49
Generating results file: tp5row_responsiveness_paint: Started Wed, 13 Jun 2012 14:37:49
Generating results file: tp5row_responsiveness_paint: Stopped Wed, 13 Jun 2012 14:37:49
Generating results file: tp5row_shutdown_paint: Started Wed, 13 Jun 2012 14:37:49
Generating results file: tp5row_shutdown_paint: Stopped Wed, 13 Jun 2012 14:37:49
WARNING: graph server unreachable
WARNING: (54, 'Connection reset by peer')
WARNING: graph server unreachable
WARNING: (61, 'Connection refused')
WARNING: graph server unreachable
WARNING: (54, 'Connection reset by peer')
WARNING: graph server unreachable
WARNING: (61, 'Connection refused')
WARNING: graph server unreachable
WARNING: (61, 'Connection refused')
Failed sending results: Stopped Wed, 13 Jun 2012 15:11:26
Generating results file: tp5row: Started Wed, 13 Jun 2012 15:11:26
Generating results file: tp5row: Stopped Wed, 13 Jun 2012 15:11:26
Generating results file: tp5row_pbytes_paint: Started Wed, 13 Jun 2012 15:11:26
Generating results file: tp5row_pbytes_paint: Stopped Wed, 13 Jun 2012 15:11:26
No results collected for: tp5row_main_rss: Error Wed, 13 Jun 2012 15:11:26
No results collected for: tp5row_content_rss: Error Wed, 13 Jun 2012 15:11:26
Generating results file: tp5row_responsiveness_paint: Started Wed, 13 Jun 2012 15:11:26
Generating results file: tp5row_responsiveness_paint: Stopped Wed, 13 Jun 2012 15:11:26
Generating results file: tp5row_shutdown_paint: Started Wed, 13 Jun 2012 15:11:26
Generating results file: tp5row_shutdown_paint: Stopped Wed, 13 Jun 2012 15:11:26
FAIL: Graph server unreachable (5 attempts)
RETURN:(61, 'Connection refused')
Traceback (most recent call last):
  File "run_tests.py", line 350, in <module>
    main()
  File "run_tests.py", line 347, in main
    run_tests(parser.config)
  File "run_tests.py", line 323, in run_tests
    talos_results.output(results_urls=results_urls, raw_results_urls=raw_results_urls)
  File "/Users/cltbld/talos-slave/talos-data/talos/results.py", line 135, in output
    raise e
utils.talosError: "Graph server unreachable (5 attempts)\n(61, 'Connection refused')"
program finished with exit code 1
elapsedTime=3632.542562
TinderboxPrint:<a href = "http://hg.mozilla.org/try/rev/724f7c61d184">rev:724f7c61d184</a>
TinderboxPrint: cycle time: 00:26:54<br>
TinderboxPrint:FAIL: Graph server unreachable (5 attempts)
TinderboxPrint:(61, 'Connection refused')
Comment 19•12 years ago
Comment on attachment 632770 [details] [diff] [review]
take 3

Review of attachment 632770 [details] [diff] [review]:
-----------------------------------------------------------------

I really got lost in results.py. If you want an accurate review of that, give me a couple of days or meet with me to walk through it. It seems a bit overkill, and I don't understand why there is so much code there just to collect and report results.

::: talos/results.py
@@ +108,5 @@
> + raw_results = test.datazilla_results(date=self.date, remote=self.remote, machine=self.title, browser_config=self.browser_config)
> +
> + for raw_results_url in raw_results_urls:
> +     # post the data
> +     # TODO: json encode

do we need to json encode?

::: talos/ttest.py
@@ +370,5 @@
> + # add the results from the browser output
> + # browser_log_results = results.BrowserLogResults(filename=browser_log_filename, counter_results=counter_results, global_counters=global_counters)
> + # browser_results = browser_log_results.browser_results
> + # format = browser_log_results.format
> + test_results.add(browser_log_filename, counter_results=counter_results)

please remove the commented out code.

@@ +388,1 @@
>

please remove commented out code.

@@ +394,5 @@
> + test_results.all_counter_results.extend([{key: value} for key, value in global_counters.items()])
> +
> + # return results
> + return test_results
> +# return (all_browser_results, all_counter_results, format)

and more commented code to remove.
Attachment #632770 -
Flags: review?(jmaher) → review-
Comment 20•12 years ago
As per a conversation about this, I am fine with results.py as it is, but I would strongly prefer that we split datazilla and graph server output into separate classes.
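A split along those lines might look like this minimal sketch. The idea of separate graphserver and datazilla output classes comes from the comment above; the class names and the formats shown are simplified stand-ins, not the real graphserver or datazilla wire formats:

```python
# Sketch of separate, pluggable output classes; formats are simplified.
import json


class Output:
    """base class: an output knows how to format a set of results"""

    def __init__(self, results):
        self.results = results

    def __call__(self):
        raise NotImplementedError


class GraphserverOutput(Output):
    def __call__(self):
        # graphserver takes a line-oriented text format
        return '\n'.join('%s: %s' % (name, value)
                         for name, value in sorted(self.results.items()))


class DatazillaOutput(Output):
    def __call__(self):
        # datazilla takes JSON
        return json.dumps(self.results)
```

Adding a new destination then means adding a new subclass rather than growing send_to_graph.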
Reporter | Comment 21•12 years ago
This moves all the output related stuff to output.py and makes it more pluggable.
Attachment #632770 -
Attachment is obsolete: true
Attachment #634182 -
Flags: review?(jmaher)
Reporter | Comment 22•12 years ago
Attachment #634182 -
Attachment is obsolete: true
Attachment #634182 -
Flags: review?(jmaher)
Attachment #634228 -
Flags: review?(jmaher)
Reporter | Comment 23•12 years ago
Attachment #634228 -
Attachment is obsolete: true
Attachment #634228 -
Flags: review?(jmaher)
Attachment #634229 -
Flags: review?(jmaher)
Reporter | Comment 24•12 years ago
Attachment #634229 -
Attachment is obsolete: true
Attachment #634229 -
Flags: review?(jmaher)
Attachment #634289 -
Flags: review?(jmaher)
Comment 25•12 years ago
Comment on attachment 634229 [details] [diff] [review]
i didn't actually mean to delete the README

Review of attachment 634229 [details] [diff] [review]:
-----------------------------------------------------------------

since there are minor updates coming, I am going to stop my review and just send these comments along.

::: talos/output.py
@@ +180,5 @@
> + """returns if the test is a responsiveness test"""
> + # XXX currently this just looks for the string
> + # 'responsiveness' in the test name.
> + # It would be nice to be more declarative about this
> + return 'responsiveness' in testname

should this be in the graphserver stuff or in a general place?

@@ +198,5 @@
> + if responsiveness:
> +     _type = 'AVERAGE'
> + elif self.results.amo:
> +     _type = 'AMO'
> +     info_format = self.amo_info_format

can this elif clause go in the amo class below?

@@ +277,5 @@
> + for line in lines:
> +     if not line:
> +         continue
> +     if line.lower() in ('success',):
> +         print 'RETURN:addon results inserted successfully'

can this go in the amoresults class below?

@@ +416,5 @@
> +try:
> +    from amo.amo_api import upload_amo_results, amo_results_data
> +    # depends on httplib2 and json/simplejson so we conditionally import it
> +
> +    class AMOOutput(Output):

can this be a subclass of graphserver so we can just overload the functionality that changes for amo?
Reporter | Comment 26•12 years ago
> > + """returns if the test is a responsiveness test"""
> > + # XXX currently this just looks for the string
> > + # 'responsiveness' in the test name.
> > + # It would be nice to be more declarative about this
> > + return 'responsiveness' in testname
>
> should this be in the graphserver stuff or in a general place?
For this patch, I don't really know. It can go on the generic output class... I put it on the graphserver output class as it is only used there. Really, it's a horrible hack. In the long, long term, I would rather have test objects that actually know whether they are responsiveness tests vs. some introspection of the name (vs. the current test_config dictionary), but that's not important now.
Reporter | Comment 27•12 years ago
(In reply to Joel Maher (:jmaher) from comment #25)
> Comment on attachment 634229 [details] [diff] [review]
<snip/>
> can this go in the amoresults class below?
>
> can this be a subclass of graphserver so we can just overload the
> functionality that changes for amo?

With regard to the amo issues, it is a bit complicated, as when amo is set we do two things currently:

1. we upload our results to whatever graphserver URLs we have (i.e. results_urls), but in a different format: https://wiki.mozilla.org/Buildbot/Talos/DataFormat#AMO
2. we also upload results via simplejson encoding in a different format as given in amo.amo_api (formerly to a hardcoded URL, though I have un-hard-coded it here)

Since 'amo' is a global flag, and since the issues you point to here are in the graphserver code, I'm not sure what is best here if we want to preserve existing functionality. (Responsiveness is similarly handled, fwiw.)

An alternative would be to have a URL class (a new format in output.py), say 'amo_graphserver_url', and a new class AMOGraphserverOutput that inherits from GraphserverOutput, and have config['results_urls'] changed to config['amo_graphserver_urls'] in run_tests.py. This will still leave responsiveness special-cased. If this is clearer, I can do it this way. I also don't mind deprecating amo uploading to graphserver since it's not being used. If in the future we want to upload to datazilla, this is also not covered by existing output semantics.
Reporter | Comment 28•12 years ago
Running on try. Will flag for review on green
Attachment #634289 -
Attachment is obsolete: true
Attachment #634289 -
Flags: review?(jmaher)
Reporter | Comment 29•12 years ago
Comment on attachment 634467 [details] [diff] [review]
more fixes

Pushed to try: https://tbpl.mozilla.org/?tree=Try&rev=a9113137937a

It's green enough that I'm going to put this up for review and hope nothing else falls out.
Attachment #634467 -
Flags: review?(jmaher)
Reporter | Comment 30•12 years ago
This patch also doesn't get autolog results for xperf reporting: http://hg.mozilla.org/build/talos/file/170c100911b6/talos/xtalos/mozautolog.py It would be nice to bring that under this umbrella, but it will be non-trivial
Comment 31•12 years ago
Comment on attachment 634467 [details] [diff] [review]
more fixes

Review of attachment 634467 [details] [diff] [review]:
-----------------------------------------------------------------

as I have sort of reviewed this 3 times before, there is not much to add. It actually is starting to make sense :)

::: talos/output.py
@@ +86,5 @@
> + return bool([i for i in memory_metric if i in resultName])
> +
> + @classmethod
> + def responsiveness_Metric(cls, val_list):
> +     return round(sum([int(x)*int(x) / 1000000.0 for x in val_list]))

someday this should go in the filters :)

::: talos/run_tests.py
@@ +269,5 @@
> +
> + # results links
> + outputs = ['results_urls', 'datazilla_urls']
> + results_urls = dict([(key, config[key]) for key in outputs
> +                      if key in config])

I like this section!
Attachment #634467 -
Flags: review?(jmaher) → review+
Reporter | Comment 32•12 years ago
pushed: http://hg.mozilla.org/build/talos/rev/45b903369dca

> > + def responsiveness_Metric(cls, val_list):
> > +     return round(sum([int(x)*int(x) / 1000000.0 for x in val_list]))
>
> someday this should go in the filters :)

TBH, I'm not even sure what this means o_O
Reporter | Updated•12 years ago
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Reporter | Comment 33•12 years ago
(In reply to Jeff Hammel [:jhammel] from comment #32)
> pushed: http://hg.mozilla.org/build/talos/rev/45b903369dca
>
> > > + def responsiveness_Metric(cls, val_list):
> > > +     return round(sum([int(x)*int(x) / 1000000.0 for x in val_list]))
> >
> > someday this should go in the filters :)

Ticketed: bug 766598

> TBH, i'm not even sure what this means o_O