Closed Bug 728461 Opened 12 years ago Closed 12 years ago

make a package to analyze Talos statistics locally

Categories

(Testing :: Talos, defect)


Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: k0scist, Unassigned)

References

Details

(Whiteboard: [jetpack+talos][SfN])

We want to move all statistical analysis out of talos: bug 721902

However, we will still have the need to analyze things locally,
preferably in a better way than we do it currently.  A python package
should be written to this end that utilizes the same backend that
graphserver will use, bug 728442

A very rudimentary thing is at http://k0s.org/mozilla/hg/jetperf/file/tip/jetperf/compare.py
Depends on: 728442
Whiteboard: [jetpack+talos][SfN]
Blocks: 729205
So we need a name for this.....'compare.py' is woefully bad.

Maybe AnalyzeTalos? Any thoughts, :jmaher?
PerfAnalyzer
TalosFilter

I really really prefer:
AnalyzeThis
For the record I love AnalyzeThis :)
I have a very rudimentary strawman at http://k0s.org/mozilla/hg/AnalyzeThis . All it currently does is apply some filters to talos results files and, if specified, make a graph.

:jmaher (and whoever else is interested), maybe you can give feedback on this and provide suggestions for what more we want to do? Either here or #ateam or in vidyo (maybe the SfN meeting) is fine.

There are a lot of questions to be answered, but it's a starting place.
:jmaher, anyone, any interest/ETA on this?

see also https://bugzilla.mozilla.org/show_bug.cgi?id=729205 ; this and that are interrelated
See Also: → 729205
I spent some time on this today and overall I like it.  I couldn't get the install.py or setup.py to work.  

What I would like to see is:
 * what we use for the graph server is default
 * ability to query a database to get historical data (maybe the public graph server)
 ** when I graph something the historical data is nice as long as my current data point can be referenced.
 ** note: this is a slippery slope as environments are different
 * replace the csv output in talos with this.
 * integrate the filters.py stuff from talos into this.

Basically this is an awesome tool and a great start to getting things done.
I like the overall concept of splitting the statistics out into its own module and you have clearly found an awesome name for a package!  

1.) Does this module need to manage presentation?  

In http://k0s.org/mozilla/hg/AnalyzeThis/file/fa470e931ce0/analyzethis/graph.py there is hard coded html that uses python's string formatting operator to build the html graph.  Is this a requirement of the module?  This type of presentation function might be better placed in a webservice that calls a function in AnalyzeThis to get a graph object.  I might not be understanding the purpose fully.  If for some reason you need this standalone module to build html, consider placing all of the html in a dedicated directory in your repository and using a templating system.  There are a variety of reasons why you don't want to hard code presentation into a standalone server-side module.  

How would I use the results from AnalyzeThis in a D3.js or processing.js visualization? 
What if I want to include a graph on a page with other content? The doctype declaration and body tags included in the html that AnalyzeThis generates would make this tough.

AnalyzeThis could return either a data serialization format (json) or a python datastructure: let the caller request the return format and also let the caller manage the final presentation format.  The webservice would be a caller and could serve json or html, depending on the client's needs, so we're not locked into one form of presentation.  For a client that just wants a graph and does not want to mess with javascript we could have some specialized html templates that the webservice returns.  We could have a variety of pre-built html templates, so the final destination client has a variety of options for out-of-the-box functionality.

2.) I think AnalyzeThis could interact with the database through the Model.py module I'm working on to retrieve data.  It could also have some specialized model methods that it uses in Model.py to store statistics results where appropriate.  Maybe we could discuss in detail in the work week.  It could also just pull data through the webservice itself; this might be more flexible because you would not have machine-specific database access requirements to deal with.
(In reply to Joel Maher (:jmaher) from comment #6)
> I spent some time on this today and overall I like it.  I couldn't get the
> install.py or setup.py to work.  
> 
> What I would like to see is:
>  * what we use for the graph server is default
>  * ability to query a database to get historical data (maybe the public
> graph server)
>  ** when I graph something the historical data is nice as long as my current
> data point can be referenced.
>  ** note: this is a slippery slope as environments are different
>  * replace the csv output in talos with this.
>  * integrate the filters.py stuff from talos into this.
> 
> Basically this is an awesome tool and a great start to getting things done.

So I tackled this package even though bug 728442 is a blocking requirement: clearly we need some separation here, but I'm not sure what it is.  We kinda need to get filters out of Talos before any of this is finalized.  OTOH, I need this for jetperf kinda nowish in some form, so I'm doing the usual dance of iterating towards a solution whose requirements can't be satisfied in the logical order ;) 

Disclaimer aside, I'm not sure how the specifics would work:

>  * what we use for the graph server is default

IMHO this should go in the package that both graphserver and AnalyzeThis share

>  * ability to query a database to get historical data (maybe the public
> graph server)

Likewise

>  ** when I graph something the historical data is nice as long as my current
> data point can be referenced.
>  ** note: this is a slippery slope as environments are different

I have no idea what to do with these two bullet points.

>  * replace the csv output in talos with this.

I'm kinda thinking csv should just go away. More generally, there should be an interface that takes "whatever we're calling results" that can serialize to a format.  Our current weird format could be one of these, or json, or csv, etc.  You could also have these (optionally) implement a loader for deserialization
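
A minimal sketch of what such a serializer interface might look like. Everything here is an assumption for illustration: the class names, the `SERIALIZERS` registry, and the shape of "results" (a dict mapping a test name to a list of numbers) are hypothetical and do not come from talos or AnalyzeThis.

```python
import csv
import io
import json


class Serializer:
    """Hypothetical interface: results -> text, and optionally text -> results."""

    def dumps(self, results):
        raise NotImplementedError

    def loads(self, text):
        # The loader is optional; a format that cannot deserialize leaves this as is.
        raise NotImplementedError


class JSONSerializer(Serializer):
    def dumps(self, results):
        return json.dumps(results, sort_keys=True)

    def loads(self, text):
        return json.loads(text)


class CSVSerializer(Serializer):
    """One row per test: the test name followed by its raw values."""

    def dumps(self, results):
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        for name in sorted(results):
            writer.writerow([name] + list(results[name]))
        return buffer.getvalue()

    def loads(self, text):
        rows = csv.reader(io.StringIO(text))
        return {row[0]: [float(value) for value in row[1:]] for row in rows if row}


# Hypothetical registry so a caller can pick a format by name.
SERIALIZERS = {"json": JSONSerializer(), "csv": CSVSerializer()}

# Assumed results shape: test name -> list of measurements.
results = {"ts": [512.0, 498.0, 505.0], "tp5": [310.5, 305.0]}
for name, serializer in SERIALIZERS.items():
    # Each format that implements a loader should round-trip the results.
    assert serializer.loads(serializer.dumps(results)) == results
```

The current talos output format could then be one more `Serializer` subclass alongside these, rather than something special-cased.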

>  * integrate the filters.py stuff from talos into this.

filters.py should get moved out of talos and into the upstream package that both graphserver and AnalyzeThis use
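
For illustration, a rough sketch of the kind of filter chain a shared package might expose. The function names and the `apply_filters` helper are hypothetical and are not claimed to match what talos's filters.py actually provides.

```python
from statistics import median


def ignore_first(values, count=1):
    """Drop the first `count` values (for example, warm-up runs)."""
    return values[count:]


def ignore_max(values):
    """Drop one occurrence of the largest value."""
    if not values:
        return values
    trimmed = list(values)
    trimmed.remove(max(trimmed))
    return trimmed


def apply_filters(values, filters):
    """Apply each filter in order, feeding its output to the next one."""
    for filter_function in filters:
        values = filter_function(values)
    return values


# Made-up sample data, only to show the chain running end to end.
raw = [620.0, 505.0, 498.0, 512.0, 590.0]
summary = apply_filters(raw, [ignore_first, ignore_max, median])
print(summary)
```

Because each filter is a plain function over a list of numbers, graphserver and AnalyzeThis could import the same functions and be sure they compute the same summary.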
(In reply to Jonathan Eads ( :jeads ) from comment #7)
> I like the overall concept of splitting the statistics out into its own
> module and you have clearly found an awesome name for a package!  
> 
> 1.) Does this module need to manage presentation?  

Comments inline, but yes, the point of this package is presentation.  Calculation should live upstream and to the extent that it doesn't it should.  As said, this is a WIP and the ultimate goal is to have another package (see bug 728442 and bug 729205) that has the shared "stuff" that both AnalyzeThis and graphserver need.  It's a bit of a juggling act at the moment, and I'll work on that upstream package shortly.

> In
> http://k0s.org/mozilla/hg/AnalyzeThis/file/fa470e931ce0/analyzethis/graph.py
> there is hard coded html that uses python's string formatting operator to
> build the html graph.  Is this a requirement of the module?  

It is a loose requirement to be able to make a graph.  Making an HTML graph in flot seemed the way to do so with the least dependencies and most functionality.  It also might be nice for some of this to live upstream.  So if you're running talos locally you get some results out.  What to do? IMHO, it is much easier to make a local graph that you can portably send, etc, though there are some options here.  Ideally, it would require no server calls, etc.

> This type of
> presentation function might be better placed in a webservice that calls a
> function in AnalyzeThis to get a graph object.  I might not be understanding
> the purpose fully.  If for some reason you need this standalone module to
> build html, consider placing all of the html in a dedicated directory in
> your repository and using a templating system.  There are a variety of
> reasons why you don't want to hard code presentation into a standalone
> server-side module.  

This isn't a server-side module; see above.  Probably the best thing to do is to download the js I need, keep it locally, and render it directly into <script> tags, since one of my goals is a graph.html file you can send around as a single file without touching the net (the current output does not manage that yet, since I was lazy).
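
A hedged sketch of that single-file approach, using only the standard library. The `render_graph` function, the page layout, and the placeholder script sources are assumptions for illustration, not what graph.py currently does. It assumes the caller has already read the jQuery and flot sources from local files and passes them in as strings, and that those sources contain no literal closing script tag.

```python
import html
import json
from string import Template

# "$$" is string.Template's escape for a literal "$", needed for the jQuery calls.
PAGE = Template("""<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>$title</title>
<style>#graph { width: 600px; height: 300px; }</style>
$scripts
</head>
<body>
<div id="graph"></div>
<script>
var series = $series;
$$.plot($$("#graph"), series);
</script>
</body>
</html>
""")


def render_graph(title, series, script_sources):
    """Return one self-contained HTML page with all JavaScript inlined."""
    inlined = "\n".join(
        "<script>\n%s\n</script>" % source for source in script_sources
    )
    # Keep a "</" inside the data from closing the script element early.
    data = json.dumps(series).replace("</", "<\\/")
    return PAGE.substitute(title=html.escape(title), scripts=inlined, series=data)


# Placeholder strings stand in for the locally stored jQuery and flot sources.
page = render_graph(
    "ts",
    [{"label": "ts", "data": [[0, 512.0], [1, 498.0]]}],
    ["/* jquery source goes here */", "/* flot source goes here */"],
)
```

Writing `page` to graph.html would then give a file that needs no network access to display, at the cost of a larger file.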

> How would I use the results from AnalyzeThis in a D3.js or processing.js
> visualization? 

You wouldn't, see above

> What if I want to include a graph on a page with other content, the doctype
> declaration and body tags included in the html that AnalyzeThis generates
> would make this tough.  

I'm certainly open for fixes there.  I would like a pretty minimal graph.html output that is immediately viewable.  I'm not opposed to templates at all (I usually use tempita for them in python these days mostly since they seem to be least objectionable to all involved) but wanted to avoid a dependency for round #1. The point of the graph isn't to be included OOTB elsewhere -- it would be nice if as much as possible were upstreamed to something both graphserver and AnalyzeThis could use -- but to be directly viewable by users.

> AnalyzeThis could return either a data serialization format (json) or a
> python datastructure, let the caller request the data structure return
> format and also let the caller manage a final presentation format.  

I'm not opposed to outputting JSON too.

> The
> webservice would be a caller and could serve json or html, depending on the
> client's needs, then we're not locked into one form of presentation.  For a
> client that just wants a graph and does not want to mess with javascript we
> could have some specialized html templates that the webservice returns.  We
> could have a variety of pre-built html templates, so the final destination
> client has a variety of options for out-of-the-box functionality.


> 2.) I think AnalyzeThis could interact with the database through the
> Model.py module I'm working on to retrieve data.  It could also have some
> specialized model methods that it uses in Model.py to store statistics
> results where appropriate.  Maybe we could discuss in detail in the work
> week.  It could also just pull data through the webservice itself, this
> might be more flexible because you will not have machine specific access
> requirements to deal with from the database.

So I think we need some serious thought about what goes in 

* datazilla
* AnalyzeThis
* (some package that they both depend on)

I think we all agree that Talos should do as little statistics as possible (like none, preferably).

I would like as much as possible of shared code to live in a package that both AnalyzeThis and graphserver would use.  This will definitely include

* statistics

and could also include

* RESTful API stuff such that AnalyzeThis could fetch data (or even upload data) from graphserver
* graph generation stuff

I think the closer we get the more things will converge into something sensible.

Per context of "where is this bug now?" I need some of this functionality for jetperf, which is why I started on this before the blocking mission of getting statistics out of Talos (which is also partially blocked by bug 742824).
> This isn't a server-side module;  see above.

Ahhh, clearly I did not fully understand the requirements.  Thanks for the clarification!

One option to consider for local graph functionality would be to write a javascript app that uses the HTML5 File API to load json that AnalyzeThis exports locally.  The UI could then be served, giving you more flexibility in providing UI analysis support, like loading json from multiple tests or pulling in historical data to include in the comparison.  You could customize this to a developer's needs and not have to deal with getting everyone to update AnalyzeThis when you make improvements.  If the module serves json, it would also be straightforward to use it in datazilla.
It was decided in other media that we don't have enough of a picture to know what we want here right now.  I'm going to close this until we have a better idea, at which point we can probably file more actionable bugs.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WORKSFORME