Closed Bug 909735 (Ouija) Opened 11 years ago Closed 6 years ago

Develop Ouija, a utility for failure rate analysis

Categories

(Testing :: General, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: dminor, Assigned: gbrown)


Some initial work has put together Ouija (currently located at http://54.215.155.53/), a utility for failure rate analysis. To start with, we're interested in tracking failures by product, by test type, and by individual test machine.

The repository is located at: https://github.com/dminor/ouija
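To make the kind of breakdown described above concrete, here is a minimal sketch of grouping job results by product, test type, or machine and computing a failure rate per group. The record fields (`product`, `testtype`, `slave`, `result`) and the sample values are assumptions for illustration only and are not taken from the Ouija code.

```python
from collections import defaultdict


def failure_rates(jobs, key):
    """Return {group: failures / total} for jobs grouped by `key`."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for job in jobs:
        group = job[key]
        totals[group] += 1
        # Assumption: anything other than "success" counts as a failure.
        if job["result"] != "success":
            failures[group] += 1
    return {group: failures[group] / totals[group] for group in totals}


# Hypothetical sample records; the field names are illustrative only.
jobs = [
    {"product": "android", "testtype": "mochitest-1", "slave": "tegra-001", "result": "success"},
    {"product": "android", "testtype": "mochitest-1", "slave": "tegra-002", "result": "testfailed"},
    {"product": "android", "testtype": "reftest-1", "slave": "tegra-001", "result": "success"},
    {"product": "android", "testtype": "reftest-1", "slave": "tegra-002", "result": "success"},
]

by_testtype = failure_rates(jobs, "testtype")
by_machine = failure_rates(jobs, "slave")
```

The same function serves all three views, since only the grouping key changes.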
Depends on: 909739
Depends on: 909798
Depends on: 909799
Depends on: 909831
The script at https://github.com/dminor/ouija/blob/master/src/updatedb.py hasn't by any chance been running heavily against production TBPL the last few weeks? Just wondering if it was the cause of bug 897903, which has been causing issues with sheriffs using TBPL... :-s
Ed, I've been running it once a week to get data for my Wednesday testing meeting, typically just to grab logs from mozilla-central. If all of the problems occur around 10 AM EST, then you can blame me.
Ah once a week is ok - I didn't know if we were running it on a 5 min cron or something :-)
More seriously, do you foresee problems if I were to set a cron job to run daily to grab the previous day's data? Right now I'm running weekly to grab the previous week's data, so it shouldn't be that much worse.
No that should be fine :-)
Would there be a concern with running this update script more frequently? Say, every 4 hours? Once a day is adequate; every 5 minutes would be awesome, but overkill.
Roughly how many calls to getRevisionBuilds.php does it do on each run?
Also, is there a reason we can't host this somewhere and give it direct access to the TBPL DB, to save having to poll and store somewhere else?
Since we're calculating summary data for around a week's worth of commits, I think it ends up being less of a burden on tbpl to poll and store elsewhere rather than fetching the required data every time we generate a page.

I could be wrong about that, considering we are polling data for every platform, but only presenting data for android2.2 and android4.0 at the moment.

jmaher may have other ideas, but I was thinking we would just drop data older than a month or two to prevent excessive storage.
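Dropping old rows could look roughly like the sketch below. It assumes a SQLite table named `testjobs` with a `date` column stored as ISO `YYYY-MM-DD` text (so string comparison matches date order). The table name, columns, and the 60-day default are all assumptions for illustration, not the actual Ouija schema.

```python
import sqlite3
from datetime import datetime, timedelta


def prune_old_jobs(conn, now, max_age_days=60):
    """Delete rows older than `max_age_days`; return the number removed."""
    cutoff = (now - timedelta(days=max_age_days)).strftime("%Y-%m-%d")
    # ISO-formatted dates sort lexicographically, so a text comparison works.
    cur = conn.execute("DELETE FROM testjobs WHERE date < ?", (cutoff,))
    conn.commit()
    return cur.rowcount


# Hypothetical in-memory database for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE testjobs (id INTEGER PRIMARY KEY, date TEXT, result TEXT)")
conn.executemany(
    "INSERT INTO testjobs (date, result) VALUES (?, ?)",
    [("2013-06-01", "success"), ("2013-08-20", "testfailed"), ("2013-09-01", "success")],
)
removed = prune_old_jobs(conn, now=datetime(2013, 9, 10))
remaining = conn.execute("SELECT COUNT(*) FROM testjobs").fetchone()[0]
```

Running this at the end of each update would keep storage bounded without a separate cleanup job.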
We could still cache the data in Ouija (or even an additional TBPL table), and just run the query every N hours :-)
(Accessing the DB directly would avoid a query with lots of joins x every push x every tree for the last N days)
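As a rough illustration of the "cache and re-run every N hours" idea, the sketch below wraps an expensive query in a small time-based cache. `TimedCache`, `fetch`, and the 4-hour interval are hypothetical names and values chosen for the example; nothing here reflects how TBPL or Ouija actually implement caching.

```python
import time


class TimedCache:
    """Cache the result of `fetch` and refresh it once it is too old."""

    def __init__(self, fetch, max_age_seconds, clock=time.time):
        self._fetch = fetch
        self._max_age = max_age_seconds
        self._clock = clock  # injectable so the behaviour can be tested
        self._value = None
        self._stamp = None

    def get(self):
        now = self._clock()
        if self._stamp is None or now - self._stamp >= self._max_age:
            self._value = self._fetch()
            self._stamp = now
        return self._value


# Demonstration with a fake clock and a stand-in for the expensive query.
calls = []


def fetch():
    calls.append(1)
    return len(calls)


now = [0.0]
cache = TimedCache(fetch, max_age_seconds=4 * 3600, clock=lambda: now[0])
first = cache.get()       # runs the query
now[0] = 3600.0
second = cache.get()      # one hour later: served from the cache
now[0] = 4 * 3600.0
third = cache.get()       # four hours later: query runs again
```

With this shape, page views never hit the database directly; only the periodic refresh does.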
I think hosting it and accessing the tbpl db would be great.  For now, a daily cron job would have limited impact and allow us to develop queries and useful views on the data.  

I can easily see us looking at weekly trends over a quarter or the previous 12-week window. Data more than 6 months old is not really useful. This is intended to point out current trends in automation, not necessarily historical trends.
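A weekly trend over a trailing 12-week window could be computed along these lines. This is only a sketch: the input is assumed to be `(date, result)` pairs, weeks are assumed to start on Monday, and any result other than `"success"` is assumed to be a failure.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_failure_rates(jobs, today, weeks=12):
    """Return [(week_start, failure_rate)] for the trailing window, oldest first."""
    this_monday = today - timedelta(days=today.weekday())
    first_monday = this_monday - timedelta(weeks=weeks - 1)
    totals = defaultdict(int)
    failures = defaultdict(int)
    for day, result in jobs:
        if not (first_monday <= day <= today):
            continue  # outside the window, e.g. data older than 12 weeks
        week_start = day - timedelta(days=day.weekday())
        totals[week_start] += 1
        if result != "success":
            failures[week_start] += 1
    # Weeks with no jobs are omitted rather than reported as 0.
    return [(w, failures[w] / totals[w]) for w in sorted(totals)]


# Hypothetical sample data.
jobs = [
    (date(2013, 9, 23), "success"),
    (date(2013, 9, 24), "testfailed"),
    (date(2013, 9, 17), "success"),
    (date(2013, 3, 1), "testfailed"),  # older than the window, ignored
]
trend = weekly_failure_rates(jobs, today=date(2013, 9, 25))
```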

Maybe when tree-herder is released we could talk about adding some of these features into it.  I really like the idea of having something that doesn't require VPN access and somebody can hack locally.  This is a really easy project to onboard with!
Sounds good to me :-)

(And as for treeherder, many of the features I understand Ouija to offer are either already on the cards, or at the least we've incorporated them into the design spec for schema etc).
Depends on: 922140
Depends on: 1003923
Depends on: 1004454
Depends on: 1004539
Depends on: 1009016
Depends on: 1009198
Depends on: 1009205
Depends on: 1010465
Alias: Ouija
Depends on: 1024934
Depends on: 977219
Depends on: 936610
Depends on: 936605
Depends on: 936603
Depends on: 1024936
Depends on: 1027063
Depends on: 1029517
Depends on: 1030076
Depends on: 1047443
Depends on: 1087532
Depends on: 1110775
Depends on: 1287128
Assignee: nobody → gbrown
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → INVALID