Closed Bug 1179316 Opened 9 years ago Closed 9 years ago

Make OrangeFactor show data for all job types, including builds & those from Taskcluster

Categories

(Tree Management Graveyard :: OrangeFactor, defect, P2)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: emorley, Assigned: emorley)

Attachments

(3 files)

Doesn't OF do something about only showing things for which it has parsed the log, and thus also doesn't show any data about intermittents on build jobs because it doesn't parse build logs?
OF no longer uses anything from the logs as of bug 1164260. (The logparser service has to be either moved to AWS or decommissioned because of the PHX decommission, but we realised we could just switch to oranges per push as a metric, since the prior metric was just as unreliable because of the way it used build ids.)
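For anyone unfamiliar with the metric, "oranges per push" can be sketched roughly as below. This is only an illustration: the function name and the input numbers are hypothetical, and the real OrangeFactor code may aggregate its ElasticSearch data differently.

```python
def orange_factor(failure_count, push_count):
    """Average number of starred intermittent failures ("oranges") per push.

    Hypothetical helper for illustration only; not taken from the
    OrangeFactor source.
    """
    if push_count == 0:
        # No pushes in the period, so there is nothing to average over.
        return 0.0
    return failure_count / push_count


# Made-up numbers: 50 oranges across 8 pushes.
print(orange_factor(50, 8))
```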
https://bugzilla.mozilla.org/show_bug.cgi?id=1115253 versus https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1115253 (only intermittent on build jobs I could think of offhand), so maybe there's still leftovers making it think it should only consider certain sorts of jobs even though it doesn't look at the logs for any jobs.
Depends on: 1179494
Depends on: 1179529
Since (a) we don't use TBPL any more, and (b) the data comes via ElasticSearch.
Attachment #8628578 - Flags: review?(jgriffin)
Since it has nothing to do with the pushlog.
Attachment #8628579 - Flags: review?(jgriffin)
Attachment #8628579 - Attachment is patch: true
Now that we're no longer relying on logparser for the calculation of total job runs (which did not support builds and other non-standard job types), we can stop blacklisting failures that occurred on builds. This also caught taskcluster jobs, since their fake buildername contained the string "non-buildbot" (typical!).

In addition, there's no need to skip the failure if the buildername does not contain "test" or "talos" - we can fall back to the "type" field, which is the longer description-esque name, or failing that, call the type "unknown". The testsuite name is only used for the UI and to match when filtering, so the value isn't overly important anyway.
Attachment #8628580 - Flags: review?(jgriffin)
Example ES document for a case where we have to fall back to "type":

{
    "_index": "bugs",
    "_type": "bug_info",
    "_id": "zMaG7tUVSUGbRD2-4nOBtg",
    "_version": 1,
    "_score": null,
    "_source": {
        "timestamp": "1416207558",
        "os": "b2g-device-image",
        "date": "2014-11-16",
        "who": "tomcat@mozilla.com",
        "buildtype": "opt",
        "tree": "b2g-inbound",
        "machinename": "b-linux64-ix-0009",
        "bug": "1007689",
        "buildname": "b2g_b2g-inbound_flame-kk_eng_dep",
        "type": "Flame KitKat Device Image Build (Engineering)",
        "logfile": "00000000",
        "starttime": "1416204515",
        "rev": "43d6148d629f"
    },
    "sort": [
        "0009",
        "kitkat"
    ]
}
Summary: Determine why OrangeFactor doesn't show data for bug 1165469 → Make OrangeFactor show data for all job types, including builds & those from Taskcluster
Blocks: 1179310
No longer blocks: 1179308
Attachment #8628578 - Flags: review?(jgriffin) → review+
Comment on attachment 8628579 [details] [diff] [review]
Part 2: Rename getPushLogData to getElasticSearchData

Review of attachment 8628579 [details] [diff] [review]:
-----------------------------------------------------------------

Splinter has a big 'WINDOWS PATCH' warning for this patch; you might want to check line endings to make sure nothing weird has crept in.
Attachment #8628579 - Flags: review?(jgriffin) → review+
Comment on attachment 8628580 [details] [diff] [review]
Part 3: Show data for all job types, including builds and Taskcluster

Review of attachment 8628580 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for all this!
Attachment #8628580 - Flags: review?(jgriffin) → review+
(In reply to Jonathan Griffin (:jgriffin) from comment #9)
> Splinter has a big 'WINDOWS PATCH' warning for this patch; you might want to
> check line endings to make sure nothing weird has crept in.

Ah, I just did a copy/paste of the diff into the Bugzilla "paste as attachment" thingy, since I couldn't face figuring out how to get bzexport to work with a bookmark-based workflow. The commits locally do not have any Windows line endings :-)
https://hg.mozilla.org/automation/orangefactor/rev/a2c3fcd63c0a
https://hg.mozilla.org/automation/orangefactor/rev/a668ae447cd9
https://hg.mozilla.org/automation/orangefactor/rev/699a6bc3f298

[root@brasstacks1.dmz.scl3 ~]# su mcote
[mcote@brasstacks1.dmz.scl3 root]$ cd ~/orangefactor/src/orangefactor/
[mcote@brasstacks1.dmz.scl3 orangefactor]$ hg pull -u -v
pulling from http://hg.mozilla.org/automation/orangefactor
searching for changes
all local heads known remotely
adding changesets
adding manifests
adding file changes
added 8 changesets with 16 changes to 6 files
resolving manifests
getting sendemail.py
getting server/handlers.py
getting server/mozmiddleware.py
getting server/update_testfailures.py
getting server/woo_server.py
getting woo_mailer.py
6 files updated, 0 files merged, 0 files removed, 0 files unresolved
[mcote@brasstacks1.dmz.scl3 orangefactor]$ exit
[root@brasstacks1.dmz.scl3 ~]# su webtools
[webtools@brasstacks1.dmz.scl3 root]$ cd ~/apps/orangefactor/src/orangefactor/
[webtools@brasstacks1.dmz.scl3 orangefactor]$ hg pull -u -v
pulling from http://hg.mozilla.org/automation/orangefactor/
searching for changes
all local heads known remotely
adding changesets
adding manifests
adding file changes
added 8 changesets with 16 changes to 6 files
resolving manifests
getting sendemail.py
getting server/handlers.py
getting server/mozmiddleware.py
getting server/update_testfailures.py
getting server/woo_server.py
getting woo_mailer.py
6 files updated, 0 files merged, 0 files removed, 0 files unresolved
[webtools@brasstacks1.dmz.scl3 orangefactor]$ exit
[root@brasstacks1.dmz.scl3 ~]# /etc/init.d/orangefactor stop; /etc/init.d/orangefactor start; /etc/init.d/nginx reload
stopping orangefactor                                      [  OK  ]
starting orangefactorspawn-fcgi: child spawned successfully: PID: 30838
                                                           [  OK  ]
Reloading nginx:                                           [  OK  ]


These two now work:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1165469
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1115253
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
No longer depends on: 1179529
Resolution: --- → FIXED
CCing Sheriffs: Hopefully all bugs should now work in OrangeFactor - but should you happen to see a case where we get "Sorry have no data for this bug" - please let me know :-)
Wow - comparing the last "War on Orange" email with what OrangeFactor now reports for that timespan:
https://brasstacks.mozilla.com/orangefactor/?display=OrangeFactor&tree=trunk&startday=2015-06-23&endday=2015-06-29

...shows the orangefactor has increased from 6.2 to 12.5! (Note: the fix in this bug makes all old data show up, not just submissions from now onwards.)

The top three failures on OF were not even in the email :-s
Half the test failures are on b2g desktop? I'll buy that.
Product: Tree Management → Tree Management Graveyard