Bug 1264074: Use Pulse for creation of Github resultsets
Status: RESOLVED FIXED (opened 9 years ago, closed 8 years ago)
Categories: Tree Management :: Treeherder: Data Ingestion, defect, P2
Tracking: Not tracked
People: Reporter: camd, Assigned: camd
Attachments: 3 files
Currently we only handle auto-resultset generation for HG repos. We should also do this for github repos like gaia.
Task Cluster is doing this creation based on a github webhook that gives them the revisions. We'll need to decide what information each job must pass in so that we can fetch the list of revisions from github.
For now, we can leave task cluster's resultset creation in place and tackle this later (possibly Q3 2016?).
Comment 1•9 years ago
This could be useful for Servo as well, which is based on github and is considering submitting performance data to treeherder (but not via taskcluster): https://github.com/servo/servo/issues/10452.
Comment 2•9 years ago
I believe this blocks/depends on a couple of bugs, could you set the deps? :-)
Comment 3•9 years ago
This is also interesting for the WebDev folks; it could save them a bunch of work on their end.
Maybe we should consider bumping up the priority of this. If there are at least 3 groups of people (Taskcluster, Servo, WebDev) who could benefit from it, it might be worth doing sooner rather than later.
Comment 4•9 years ago (Assignee)
We discussed in the meeting today that we could use github webhooks to tell us when a new resultset should be created: https://developer.github.com/webhooks/
The events I'm thinking would apply here are:
1. pull_request
- Any time a Pull Request is assigned, unassigned, labeled, unlabeled, opened, edited, closed, reopened, or synchronized (updated due to a new push in the branch that the pull request is tracking).
2. push
- Any Git push to a Repository, including editing tags or branches. Commits via API actions that update references are also counted. This is the default event.
Not ALL the pull_request events would trigger this, but a few at least.
However, now that I think a bit more about this, I don't think it should just call our API directly. I think it should write the info to a Pulse exchange. That way, any Treeherder instance can subscribe to this (even locally) and get the info.
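As a rough illustration of "not ALL the pull_request events", the filtering could look something like the sketch below. The function name and the exact set of actions are assumptions for illustration only, not what Treeherder or taskcluster-github actually implements; the idea is simply that only actions that change the commits under test should produce a new resultset.

```python
# Hypothetical sketch: decide whether a GitHub webhook event should lead to
# a new resultset. The chosen pull_request actions are an assumption.
PR_ACTIONS_CREATING_RESULTSETS = {"opened", "reopened", "synchronize"}


def should_create_resultset(event_type, payload):
    """Return True if this webhook event should create a resultset."""
    if event_type == "push":
        # Every push carries new revisions.
        return True
    if event_type == "pull_request":
        # Ignore actions such as "labeled" or "assigned", which do not
        # change the commits in the pull request.
        return payload.get("action") in PR_ACTIONS_CREATING_RESULTSETS
    return False
```

With this filter, labeling or assigning a Pull Request would be ignored, while a new push to the PR branch would be picked up.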
Comment 5•9 years ago (Assignee)
wrt the pulse exchange: I'd love to avoid having a service that the GH webhook talks to that then publishes to Pulse. Hopefully the webhook would be able to do all of that itself, without us needing to host yet another service. :) But I have never created a webhook, so I'm not sure of the limitations yet.
Comment 6•9 years ago (Assignee)
Task cluster has a webhook called taskcluster-github that posts all github pushes and PR changes to a pulse exchange. Treeherder could subscribe to this exchange to get the info for new resultsets.
This project does not yet put the revisions into the pulse messages, but we can hopefully convince the owners to do so. :)
Comment 7•9 years ago (Assignee)
Sounds like jonasfj thinks adding the revisions to those messages is a good idea. So the next part of the project would be to have treeherder subscribe a channel to that exchange for github pushes.
I hope to jump on this in the second half of Q2, once I get us ingesting jobs via pulse from task cluster for my Q2 deliverable.
Updated•9 years ago (Assignee)
Assignee: nobody → cdawson
Comment 8•9 years ago (Assignee)
Comment 9•9 years ago (Assignee)
Comment 10•9 years ago (Assignee)
Updated•9 years ago (Assignee)
Summary: Handle autocreation of resultsets for github repos → Use Pulse for creation of Github resultsets
Comment 11•9 years ago (Assignee)
Comment 12•9 years ago
Updated•9 years ago (Assignee)
Priority: -- → P2
Comment 13•9 years ago (Assignee)
Comment on attachment 8771203
[treeherder] mozilla:github-pulse-resultsets > mozilla:master
Hey Ed: This is a pretty big one. If you'd like me to walk through it with you, I'm happy to do it. :) It should look fairly familiar from the pulse jobs PR though. Thanks!!
Attachment #8771203 - Flags: review?(emorley)
Updated•9 years ago
Attachment #8771203 - Flags: review?(emorley) → review+
Comment 14•9 years ago
Commit pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/b2e5e714aab359c3dded9fc66af50cf27f261394
Bug 1264074 - Use Pulse for creation of Github resultsets (#1692)
* Bug 1264074 - Move to_timestamp function to a reusable location
* Bug 1264074 - Refactor JobConsumer to have a PulseConsumer super class
Much of what was in the JobConsumer is reusable by the upcoming
ResultsetConsumer. So refactor those parts out so that each specific
consumer can reuse code as much as possible.
* Bug 1264074 - Add ability to ingest Github Resultsets via Pulse
This introduces a ResultsetConsumer and a read_pulse_resultsets
management command to ingest resultsets from the TaskCluster
github exchanges.
When a supported Github repo has a Pull Request created or
updated, or a push is made to master, then it will kick off a
Pulse message. We will receive it and then fetch any additional
information we need from github's API and store the Resultset.
This follows a very similar pattern to the Job Pulse ingestion.
* Bug 1264074 - Old code/comments cleanup
* Bug 1264074 - Tests for the Github resultset pulse loader
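The receive-then-fetch flow described in the commit message can be sketched roughly as follows. This is not the code from the commit: the message fields (`organization`, `repository`, `head_sha`), the commit fields, and the `fetch_commits` callable are all hypothetical names chosen for illustration. The fetch step is passed in as a function so the transformation can be exercised without network access.

```python
# Hypothetical sketch of turning a Pulse message into a resultset dict.
# Field names are assumptions, not the real taskcluster-github schema.
def build_resultset(message, fetch_commits):
    """Build a resultset from a Pulse message plus commits fetched elsewhere.

    fetch_commits(organization, repository, sha) stands in for a call to
    github's API and must return a list of dicts with "sha", "author" and
    "message" keys.
    """
    commits = fetch_commits(
        message["organization"], message["repository"], message["head_sha"]
    )
    return {
        # The head commit identifies the resultset as a whole.
        "revision": message["head_sha"],
        "revisions": [
            {
                "revision": commit["sha"],
                "author": commit["author"],
                "comment": commit["message"],
            }
            for commit in commits
        ],
    }
```

Keeping the API call behind a parameter also mirrors the point made earlier in the bug: the Pulse message only needs to carry enough information to look the revisions up.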
Comment 15•8 years ago
Commit pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/a236e69474a7afb4e1458833de8d5b28466d9972
Bug 1264074 - Add execute permission to run_read_pulse_resultsets
Comment 16•8 years ago
Ah I see now why the Heroku deploy is failing - `PULSE_RESULTSET_SOURCES` was set to invalid JSON, but the compile step seems to then cache it, even though (a) the release failed, (b) `heroku config` insists it's not set (lies). Worse, it's not possible to unset it.
I've filed:
https://help.heroku.com/tickets/394841
The workaround is just to set `PULSE_RESULTSET_SOURCES` to valid JSON, since setting to a new value works, even if unsetting doesn't. (done now)
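For what it's worth, a failure like this is easier to diagnose when the setting is validated with an explicit error. The sketch below is only an illustration: the helper name is made up, and it assumes the variable holds a JSON list of sources, which this bug does not confirm.

```python
import json


def load_pulse_sources(raw):
    """Parse the raw value of PULSE_RESULTSET_SOURCES (hypothetical helper).

    Assumes the value is a JSON list; an unset or empty value means
    "no sources".
    """
    if not raw:
        return []
    try:
        sources = json.loads(raw)
    except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
        raise ValueError(
            "PULSE_RESULTSET_SOURCES is not valid JSON: {}".format(exc)
        )
    if not isinstance(sources, list):
        raise ValueError("PULSE_RESULTSET_SOURCES must be a JSON list")
    return sources
```

It could be called as `load_pulse_sources(os.environ.get("PULSE_RESULTSET_SOURCES"))` at settings load time, so a bad value is named in the error message.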
Comment 17•8 years ago
(In reply to Ed Morley [:emorley] from comment #16)
> I've filed:
> https://help.heroku.com/tickets/394841
Heroku have now fixed this :-)
Comment 18•8 years ago (Assignee)
This feature is now fixed, even if a few repos are not reporting to Pulse yet. I'll follow up with them or in separate bugs, if need be.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED