Closed Bug 1225246 Opened 10 years ago Closed 8 years ago

Support web hooks

Categories

(Taskcluster :: Services, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dustin, Assigned: hammad13060, Mentored)

Details

The hooks API should support GitHub webhooks (and any other service that can POST a payload to a URL). There's some support for this already in the service, just commented out.
Assignee: dustin → nobody
Mentor: dustin
No longer blocks: 1286979
Assignee: nobody → hammad13060
Hi Jonas, thanks for vouching for me. I have taskcluster credentials. I cloned the repo `taskcluster-hooks` and ran `npm install` and later `npm test`. Tests related to influx seem to fail, but I think that's fine since you said that influx isn't important right now. It will be great if you could get me started hacking :)
Flags: needinfo?(jopsen)
> for supporting Github webhooks I will probably need to make changes to `routes/v1.hs` for configuring the hooks > lets say I want to configure `push` event. So I will add following code (assuming above statement is true) api.declare({ method: 'post', route: '/push', name: 'PushEvent', deferAuth: true, scopes: undefined, input: undefined, output: undefined, title: 'Push event on a repository branch', stability: undefined, description: [ "Triggered when a repository branch is pushed to. In addition to branch pushes, webhook push", "events are also triggered when repository tags are pushed." ].join('\n') }, async function(req, res) { var payload = req.body; // some code for calling a service // which will consume payload }) > So if this is what has to be done, my doubt is which service will consume the github `payload`. Some pointers for this will do. > It will be great if you can share `taskcluster-base` docs so that I can configure the hooks with correct options.
taskcluster-lib-... all have good documentation comments in the source. Some even have reasonable READMEs :) Note, I suspect this will have to discard the payload, as I don't see a good way of passing it into the task. Notably, it would be important that the end-point is protected by scopes. Check out the documentation comments for api.declare in taskcluster-lib-api And take a look at some other examples, taskcluster-queue is always good, but tc-hooks also have protected APIs. @Hammad, I think the thing that is uncommented is: https://github.com/taskcluster/taskcluster-hooks/blob/a47c8a4df2e9f21b2043601779b579e3307ba0fe/routes/v1.js#L415 A good start would be to understand how that works, and then seeing if we can make it work. And what possible security implications it might have... Maybe it needs tests, it probably could use some additional documentation. I suspect it was disabled because we didn't really need it, didn't have tests and wasn't fully sure we had done it right. It's also possible that it could benefit from some debouncing. https://github.com/taskcluster/taskcluster-hooks/blob/a47c8a4df2e9f21b2043601779b579e3307ba0fe/test/api_test.js#L282 Definitely getting the test to pass is probably a good idea. Also reviewing them and making sure we have tests that checks that an invalid token won't work :) @Hammad, let me know how far you get :)
Flags: needinfo?(jopsen) → needinfo?(hammad13060)
(In reply to Jonas Finnemann Jensen (:jonasfj) from comment #3) > Note, I suspect this will have to discard the payload, as I don't see a good > way of passing it into the > task. For now, yes. I think eventually we will want to make this payload available via template substitutions (but as a separate project)
@jonas > SCOPES * I think of scopes as permissions on what a user can/can't do, hence they protect the API's against unauthorized users from. Am I right ? * How do we decide which scopes are required for an API. > API as far as I understand API works as follows: > method: 'get' > route: '/hooks/:hookGroupId/:hookId/token - this endpoint simply retrieves an azure entity using `hookGroupId` as partition key and hookId as row key. - if a valid entity is found a json is returned containing trigger token. > method: 'post' > route: '/hooks/:hookGroupId/:hookId/token' - this endpoint retrieves an azure entity using `hookGroupId` as partition key and hookId as row key. - trigger token is modified and updated. - new trigger token is returned in a json. > method: 'post', > route: '/hooks/:hookGroupId/:hookId/trigger/:token' > input(payload): 'trigger-payload.json' (serves as payload) - this endpoint triggers a task using the payload and trigger token. - trigger token is validated and then `taskcreator.fire(hook, payload)` is called. - on inspecting the TaskCreator::fire(hook, payload), I found that - A Queue client is instantiated. - Instantiated queue client is used for creating task using azure entity but payload is ignored. * I think first two end points have to be left alone. In the third end point we have to make use of payload. Is it possible for queue client to consume the payload of the 'post' request ? > API Tests - Tests seem to validate only last two end points i.e. - Reset trigger token api - creates a token - resets it - trigger task api - fail with invalid token - trigger task with given payload - I think that these tests are fine for the api and these implicitly test the first end point as well. - now when I tried running `npm test` but tests didn't complete and code got stuck in some kinf of a loop. > taskcluster-hooks@0.0.1 test /Users/hammad13060/mozilla/taskcluster-hooks > ./test/runtests.sh > API > createHook > ✓ creates a hook (102ms) > ✓ with invalid scopes > ✓ succeeds if a matching resource already exists > ✓ fails if different resource already exists > ✓ creates associated group > ✓ without a schedule - now nothing else happens. process keeps running @jonas and dustin > Github Webhooks - payload of the github webhooks contains all the meaningful information about the event. * How will we utilize the webhooks without the payload ? Are they of any use without the payload ?
Flags: needinfo?(hammad13060) → needinfo?(jopsen)
You're right that the content of the webhooks will be important, but they are not useless without it. There are lots of simple cases where a `curl` of the right URL is all that's necessary to kick off some kind of task. This bug is not just for github webhooks (we have the taskcluster-github service for those). Rather, it's to support any service that can implement webhooks. For example, AWS Simple Notification Service can be configured to deliver notifications to webhooks, so a user might use that to link a deployment in AWS to a post-deployment test run in TaskCluster. http://docs.aws.amazon.com/sns/latest/dg/SendMessageToHttp.html We would want to support other options, too -- that's just one example. So building something generic enough to support any kind of webhook payload (maybe even XML??) will be a big project and needs a bit more thought. But getting webhooks working at all -- including setup of tokens -- is a good starting point. By the way, Jonas, we don't want the webhook triggering endpoint to require scopes -- it is guarded with the triggerToken instead.
@Dustin okay so the main idea is that our service will implement webhooks for multiple services, each one of which will have different type of payloads. We need a mechanism to digest all those diverse payloads. So initially you want me to implement some webhooks and get them up and running, even though they may not do anything for now(at least). Should I start with setting up webhooks for github atleast then we can see how things go.
Close -- it's up to our *users* what services end up calling these webhooks, so we need to design something generic that any service can talk to. So for the moment, we'll get webhooks up and running without any handling of the webhook payload. Probably the easiest way to test is not with github, but by running something like curl -d '{}' https://hooks.taskcluster.net/v1/hooks/:hookGroupId/:hookId/trigger/:token
> @dustin, so I will setup endpoints in the following way api.declare({ method: 'post', route: '/push', name: 'PushEvent', deferAuth: true, scopes: [some scopes], input: 'push-payload.json', output: 'task-status.json', title: 'Push event on a repository branch', stability: undefined, description: [ "Triggered when a repository branch is pushed to. In addition to branch pushes, webhook push", "events are also triggered when repository tags are pushed." ].join('\n') }, async function(req, res) { var payload = req.body; // may do something in the future }); * One more doubt, how to decide on scopes for an API ? - One more clarification. We are defining webhooks with event names same as that defined by github but they may be triggered by some other service as well.
No, sorry. Please forget completely about Github webhooks. The webhook functionality you build should target teh curl command I mentioned in comment 8.
@dustin The webhook is already setup. Should I concentrate on digesting the payload part ??
@hammad > * I think of scopes as permissions on what a user can/can't do, hence they protect the API's against unauthorized users from. Am I right ? True, @dustin, > By the way, Jonas, we don't want the webhook triggering endpoint to require scopes -- it is guarded with the triggerToken instead. Agree, > The webhook is already setup. Should I concentrate on digesting the payload part ?? I think hammad, might be right here. When looking at the code that was commented out in taskcluster-hooks does anyone remember why it was commented out? :) As I looked at it briefly I got the impression that it might implement everything we were looking for. -- So maybe we should uncomment it, land it and move on to figuring out how we can make a template system that can inject arbitrary payloads... I'm tempted to suggest something like: https://github.com/jonasfj/json-parameterization/blob/master/test.js - readme on the repo is out of date, also as the author I'm biased :) Perhaps json-parameterization isn't perfect, but the concept of parsing JSON and then injecting values is attractive. @hammad, If you think we can address this bug by uncommmenting code, please file a PR doing that :) Maybe see if you can think of any additional test cases we might be missing. Perhaps testing: resetTriggeToken followed by triggerHookWithToken Perhaps we should support other methods than just POST. Like: POST /hooks/:hookGroupId/:hookId/trigger/:token GET /hooks/:hookGroupId/:hookId/trigger/:token PUT /hooks/:hookGroupId/:hookId/trigger/:token I'm not sure, as GET and PUT are supposed to be idempotent.
Flags: needinfo?(jopsen)
I commented it out when I took over the service. From what I remember, it was because the tests weren't complete and webhooks weren't a priority at the time. So yes, a PR to remove the commenting-out, and run the tests, would be good. We will also need some UI changes in taskcluster-tools to allow users to generate tokens. Regarding templating -- json-parameterization might be a good start, but might need some extension -- perhaps it needs to support property access and array indexing? Let's do that in another bug, though -- I'd like to work through some use-cases before we jump into the implementation. Hammad, if you want to create that bug we can get started on it while finishing up here. Regarding methods, let's wait until we have a user requesting PUT. I don't think GET is ever a good idea.
@Dustin and @Jonas I'll file a PR and submit a patch. @Dustin I've never built anything like tempelating system mentioned by jonas. It will be an interesting task. I'll file a bug for that. His suggestion gives the gist of what is to be implemented :) @Jonas I have a problem while running tests. Whenever I run `npm test` test execution get stuck in some kind of a loop (though it may not be a loop in reality). Following is the output > API > createHook > ✓ creates a hook (102ms) > ✓ with invalid scopes > ✓ succeeds if a matching resource already exists > ✓ fails if different resource already exists > ✓ creates associated group > ✓ without a schedule there are 2 more tests in `createHook` suite which are not executed along with tests for other end points and the test code do not finish executing.
Flags: needinfo?(jopsen)
@Hammad, Well, there is only one way to find out why the process gets stuck. Try each test case on it's own (by commenting out other test cases). But probably it isn't looping, check your CPU usage. Most likely node isn't closing because it has a TCP socket to itself that it is keeping alive... Google around, I've sometimes solved it by explicitly destroying the http.Agent
Flags: needinfo?(jopsen)
@Jonas Thanks for your suggestion. I commented some tests and observed that tests were hanging because schedule for dailyTask is ["0 0 3 * * *"], hence task was not scheduled and blocked the tests. I changed the schedule to ["0 0 1-25 * * *"] and it worked. Yay :) Another issue is that I am not able to push my patch branch onto the taskcluster-hooks repo and file a PR, can you give me the access ?
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WORKSFORME
(WORKSFORME means "I'm not going to fix this because it already works for me")
Resolution: WORKSFORME → FIXED
status shows following options when I choose resolved - INVALID - WONTFIX - DUPLICATE - WORKSFORME - INCOMPLETE
oh, interesting.. we should get you `editbugs` permission
Component: Hooks → Services
You need to log in before you can comment on or make changes to this bug.