Closed Bug 1149504 Opened 9 years ago Closed 9 years ago

Taskcluster pluggable event adapter service container

Categories

(Taskcluster :: General, defect)

x86
macOS
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: pmoore, Unassigned)

There has been some interest (most recently from rail) in providing one or more services that make it easier to automatically submit task graphs based on a) messages published on pulse and b) source code changes.

There are other types of events that lend themselves to a task publishing service, such as nagios alerts, cron timers, SNMP events, irc message alerts, ...

I'm therefore considering the idea of a pluggable service container, that allows custom event adapters to be deployed for generating/publishing custom task graphs.

Imagine a pluggable container that allows you to deploy arbitrary event adapter plugins (e.g. amqp event adapter, vcs event adapter, snmp event adapter, nagios event adapter, ...). Each adapter is then responsible for listening to the type of events it can process and for generating a task graph from them.

Some adapters may be mozilla-specific; others might be sufficiently generic that they warrant inclusion as part of the container module.

The idea would be an overarching service with REST API endpoints on the container, much like the Queue or the Scheduler. At the top level, the endpoints allow you to add/delete/modify event adapters.

Each event adapter would be defined as a pluggable software module, a json definition of the adapter request payload (including doc comments), and deployment configuration settings for deployment into the container.
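
As a rough illustration of those three parts, an adapter definition might look something like the sketch below. Every key name here (`name`, `module`, `payload_schema`, `deployment`) and the module path are hypothetical and chosen only for illustration; none of this is an existing Taskcluster format.

```python
import json

# Hypothetical definition of a pulse event adapter. All key names are
# assumptions made for this sketch, not an existing Taskcluster format.
PULSE_ADAPTER_DEFINITION = {
    "name": "pulse-event-adapter",
    # The pluggable software module implementing the adapter (made-up path).
    "module": "adapters.pulse",
    # JSON definition of the adapter request payload, including doc comments.
    "payload_schema": {
        "exchange": {"type": "string", "doc": "Pulse exchange to bind to"},
        "routing_key": {"type": "string", "doc": "Routing key to bind with"},
        "script": {"type": "string", "doc": "Script to run for each event"},
    },
    # Deployment configuration settings for the container.
    "deployment": {"instances": 1},
}


def describe(definition):
    """Return one 'field: doc' line per payload field, sorted by field name."""
    schema = definition["payload_schema"]
    return [f"{field}: {schema[field]['doc']}" for field in sorted(schema)]


# The definition is plain data, so it can be sent to the container as JSON.
definition_json = json.dumps(PULSE_ADAPTER_DEFINITION, indent=2)
```

Keeping the definition as plain data is what would let the container read it without loading any adapter code.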

Each adapter should retain no state but use backing storage, such that the service can be scaled up by deploying more instances of the adapter, elastically.

The container provides API endpoints for adding plugins. The "add plugin" endpoint would receive a json definition of the adapter service so that it can deploy the adapter service. Once deployed, the adapter service initially has nothing to do. Other API endpoints (of the container) would allow you to add/delete/modify entries consisting of "an event listener + custom script" for a specific event adapter.

For example, let's say you have a pulse event adapter. In its json payload definition, it would state that it requires an exchange name and a routing key for its event listener. Having consumed this definition, the container would then accept service requests with a json payload of "adapter: pulse-event-adapter, exchange name: x, routing key: y, script: runme.sh". Note the API endpoints are on the container, not on the adapter, so the adapter is not concerned with serving API requests. The container can validate the request against the plugin's json definition and, if it conforms, notify the adapter to add the payload to its list of payloads to process. So in this case it would pass a message to the pulse event adapter plugin, requesting it to serve exchange name: x, routing key: y, script: runme.sh.
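
The validate-then-notify flow could be sketched roughly as follows. This is a minimal in-memory stand-in, not a real service: the `Container` class, its method names, and the use of a plain list as the adapter's inbox are all assumptions made for illustration, and the "json definition" is reduced to a set of required field names.

```python
class ValidationError(Exception):
    """Raised when a request does not conform to the adapter's definition."""


class Container:
    """Hypothetical container: owns the API, validates, then notifies adapters."""

    def __init__(self):
        self.definitions = {}  # adapter name -> set of required payload fields
        self.inboxes = {}      # adapter name -> payloads handed to that adapter

    def add_plugin(self, name, required_fields):
        # Stand-in for the "add plugin" endpoint. A freshly deployed adapter
        # has an empty inbox, i.e. nothing to do yet.
        self.definitions[name] = set(required_fields)
        self.inboxes[name] = []

    def add_listener(self, request):
        # Stand-in for the "add event listener + custom script" endpoint.
        adapter = request.get("adapter")
        if adapter not in self.definitions:
            raise ValidationError(f"unknown adapter: {adapter!r}")
        payload = {k: v for k, v in request.items() if k != "adapter"}
        required = self.definitions[adapter]
        missing = required - set(payload)
        unexpected = set(payload) - required
        if missing or unexpected:
            raise ValidationError(
                f"missing={sorted(missing)} unexpected={sorted(unexpected)}"
            )
        # The request conforms, so pass it on to the adapter.
        self.inboxes[adapter].append(payload)
        return payload


container = Container()
container.add_plugin("pulse-event-adapter", ["exchange", "routing_key", "script"])
container.add_listener({
    "adapter": "pulse-event-adapter",
    "exchange": "x",
    "routing_key": "y",
    "script": "runme.sh",
})
```

Because only the container is exposed, an adapter never sees a malformed request; it only receives payloads that already match its own definition.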

The adapter would then bind to exchange name x, routing key y, and consume messages. For each message it would consume, it could e.g. upload the amqp message body to s3 storage, upload the runme.sh script it received to s3, and then generate a task graph which downloads the message body from s3, downloads script runme.sh from s3 and then runs script runme.sh <filename_of_amqp_payload>.
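
The per-message handling described above might be sketched like this. A plain dict stands in for s3, and the task graph shape (`tasks`, `downloads`, `command`), the storage key layout, and the local file name `amqp_payload.json` are all hypothetical; the real task graph format is not modelled here.

```python
import hashlib


def handle_message(body, script_name, script, storage):
    """Store the message and script, then describe a task that uses both.

    `storage` is a dict standing in for s3. The returned structure is a
    made-up task graph shape, not the real Taskcluster format.
    """
    # Key the message by its content hash so each distinct body gets its own slot.
    digest = hashlib.sha256(body).hexdigest()
    message_key = f"messages/{digest}.json"
    script_key = f"scripts/{script_name}"

    # "Upload" the amqp message body and the script.
    storage[message_key] = body
    storage[script_key] = script

    # The task downloads both objects, then runs the script on the message file.
    message_file = "amqp_payload.json"
    return {
        "tasks": [
            {
                "downloads": {message_key: message_file, script_key: script_name},
                "command": ["sh", script_name, message_file],
            }
        ]
    }


fake_s3 = {}
graph = handle_message(b'{"push": 1}', "runme.sh", b"cat \"$1\"\n", fake_s3)
```

Since everything the task needs is written to storage before the graph is generated, the adapter itself holds no state between messages.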

OK, I realise this has been rather wordy and not super well structured, but hopefully it gets the point across for initial feedback.

In short, there would be a scalable service with a pluggable architecture that allows you to create new event adapters that translate certain events into the generation and publication of task graphs. These adapters could be time based, event based, listeners, pollers, whatever you like. They should be stateless so that the container can scale them, and they should be self-describing so that all API requests to add new events to listen to can be channelled through the overarching service. Creating new adapters, scaling the service, and updating adapter configs (i.e. defining which actual events task graphs should be generated for) should all be trivial. The generated tasks should be sufficiently generic that you can write a script to do whatever you like; you just need to be able to process the event body (be it a nagios alert, an AMQP message, an SNMP event, etc.).

Quite a big job though... :p
See Also: → 1149789
So my thoughts around this were different. I wanted to write the comment here, but the idea is
dramatically different, so I posted it as bug 1149789.
----------

IMO, this is too complicated. If I understand you correctly, the service would basically be running the
adaptor containers all the time, and one would be able to dynamically submit adaptor containers.
Hosting and scaling a service of containers is not trivial. We could probably do it on tutum.co.

But one cannot just horizontally scale any adaptor. Imagine a pushlog adaptor: it cannot be scaled
to multiple containers. That would mean message duplication (which is allowed on pulse), but it
wouldn't really mean that we get messages faster.
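
A toy sketch of that duplication, assuming each adaptor instance polls the same pushlog independently with no coordination between instances (the function names and the integer push ids are made up for illustration):

```python
def poll_pushlog(pushlog, last_seen):
    """Return the push ids newer than last_seen, as one instance would see them."""
    return [push_id for push_id in pushlog if push_id > last_seen]


def run_instances(pushlog, instance_count):
    """Collect what instance_count uncoordinated instances would publish."""
    published = []
    for _ in range(instance_count):
        # Every instance starts from the same position and sees the same pushes.
        published.extend(poll_pushlog(pushlog, last_seen=0))
    return published


one = run_instances([1, 2, 3], instance_count=1)
two = run_instances([1, 2, 3], instance_count=2)
```

With two instances, each push is published twice, while the set of distinct pushes delivered is exactly the same as with one instance.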

Anyway, I don't think we should stand up or own adaptors for all sources of events.
We should ask people to post to pulse. I.e. we shouldn't write a pushlog adaptor that posts messages
to pulse; that should be part of the post-commit/push hook in the repository.

We might stand up an adaptor for github, but generally speaking people should publish pulse messages
when there are interesting events.

In the proposal in bug 1149789, we route everything through pulse and move the responsibility for
publishing pulse messages to the owners of the event-producing services. Routing everything through
pulse has the clear benefit that others can trigger on these events too, for completely unrelated
things.

Imagine a dashboard showing pushes that uses taskcluster WebListener to listen over websockets for
pulse messages from the post-commit/push hook. Or someone listening for post-commit/push hook messages
regarding try, using pulse in a background worker in order to aggregate statistics about when to start
the coffee machines because everybody just pushed to try :)
(okay, maybe there are better use cases)
You make a fair point, and I think I did make it overcomplicated trying to support multiple adapters. I'm happy to close this, and prefer your alternative approach. Thanks Jonas!
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX
Component: TaskCluster → General
Product: Testing → Taskcluster
Target Milestone: --- → mozilla41
Version: unspecified → Trunk
Resetting Version and Target Milestone that accidentally got changed...
Target Milestone: mozilla41 → ---
Version: Trunk → unspecified