Closed Bug 1253369 Opened 8 years ago Closed 8 years ago

Notifications on release promotion events

Categories

(Release Engineering :: Release Automation: Other, defect)


Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: rail, Assigned: csheehan)

Attachments

(2 files, 2 obsolete files)

We need to send notifications to interested parties about release promotion events. Whether a step fails or passes, we need to know.

We can add routes to all tasks (and tests!) to make sure we can subscribe to some routing key patterns and process the events.

Something like the following may help:

For update verify:

  - index.relpro.v1.{{ branch }}.{{ revision }}.{{ product }}.build{{ buildNumber }}.update_verify.{{ platform }}.{{ chunk }}
  - index.relpro.v1.{{ branch }}.latest.{{ product }}.build{{ buildNumber }}.update_verify.{{ platform }}.{{ chunk }}


For l10n:

  - index.relpro.v1.{{ branch }}.{{ revision }}.{{ product }}.build{{ buildNumber }}.repack.{{ platform }}.{{ chunk }}
  - index.relpro.v1.{{ branch }}.latest.{{ product }}.build{{ buildNumber }}.repack.{{ platform }}.{{ chunk }}

etc.
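A minimal sketch of how the two routes per task could be generated, assuming plain string formatting rather than the graph templates. The helper name `relpro_routes`, its parameter names, and the sample values are hypothetical; only the route layout comes from the patterns above.

```python
# Hypothetical helper: the function and parameter names are assumptions,
# only the route layout is taken from the patterns proposed in this bug.
ROUTE_TEMPLATE = (
    "index.relpro.v1.{branch}.{revision}.{product}."
    "build{build_number}.{step}.{platform}.{chunk}"
)


def relpro_routes(branch, revision, product, build_number, step, platform, chunk):
    """Return the revision-specific route followed by its 'latest' alias."""
    common = dict(
        branch=branch,
        product=product,
        build_number=build_number,
        step=step,
        platform=platform,
        chunk=chunk,
    )
    return [
        ROUTE_TEMPLATE.format(revision=revision, **common),
        ROUTE_TEMPLATE.format(revision="latest", **common),
    ]


if __name__ == "__main__":
    # Sample values are made up for illustration.
    for route in relpro_routes(
        "mozilla-beta", "abcdef123456", "firefox", 1, "update_verify", "linux64", 3
    ):
        print(route)
```

Keeping both routes in one helper means the `latest` alias cannot drift from the revision-specific route when a new step such as `repack` is added.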

We may also need to generate a root notification, so we can track the start of the process.

Once we have routes in place we can build a (heroku!) service to listen to these events and send more notifications: SNS, email, SMS, IRC, maybe another normalized pulse stream.
Here is another idea.

We can extend our current tasks and add something like

task:
  ...
  extra:
      notifications:
          success:
              to:
                - all
                - releng
              subject: {{ product }} {{ version }} build {{ buildNumber}} updates are available on {{ channel }}
              body: |
                   {{ product }} {{ version }} build {{ buildNumber}} updates are available on the {{ channel }} channel now.
                   Task: https://tools.taskcluster.net/task-inspector/#{{ stableSlugId(buildername) }}

          failure:
              to:
                - releng
              subject: Achtung! {{ product }} {{ version }} build {{ buildNumber}} updates are b0rken on {{ channel }}
              body: |
                   HALP!
                   {{ product }} {{ version }} build {{ buildNumber}} updates are b0rken on the {{ channel }} channel now.
                   Task: https://tools.taskcluster.net/task-inspector/#{{ stableSlugId(buildername) }}



Then we can use these data to generate appropriate messages.
(In reply to Rail Aliiev [:rail] from comment #2)
> Here is another idea.
> 
> We can extend our current tasks and add something like
> 
> task:
>   ...
>   extra:
>       notifications:

I like that this approach keeps the template in the graph template :)

so something would watch the task status and then depending on TC result, would use this 'notifications' template to send out email/sns/etc?

are you still thinking a new heroku service to be that 'something'?
(In reply to Jordan Lund (:jlund) from comment #3)
> so something would watch the task status and then depending on TC result,
> would use this 'notifications' template to send out email/sns/etc?

The app would query the task (with interpolated values) and if there is task.extra.notifications, it would act on those.

The to: entries in this case are something like "tags"; they are not real addresses. The app may decide what to send and how to send it depending on those tags. Something like "urgent" may escalate and use different channels of communication.
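To make the tag idea concrete, here is a rough sketch of how such an app might turn task.extra.notifications into a delivery plan. The `TAG_CHANNELS` mapping, the channel names, and `plan_notifications` are all assumptions for illustration; the bug only says that tags are resolved by the app.

```python
# Assumed mapping from "to:" tags to delivery channels; not from the bug.
TAG_CHANNELS = {
    "all": ["email"],
    "releng": ["email", "irc"],
    "urgent": ["email", "irc", "sms"],
}


def plan_notifications(task, outcome):
    """Return (channel, subject, body) tuples for a finished task.

    `task` is a task definition dict whose values are already interpolated;
    `outcome` is "success" or "failure", matching the keys in the proposal.
    """
    section = task.get("extra", {}).get("notifications", {}).get(outcome)
    if not section:
        return []  # the task did not ask for a notification on this outcome

    channels = []
    for tag in section.get("to", []):
        for channel in TAG_CHANNELS.get(tag, []):
            if channel not in channels:  # avoid sending twice per channel
                channels.append(channel)

    subject = section.get("subject", "")
    body = section.get("body", "")
    return [(channel, subject, body) for channel in channels]
```

Unknown tags are silently ignored in this sketch; a real service would probably want to log them instead.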

> are you still thinking a new heroku service to be that 'something'?

Yeah, "heroku" in this case means something running in parallel as a separate service. Heroku so far sounds like the easiest way to go.
Depends on: 1253954
(In reply to Rail Aliiev [:rail] from comment #2)
>               subject: {{ product }} {{ version }} build {{ buildNumber}}
> updates are available on {{ channel }}
>               body: |
>                    {{ product }} {{ version }} build {{ buildNumber}}
> updates are available on the {{ channel }} channel now.
>                    Task: https://tools.taskcluster.net/task-inspector/#{{
> stableSlugId(buildername) }}

Would that mean that this could be used for an automatic triggering of update tests?
(In reply to Henrik Skupin (:whimboo) from comment #5)
> Would that mean that this could be used for an automatic triggering of
> update tests?

I think it'd be better to use the already existing index routes (bug 1253954 to make them prettier) to listen to all release related tasks, except en-US builds. You can browse one of them (not fully completed though) at https://tools.taskcluster.net/index/#releases.v1.date.e1da13e006f645900a95c2f8a64fe4968b0d50b2.firefox.46_0b3.build1/releases.v1.date.e1da13e006f645900a95c2f8a64fe4968b0d50b2.firefox.46_0b3.build1
Rail, my last question was really about update tests, because it seems that those notifications go out when you actually open a channel. So it would be interesting for me to know whether this is something we could use in the near future to trigger our update tests automatically.

Regarding our functional tests, which get executed once builds are available: what specifically should I look out for? Are those the l10n namespaces?

index.releases.v1.date.e1da13e006f645900a95c2f8a64fe4968b0d50b2.firefox.46_0b3.build1.l10n.linux.X
index.releases.v1.date.latest.firefox.latest.l10n.linux.X

Are you still working on a solution for the en-US case?
Flags: needinfo?(rail)
(In reply to Henrik Skupin (:whimboo) from comment #7)
> Regarding our functional tests which are getting executed once builds are
> available... for what specific should I look out for? Are those the l10n
> namespaces?
> 
> index.releases.v1.date.e1da13e006f645900a95c2f8a64fe4968b0d50b2.firefox.
> 46_0b3.build1.l10n.linux.X
> index.releases.v1.date.latest.firefox.latest.l10n.linux.X

Yes, something like https://tools.taskcluster.net/index/#releases.v1.mozilla-beta.fb3494d06dfb73e26df72ca7a4bc4ef5ebf8795c.firefox.46_0b1.build5.l10n/releases.v1.mozilla-beta.fb3494d06dfb73e26df72ca7a4bc4ef5ebf8795c.firefox.46_0b1.build5.l10n should just work

 
> Are you still working for a solution on the en-US case?

There is no "en-US" build ready, but as a workaround you can use the "beetmover" steps: https://tools.taskcluster.net/index/#releases.v1.mozilla-beta.fb3494d06dfb73e26df72ca7a4bc4ef5ebf8795c.firefox.46_0b1.build5.beetmover.en_US/releases.v1.mozilla-beta.fb3494d06dfb73e26df72ca7a4bc4ef5ebf8795c.firefox.46_0b1.build5.beetmover.en_US
Flags: needinfo?(rail)
(In reply to Rail Aliiev [:rail] from comment #8)
> > index.releases.v1.date.e1da13e006f645900a95c2f8a64fe4968b0d50b2.firefox.
> > 46_0b3.build1.l10n.linux.X
> > index.releases.v1.date.latest.firefox.latest.l10n.linux.X
> 
> Yes, something like
> https://tools.taskcluster.net/index/#releases.v1.mozilla-beta.
> fb3494d06dfb73e26df72ca7a4bc4ef5ebf8795c.firefox.46_0b1.build5.l10n/releases.
> v1.mozilla-beta.fb3494d06dfb73e26df72ca7a4bc4ef5ebf8795c.firefox.46_0b1.
> build5.l10n should just work

For me, links to the specific task would be more helpful, so I can see what has been done and which artifacts have been created. I already asked for more details on https://github.com/mozilla/mozmill-ci/issues/764 but let's continue here...

So what kind of task would the above apply to? Is it the beetmover one? If not, when will l10n builds be uploaded to archive.mozilla.org? Would this task do it, or do we have to wait for an l10n beetmover task to finish? If the latter is the case, I'm missing a properties.json file as an artifact with details. We want the files from archive.mozilla.org so we can run our tests against the binaries at their final location.

> > Are you still working for a solution on the en-US case?
> 
> There is no "en-US" build ready, but as a work around you can use the
> "beetmover" steps:
> https://tools.taskcluster.net/index/#releases.v1.mozilla-beta.
> fb3494d06dfb73e26df72ca7a4bc4ef5ebf8795c.firefox.46_0b1.build5.beetmover.
> en_US/releases.v1.mozilla-beta.fb3494d06dfb73e26df72ca7a4bc4ef5ebf8795c.
> firefox.46_0b1.build5.beetmover.en_US

Similar to the above question, I assume it will be a task like this one: https://tools.taskcluster.net/task-graph-inspector/#iAl0ZL1wRvqi4twlHaRxkw/yPkBj0jWTYS-L680NxdgOQ/ with the name "[beetmover] firefox mozilla-beta linux en_US completes candidates"? If yes, it's also missing a properties.json file.
Flags: needinfo?(rail)
The idea is to listen to en-US and locale beetmover tasks (routing key should look something like route.releases.v1.*.*.firefox.*.*.beetmover.*.*, see example https://tools.taskcluster.net/index/#releases.v1.mozilla-beta.fb3494d06dfb73e26df72ca7a4bc4ef5ebf8795c.firefox.46_0b1.build8.beetmover/releases.v1.mozilla-beta.fb3494d06dfb73e26df72ca7a4bc4ef5ebf8795c.firefox.46_0b1.build8.beetmover) and schedule functional tests. This should address the NI I think?
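For anyone testing such a binding offline, here is a small sketch of AMQP-style topic matching, where `*` matches exactly one dot-separated word and `#` matches zero or more words. `topic_matches` is a hypothetical helper, not part of any Pulse or Taskcluster client, and the trailing segments of the sample key are made up.

```python
def topic_matches(pattern, key):
    """Return True if routing `key` matches an AMQP topic `pattern`.

    '*' matches exactly one word, '#' matches zero or more words;
    words are separated by dots.
    """
    p = pattern.split(".")
    k = key.split(".")

    def match(i, j):
        if i == len(p):
            return j == len(k)  # pattern exhausted: key must be too
        if p[i] == "#":
            # Let '#' swallow zero or more of the remaining words.
            return any(match(i + 1, jj) for jj in range(j, len(k) + 1))
        if j == len(k):
            return False  # pattern still expects a word, key has none
        if p[i] == "*" or p[i] == k[j]:
            return match(i + 1, j + 1)
        return False

    return match(0, 0)


if __name__ == "__main__":
    pattern = "route.releases.v1.*.*.firefox.*.*.beetmover.*.*"
    # The last two segments of this key are invented for the demo.
    key = (
        "route.releases.v1.mozilla-beta."
        "fb3494d06dfb73e26df72ca7a4bc4ef5ebf8795c."
        "firefox.46_0b1.build8.beetmover.en_US.linux"
    )
    print(topic_matches(pattern, key))
```

Because `*` never spans a dot, the number of segments in the key has to line up exactly with the pattern, which is worth checking against a real message before relying on a binding.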
Flags: needinfo?(rail)
Attachment #8729512 - Flags: review?(jlund)
Attachment #8729512 - Flags: review?(jlund) → review+
Attachment #8729512 - Flags: checked-in+
I use https://github.com/rail/pulse-notify (requires Python 3.5) to track the current release process. It can be used as a base for this bug.
So I was able to fetch both the en-US and l10n repack messages. They are processed correctly and a Jenkins job is triggered. Looks like all I need is there. Thanks Rail!
Is there any other work necessary here or can this bug be closed?
(In reply to Henrik Skupin (:whimboo) from comment #13)
> Is there any other work necessary here or can this bug be closed?

We want to have at least email notifications here.
Played with taskcluster-client a bit yesterday and ended up with something like https://github.com/rail/pulse-notify-js written in JS.
Assignee: nobody → csheehan
I have been doing some work on this at https://github.com/cgsheeh/pulse-notify/.

Essentially what I have done so far is create a listener for Pulse messages (forked from Rail's earlier implementation) that initializes a set of drop-in 'plugins'. A message will come into a taskcluster exchange such as task-completed, task-defined etc. The consumer will check the task to see if there is a notification specified for the specific exchange the message passed through. If the notification section specific to the exchange exists within the task, the enabled plugins will call their 'notify' methods to take action based on the task definition. An example of a task that would notify on completion is https://queue.taskcluster.net/v1/task/egEhR5z0TFCkLkuFB-vDWQ

As of now I have four plugins to notify with Amazon SES/SNS, SMTP and IRC (drops messages into a dedicated channel). There is also a plugin that grabs log files and uploads them to an Amazon S3 bucket; if enabled, the log location is added to the messages sent by the other plugins. Moving forward, I am going to use InfluxDB to check how long each plugin takes to notify and to look for any significant bottlenecks. To do this, however, some notifications will need to be added to existing tasks (otherwise I will have to trigger custom tasks).
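As a rough illustration of the flow described in this comment (not the actual pulse-notify code), the sketch below looks up the notification section for the exchange a message arrived on, runs only the enabled plugins, and records how long each `notify` call took. The class names, the `plugins` key, and the `metadata.name` fallback are assumptions.

```python
import time


class RecordingPlugin:
    """Stand-in for a real SES/SNS/SMTP/IRC plugin; it only records calls."""

    def __init__(self, name):
        self.name = name
        self.sent = []

    def notify(self, task, config):
        # Fall back to the task name when no subject is configured.
        subject = config.get("subject") or task.get("metadata", {}).get("name", "")
        self.sent.append(subject)


def handle_message(exchange, task, plugins):
    """Notify through enabled plugins if the task asks for this exchange.

    Returns a dict mapping plugin name to seconds spent in notify(),
    which is the kind of figure one could later push to a metrics store.
    """
    config = task.get("extra", {}).get("notification", {}).get(exchange)
    if not config:
        return {}  # nothing requested for this exchange

    enabled = config.get("plugins", [])  # assumed key name
    timings = {}
    for plugin in plugins:
        if plugin.name not in enabled:
            continue
        start = time.perf_counter()
        plugin.notify(task, config)
        timings[plugin.name] = time.perf_counter() - start
    return timings
```

Timing each plugin separately, rather than the whole message, is what makes it possible to spot a single slow channel.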
This diff defines 'notifications.yml.tmpl' which is a jinja2 template for adding a notification section to the 'extra' section in a build task definition. 

In 'release_graph.yml.tmpl' the macro is defined for creating notifications. The goal is to be able to specify which task status to be notified on, and how to be notified. If no notification is required for a specific status, just leave it blank.

For jinja2 to access the template, the main templates dir is now also passed to FileSystemLoader in the environment definition, because in its current form the loader only accesses the templates for firefox.

Would appreciate some feedback w.r.t. all of the above, after which I will be adding the macro call to each template.
Attachment #8763688 - Flags: feedback?(kmoir)
Attachment #8763688 - Flags: feedback?(jlund)
Comment on attachment 8763688 [details] [diff] [review]
Adds new template to notification section in release tasks, macro to add into tasks easily and changes jinja2 environment definition in make_task_graph

Review of attachment 8763688 [details] [diff] [review]:
-----------------------------------------------------------------

fyi: bugzilla [kind of] supports github pull requests. you can create the PR and then in bugzilla, create an attachment but rather than uploading a patch file, paste the PR url in the raw input text box.

::: releasetasks/__init__.py
@@ +21,4 @@
>                      **template_kwargs):
>      # TODO: some validation of template_kwargs + defaults
>      env = Environment(
> +        loader=FileSystemLoader([path.join(template_dir, product), template_dir]),

we don't need `template_dir` anymore?

::: releasetasks/templates/notifications.yml.tmpl
@@ +10,5 @@
> +#}
> +notification:
> +    {% if completed is not none %}
> +    task-completed:
> +        {% if completed['subject'] is defined %}

this patch looks great! I wonder, can we enforce or make defaults for many of the options? that way you don't have to worry about so many conditions.
Attachment #8763688 - Flags: feedback?(jlund) → feedback+
(In reply to Jordan Lund (:jlund) from comment #19)
> Comment on attachment 8763688 [details] [diff] [review]
> Adds new template to notification section in release tasks, macro to add
> into tasks easily and changes jinja2 environment definition in
> make_task_graph
> 
> Review of attachment 8763688 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> fyi: bugzilla [kind of] supports github pull requests. you can create the PR
> and then in bugzilla, create an attachment but rather than uploading a patch
> file, paste the PR url in the raw input text box.
> 

Cool! I will do that once I've added the macro call to the other templates.

> ::: releasetasks/__init__.py
> @@ +21,4 @@
> >                      **template_kwargs):
> >      # TODO: some validation of template_kwargs + defaults
> >      env = Environment(
> > +        loader=FileSystemLoader([path.join(template_dir, product), template_dir]),
> 
> we don't need `template_dir` anymore?

To clarify, the call before my change looked like FileSystemLoader(path.join(template_dir, product)). The change I made adds the template dir because I added the notification template there. If I had added it to the firefox dir, adding new products would require re-adding the notification template. If this breaks somehow I can always put the notification template in its own folder.
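The effect of passing a list of directories can be illustrated with a stdlib-only sketch of ordered search-path lookup. This is not jinja2's implementation; it only mirrors the behaviour of trying each directory in order, so product-specific templates are found first and shared ones second. `find_template` and `demo` are hypothetical names.

```python
import os
import tempfile


def find_template(name, search_paths):
    """Return the first existing file called `name`, trying dirs in order."""
    for directory in search_paths:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    raise FileNotFoundError(name)


def demo():
    """Build a throwaway template tree and resolve two templates in it."""
    with tempfile.TemporaryDirectory() as template_dir:
        product_dir = os.path.join(template_dir, "firefox")
        os.mkdir(product_dir)
        # Product-specific template lives under firefox/, shared one at the top.
        for path in (
            os.path.join(product_dir, "release_graph.yml.tmpl"),
            os.path.join(template_dir, "notifications.yml.tmpl"),
        ):
            with open(path, "w") as handle:
                handle.write("")

        search_paths = [product_dir, template_dir]
        return [
            os.path.relpath(find_template(name, search_paths), template_dir)
            for name in ("release_graph.yml.tmpl", "notifications.yml.tmpl")
        ]


if __name__ == "__main__":
    print(demo())
```

With this ordering, a product directory could also override the shared notification template simply by shipping a file with the same name.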

> 
> ::: releasetasks/templates/notifications.yml.tmpl
> @@ +10,5 @@
> > +#}
> > +notification:
> > +    {% if completed is not none %}
> > +    task-completed:
> > +        {% if completed['subject'] is defined %}
> 
> this patch looks great! I wonder, can we enforce or make defaults for many
> of the options? that way you don't have to worry about so many conditions.

Yes, this is my main concern right now as well. The service only really requires plugins to be specified; it will make a subject/message from the task name if none is given. The other details only matter if you enable a specific plugin (i.e. SES needs to know who to email). It might be easier to have the defaults in the template.

If specifying plugins is too much I could always set the default to try and notify using every possible method.
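One way the defaults discussed here could look on the service side is sketched below. The plugin identifiers, the `message` key, and the generated subject format are assumptions, not the deployed behaviour.

```python
# Assumed plugin identifiers; the real service may name them differently.
DEFAULT_PLUGINS = ["ses", "sns", "smtp", "irc"]


def with_defaults(config, task_name, status):
    """Fill in missing notification options without mutating the input.

    If no plugins are listed, fall back to every known plugin; if the
    subject or message is empty, derive one from the task name and status.
    """
    merged = dict(config or {})
    merged.setdefault("plugins", list(DEFAULT_PLUGINS))
    if not merged.get("subject"):
        merged["subject"] = "{}: {}".format(task_name, status)
    if not merged.get("message"):
        merged["message"] = merged["subject"]
    return merged
```

Applying defaults in one place like this would let the template drop most of its `is defined` checks, at the cost of hiding what a task will actually send.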
Comment on attachment 8763688 [details] [diff] [review]
Adds new template to notification section in release tasks, macro to add into tasks easily and changes jinja2 environment definition in make_task_graph

Jordan has provided feedback so I'll remove my name
Attachment #8763688 - Flags: feedback?(kmoir)
Attachment #8763688 - Attachment is obsolete: true
Attachment #8769729 - Flags: review?(rail)
Attachment #8769729 - Flags: review?(rail)
Second try: I overlooked a formatting error in the first one and had merged the branch to my fork's master.
Attachment #8769729 - Attachment is obsolete: true
Attachment #8769833 - Flags: review?(rail)
Comment on attachment 8769833 [details] [review]
Add notification template and setup macro call to releasetasks

This was r+ed, merged and deployed
Attachment #8769833 - Flags: review?(rail) → review+
https://github.com/cgsheeh/pulse-notify

Here is the repo, without the new log organization change pushed (although it is pushed to the Heroku remote).
Flags: needinfo?(rail)
I started commenting in a separate branch https://github.com/cgsheeh/pulse-notify/compare/master...rail:comments?expand=1

I'll be pushing more.
Flags: needinfo?(rail)
Depends on: 1294425
Blocks: 1295133
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED