Closed Bug 1622336 Opened 5 years ago Closed 5 years ago

Query the 'bugbug' schedules service as early as possible

Categories

(Firefox Build System :: Task Configuration, task, P3)

task

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ahal, Assigned: armenzg)

References

(Blocks 1 open bug)

Details

The new optimizer we're implementing in bug 1603459 queries a service to retrieve the tasks that should run on a given push. The service can take a while (~5+ min) to respond (the current implementation busy-waits for a response).

We can cut out most (all?) of this overhead by kicking off the initial query as early as possible.

E.g., we could kick this off as early as possible in the decision task, so that the service is computing results in parallel while the full task graph is being generated. Though, this would only save ~2 minutes of overhead.
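To make the parallelism concrete, here is a minimal sketch of such a fire-and-forget warm-up call at the start of the decision task. The base URL and endpoint path are placeholders for illustration, not the real service's API:

```python
from urllib.request import urlopen
from urllib.error import URLError

def prime_schedules(branch, rev, base="https://bugbug.example.org", timeout=5):
    """Kick off the schedules computation without waiting for the result.

    Called at the very start of the decision task: the service starts
    computing while the full task graph is generated, and the optimizer
    only polls for the finished result afterwards.
    """
    try:
        urlopen(f"{base}/push/{branch}/{rev}/schedules", timeout=timeout)
    except URLError:
        # Best effort: a failed warm-up just means the later poll waits longer.
        pass
```

Since the response is ignored, a failure here costs nothing beyond the time we would have waited anyway.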

Better solutions would be to either:

  1. Kick this off via a push hook running on hg.m.o
  2. Have something listening to pulse to kick it off (possibly even the service itself)
  3. Implement a taskcluster hook to do it (???)

This way the service is already starting to compute results before the decision task is even running.

I suspect that listening to pulse somehow (possibly via a taskcluster hook) is going to be a good solution to this.

I marked this low(er) priority for now, which holds as long as mach try auto is the only way to hit the service. We'll want to make it a blocker before we start calling the service on autoland, though.

Possibly the best options are:

  1. Listen for hgmo pulse message directly in the HTTP service;
  2. Trigger a hook on the hgmo pulse, which simply performs a request to the HTTP service.

I probably prefer the first, keeping things close together.
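Either way, the listener's core job is the same: filter hgmo push messages down to the repos we care about and pull out the revision to warm. Below is a sketch of just that filtering step; the payload shape (repo_url plus a list of head revisions) is an assumption loosely modelled on the hgpushes exchange, and the consumer itself (e.g. a kombu ConsumerMixin bound to the Pulse broker) is omitted:

```python
import json
from urllib.parse import urlparse

# Repos whose pushes should warm the schedules service (see comment 11).
WATCHED_REPOS = {"integration/autoland", "try"}

def extract_schedule_request(raw_message):
    """Parse an hgmo push message and return (branch, rev), or None.

    NOTE: the message shape used here is an assumption for illustration;
    check it against the real hgpushes schema before relying on it.
    """
    body = json.loads(raw_message)
    data = body.get("payload", {}).get("data", {})
    # "https://hg.mozilla.org/integration/autoland" -> "integration/autoland"
    repo = urlparse(data.get("repo_url", "")).path.lstrip("/")
    if repo not in WATCHED_REPOS:
        return None
    heads = data.get("heads") or []
    if not heads:
        return None
    return repo, heads[-1]
```

Keeping this pure (message in, request out) also makes it easy to test without a broker connection.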

(In reply to Marco Castelluccio [:marco] from comment #3)

  1. Listen for hgmo pulse message directly in the HTTP service;

Yeah, if there is an existing persistent process that can listen to pulse, that would be a good solution.

Assignee: nobody → armenzg

Armen, between the two solutions, it'd probably be better to go with (1). This way we can keep things as close together as possible (ease of deployment, no syncing problems, etc.).
I'm also open to (2) if you prefer it for some reason I haven't thought of. If so, I think the hook should be defined in bugbug (similarly to the other hooks we have in https://github.com/mozilla/bugbug/tree/master/infra), so at least we can deploy both the HTTP service and the hook at the same time (managing rollbacks would still be worse than with option 1, which is why I prefer that one).

An example hook: https://github.com/mozilla/bugbug/blob/master/infra/taskcluster-hook-classify-patch.json, and an example task to build the hook: https://github.com/mozilla/bugbug/blob/ca735e9f7caf001edd9dca252d1844fa83bfb4ff/infra/data-pipeline.yml#L1243-L1273.
Though it might be preferable to just listen for pulse in the bugbug HTTP service, rather than defining a hook.

Note, the community instance does not have access to Hg Pulse messages.

Should we add an Hg Pulse listener in bugbug and hit the hook from bugbug?

Other options are:

  1. If the hg team wanted to start sending messages [to the community instance], that could be arranged.
  2. A possible workaround is to set up a hook in firefox-ci that calls hooks.triggerHook on the community deployment; setting up a hook in firefox-ci would be something you'd need to do via ci-config.
  3. I guess the third option is just to run the tool in the firefox-ci deployment.

I don't have strong opinions. But judging by comment 3, setting up a pulse listener in bugbug was marco's preferred solution.

Marco, I believe you wanted me to run the listener as a background process as part of one of the Docker containers.
Should I be using bugbug-http-service-bg-worker or bugbug-http-service-rq-dasboard?

I thought I documented it somewhere. Which code should I call when there's a Pulse message?
Which repos should I be paying attention to?

I've been trying to modify the Dockerfile to run the ingestion as a background process.
Unfortunately I don't know how to get the output to come to the foreground.
I've managed to redirect the output to a file, however, that file is inside the Docker container.

Would it make sense to run the Pulse listener with its own dyno instead of piggybacking?

diff --git a/http_service/Dockerfile b/http_service/Dockerfile
index 85444fa..7229d4b 100644
--- a/http_service/Dockerfile
+++ b/http_service/Dockerfile
@@ -10,4 +10,5 @@ RUN pip install --disable-pip-version-check --quiet --no-cache-dir -r /requireme
 COPY . /code/http_service
 RUN pip install --disable-pip-version-check --quiet --no-cache-dir /code/http_service

-CMD gunicorn -b 0.0.0.0:$PORT bugbug_http.app --preload --timeout 30 -w 3
+# Run the Pulse listener in the background
+CMD (/code/http_service/pulse.py &> /code/http_service/pulse.log &) && gunicorn -b 0.0.0.0:$PORT bugbug_http.app --preload --timeout 30 -w 3

If I run it by hand:

armenzg@Armens-MacBook-Pro http_service % docker-compose run bugbug-http-service-rq-dasboard bash
root@ea8bd74a0110:/# /code/http_service/pulse.py & && gunicorn -b 0.0.0.0:$PORT bugbug_http.app --preload --timeout 30 -w 3
bash: syntax error near unexpected token `&&'
Flags: needinfo?(mcastelluccio)

(In reply to Armen [:armenzg] from comment #9)

Marco, I believe you wanted me to run the listener as a background process as part of one of the Docker containers.
Should I be using bugbug-http-service-bg-worker or bugbug-http-service-rq-dasboard?

I think the best option is to have the listener in the HTTP service as a background process, like you're trying to do.

I thought I documented it somewhere. Which code should I call when there's a Pulse message?

It should basically be equivalent to this already existing HTTP API: https://github.com/mozilla/bugbug/blob/e4ef1176cafefcbb77749f00c7a38dab9fcc8e7b/http_service/bugbug_http/app.py#L528.

Which repos should I be paying attention to?

Autoland and try.

(In reply to Armen [:armenzg] from comment #10)

I've been trying to modify the Dockerfile to run the ingestion as a background process.
Unfortunately I don't know how to get the output to come to the foreground.
I've managed to redirect the output to a file, however, that file is inside the Docker container.

Something like (COMMAND1 &) && COMMAND2 seems to work for me.
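The reason the one-liner in comment 10 failed is a shell parsing rule: `&` already terminates the first command, so the `&&` that follows has no command on its left-hand side. Wrapping the backgrounded command in a subshell gives `&&` a proper left operand. A quick demonstration (assumes bash is available):

```python
import subprocess

def bash(cmd):
    """Run a command string under bash and capture its output."""
    return subprocess.run(["bash", "-c", cmd], capture_output=True, text=True)

# '&' terminates the first command, so bash rejects the dangling '&&'.
broken = bash("true & && echo ok")

# The subshell '(true &)' exits 0 immediately, giving '&&' a left-hand side.
fixed = bash("(true &) && echo ok")
```

Note the subshell form also means the background job's exit status never gates the second command: `&&` only sees the subshell's immediate (successful) exit.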

Would it make sense to run the Pulse listener with its own dyno instead of piggybacking?

The HTTP service dyno is pretty light at the moment, so I think it'd make sense to add it here (otherwise we have to pay to get another dyno, which would most of the time have almost no work to do).

Flags: needinfo?(mcastelluccio)

\o/

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED