Generate builds-4hr based on Pulse messages

Status: NEW
Assignee: Unassigned
Product: Release Engineering
Component: General
Opened: 2 years ago
Updated: 5 months ago
Reporter: armenzg
Firefox Tracking Flags: Not tracked

(Reporter)

Description

2 years ago
We find more and more bugs in our current workflow of status distribution (status db --> buildjson files).
See some bugs: bug 1127101, bug 1159279, bug 942616
Probably more.

We could generate a more accurate builds-4hr if we fed directly from Pulse.

If this is proven to be a better system we could replace the current buildjson generation (day file and 4hrs).

<catlee> I'd structure it as a listener with a durable queue
<catlee> you'd keep all the state in memory, and write out a new json file periodically
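A minimal sketch of the in-memory half of that suggestion, assuming a hypothetical message shape with `id` and `endtime` fields (the real Pulse payload is not described in this bug). The durable-queue consumer is left out; its callback would simply call `on_message`. The class name, the `builds` output key, and the output path are illustrative only.

```python
import json
import os
import tempfile
import time

WINDOW_SECONDS = 4 * 60 * 60  # the "4hr" in builds-4hr


class Builds4hrState:
    """In-memory view of recently finished builds, keyed by build id."""

    def __init__(self, window=WINDOW_SECONDS):
        self.window = window
        self.builds = {}

    def on_message(self, msg):
        # Assumed (hypothetical) shape: {"id": ..., "endtime": ..., ...}.
        # A later message for the same id replaces the earlier one.
        self.builds[msg["id"]] = msg

    def prune(self, now):
        # Drop builds that finished before the start of the window.
        cutoff = now - self.window
        self.builds = {
            key: build
            for key, build in self.builds.items()
            if build["endtime"] >= cutoff
        }

    def write(self, path, now=None):
        now = time.time() if now is None else now
        self.prune(now)
        payload = {
            "builds": sorted(self.builds.values(), key=lambda b: b["endtime"])
        }
        # Write to a temp file in the same directory, then rename, so a
        # reader never sees a partially written file.
        directory = os.path.dirname(os.path.abspath(path))
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        with os.fdopen(fd, "w") as handle:
            json.dump(payload, handle)
        os.replace(tmp_path, path)
```

A periodic timer (or the consumer loop itself) would call `state.write(path)` every minute or so. Because all state lives in memory, a restart would lose the window unless the queue is durable and unacknowledged messages are replayed.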
I think this is backward. The only reason we're keeping builds-4hr is that there's not enough information in Pulse (or at least it's not organized meaningfully) for Treeherder to consume it. There's already a stalled-out project to fix that: bug 1026109.

If we have the development bandwidth to write a pulse-to-builds-4hr translator, I think that bandwidth would be better spent connecting pulse directly to treeherder, so that we can eliminate builds-4hr and friends.
(Reporter)

Comment 2

2 years ago
It's sad to see bug 1031238 as WONTFIX.
I was hoping this bug would be useful as:
1) a fix for the current issues in buildjson generation, which TH and mozci have to work around
2) a stepping stone to understanding what exactly TH would need in order to switch to Pulse

How does job status reporting for TC work on TH?
It can be re-opened -- more an admission that nobody's working on it or likely to work on it, really.
Fixing that bug requires substantial changes to TH AIUI.

Generating builds-4hr from pulse would mean TH would continue to get information the same way it's getting it now, while also being a far more efficient way of capturing this data than our current method.

As far as builds-4hr is concerned, I'm pretty sure it's accurately represented by pulse. It's builds-{running,pending} that don't have correct events generated for them.
If we switch to this approach, let's generate builds-4hr locally on the TH servers, rather than writing it to a netapp every minute.

Comment 6

2 years ago
IMO if we're going to fix anything, we should just fix bug 942616, since everything else stems from that.

Comment 7

2 years ago
Sorry, to be clearer: of the three referenced bugs in comment 0, two are bug 942616, and the third is no longer needed by Treeherder. I'm sure a fix for bug 942616 would be much less work than trying to fix the Pulse deficiencies or munge the data into a form that we can use given the current Treeherder buildbot ETL. It's a year too late to be making Treeherder use Pulse; if we wanted to do that, the buildbot-Pulse bugs needed to have been fixed a long time ago.
(Assignee)

Updated

5 months ago
Component: Tools → General
Product: Release Engineering → Release Engineering