Closed Bug 1571361 Opened 3 years ago Closed 3 years ago

Identify job ranges to backfill given perf alert

Categories

(Tree Management :: Perfherder, task, P2)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: igoldan, Assigned: igoldan)

Attachments

(1 file)

No description provided.
Summary: Identify job ranges to b backfill given perf signature → Identify job ranges to backfill given perf signature
Blocks: 1571364
Assignee: nobody → igoldan
Priority: -- → P1
Summary: Identify job ranges to backfill given perf signature → Identify job ranges to backfill given perf alert

My initial approach was to get the job ids of the jobs I want to retrigger.
To get a better picture of this, let's use this graph. It's zoomed into the suspect range I would like to backfill & retrigger. I've highlighted the start and the end of the suspect range (6941948e4cba & 4f7affad853b). The alert I want to backfill around is in the middle, at 9f55691b5cfe.

A suspect range for this would look like [259573663, 259578914, 258617143, 258617049, 258004547].

However, the query for fetching it isn't all that straightforward & seems to require some wicked Django ORM voodoo.

Another alternative I see is getting a list of (push.id, job_type.id) pairs. Would that suffice for retriggering jobs on Taskcluster?
I ask because the query for that list seems more straightforward to define.

Flags: needinfo?(cdawson)

With the current performance/summary endpoint I'm using for the final Graphs pr, the push_id and job_id will be available in the UI for each dataPoint. Does this help you? As an example: https://treeherder.mozilla.org/api/performance/summary/?repository=autoland&signature=1925819&framework=1&interval=1209600&all_data=true

(In reply to Sarah Clements [:sclements] from comment #2)

> With the current performance/summary endpoint I'm using for the final Graphs pr, the push_id and job_id will be available in the UI for each dataPoint. Does this help you? As an example: https://treeherder.mozilla.org/api/performance/summary/?repository=autoland&signature=1925819&framework=1&interval=1209600&all_data=true

I think this is going to be useful! Thanks!
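As a rough illustration of how those data points could be turned into a suspect range, here is a minimal sketch. It assumes each data point is a dict carrying the job_id, push_id, push_timestamp and push__revision fields mentioned in this bug; the function name and the sample values are hypothetical and are not taken from Treeherder.

```python
from typing import Dict, List


def jobs_in_suspect_range(data: List[Dict], start_rev: str, end_rev: str) -> List[int]:
    """Return the job ids of all data points pushed between two revisions, inclusive.

    Revisions are matched by prefix, so short hashes can be passed in.
    """

    def timestamp_of(rev: str):
        # Find the push timestamp of the first data point on the given revision.
        for point in data:
            if point["push__revision"].startswith(rev):
                return point["push_timestamp"]
        raise ValueError(f"revision {rev!r} not found in data")

    # Accept the two boundaries in either order.
    low, high = sorted((timestamp_of(start_rev), timestamp_of(end_rev)))
    return [p["job_id"] for p in data if low <= p["push_timestamp"] <= high]


# Hypothetical sample data; real values would come from the endpoint response.
sample = [
    {"job_id": 101, "push_id": 1, "push_timestamp": 1000, "push__revision": "aaaa1111"},
    {"job_id": 102, "push_id": 2, "push_timestamp": 2000, "push__revision": "bbbb2222"},
    {"job_id": 103, "push_id": 3, "push_timestamp": 3000, "push__revision": "cccc3333"},
    {"job_id": 104, "push_id": 3, "push_timestamp": 3000, "push__revision": "cccc3333"},
    {"job_id": 105, "push_id": 4, "push_timestamp": 4000, "push__revision": "dddd4444"},
]

print(jobs_in_suspect_range(sample, "bbbb", "cccc"))  # [102, 103, 104]
```

Filtering by timestamp instead of slicing by list position keeps every job on the boundary pushes, including retriggers that share a push.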

Flags: needinfo?(cdawson)

Looking more closely at the endpoint logic, I noticed that each signature JSON object gets the same data field, according to this line.

I think this is a small bug, as each signature should contain only its own data, not the data of every signature we requested.

Basically, instead of

item['data'] = data.values('value', 'job_id', 'id', 'push_id', 'push_timestamp', 'push__revision').order_by('push_timestamp')

use

item['data'] = data.filter(signature_id=item['id']).values('value', 'job_id', 'id', 'push_id', 'push_timestamp', 'push__revision').order_by('push_timestamp')

Wouldn't you agree?
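To make the concern concrete, here is a plain-Python analogue of the two variants. It is only a sketch: lists of dicts stand in for the Django queryset, and the helper names and sample rows are hypothetical. Whether the unfiltered variant actually misbehaves depends on how the data was narrowed down before this point.

```python
# Hypothetical rows standing in for the performance data queryset.
rows = [
    {"signature_id": 1, "value": 10.0, "job_id": 101},
    {"signature_id": 2, "value": 20.0, "job_id": 102},
    {"signature_id": 1, "value": 11.0, "job_id": 103},
]
signatures = [{"id": 1}, {"id": 2}]


def attach_unfiltered(signatures, rows):
    # Every signature receives the full set of rows.
    return [dict(sig, data=list(rows)) for sig in signatures]


def attach_filtered(signatures, rows):
    # Every signature receives only the rows that belong to it.
    return [
        dict(sig, data=[r for r in rows if r["signature_id"] == sig["id"]])
        for sig in signatures
    ]


for item in attach_unfiltered(signatures, rows):
    print(item["id"], [r["job_id"] for r in item["data"]])
for item in attach_filtered(signatures, rows):
    print(item["id"], [r["job_id"] for r in item["data"]])
```

With more than one signature in play, the unfiltered variant repeats all rows under each signature; with a single signature the two variants produce the same result.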

Flags: needinfo?(sclements)

Well, there are two different ways to use this API. The default is to return all signatures and their data, such as what's used by the compare view: https://treeherder.mozilla.org/api/performance/summary/?repository=mozilla-central&framework=1&interval=172800&no_subtests=true&startday=2019-08-05T21%3A52%3A12&endday=2019-08-07T21%3A52%3A12

A recent change is the addition of the all_data param. If signature and all_data are provided, it'll return only that signature's data, along with additional fields.

Signature is filtered on here: https://github.com/mozilla/treeherder/blob/master/treeherder/webapp/api/performance_data.py#L431

So then signature_ids will only be that one signature, when it's filtered on here: https://github.com/mozilla/treeherder/blob/master/treeherder/webapp/api/performance_data.py#L453
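For reference, the single-signature form of the request can be assembled as in the sketch below. The base URL and query parameters are the ones that appear in the example links in this bug; the helper function itself is hypothetical, and it only builds the URL without sending a request.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://treeherder.mozilla.org/api/performance/summary/"


def summary_url(repository: str, framework: int, interval: int,
                signature: Optional[int] = None, all_data: bool = False) -> str:
    """Build a performance/summary URL (hypothetical helper, no network access)."""
    params = {"repository": repository}
    if signature is not None:
        # Restrict the response to a single signature.
        params["signature"] = signature
    params["framework"] = framework
    params["interval"] = interval
    if all_data:
        # Ask for that signature's data points with the additional fields.
        params["all_data"] = "true"
    return f"{BASE_URL}?{urlencode(params)}"


print(summary_url("autoland", framework=1, interval=1209600,
                  signature=1925819, all_data=True))
```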

Flags: needinfo?(sclements)

This just got merged to master.

Priority: P1 → P2

This got deployed to production.

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED