Bug 1562162 Comment 8 Edit History

Note: The actual edited comment in the bug view page will always show the original commenter’s name and original timestamp.

(In reply to Kyle Lahnakoski [:ekyle] from comment #7)
> :davehunt - I would need to know more about the process of retriggers.  Ask :camd or :armenzg, or :ahal to find out who knows more about the retrigger logic:  How do retrigger jobs get scheduled?  What markup do retrigger jobs have?  (how can we distinguish them from other jobs?)  I am sure there is a property value in Treeherder, or Taskcluster, that allows us to recognize backfills; we just need to find it.  
> 
> I did a weak attempt at this with arno__ on a call the other week, but could not find such a property.

Hi Kyle. Based on Joel's guidelines from comment 3, I've poked around with Treeherder's UI and came to the same conclusion as he did.
I only needed to run SQL queries on Treeherder's database.

Both `perf jobs` & `retrigger/backfill` jobs have associated Taskcluster ids, via the `taskcluster_metadata` table.

For retrigger jobs, there's a parent - child relation, where the retrigger job acts as the parent and the perf job as the child, with the help of these Taskcluster ids. Taskcluster is the service that's keeping the record with these relations.

We're able to find all perf jobs & all retrigger/backfill jobs. **But** we're missing the relation records Taskcluster has. These can allow us to filter only the perf jobs which got retriggered or backfilled.

To me, this sounds like ActiveData recipes remains the way to perform this query.
Another approach would be to simply link the Taskcluster dependency id to Treeherder's jobs. But I'm not sure about the implication details of this. :camd does this sound like an easy implementation task? Maybe adding a table column & updating an ingestion worker?

Either way, I want to split this task in 2, for start.

(In reply to Kyle Lahnakoski [:ekyle] from comment #7)
> :davehunt - I would need to know more about the process of retriggers.  Ask :camd or :armenzg, or :ahal to find out who knows more about the retrigger logic:  How do retrigger jobs get scheduled?  What markup do retrigger jobs have?  (how can we distinguish them from other jobs?)  I am sure there is a property value in Treeherder, or Taskcluster, that allows us to recognize backfills; we just need to find it.  
> 
> I did a weak attempt at this with arno__ on a call the other week, but could not find such a property.

Hi Kyle. Based on Joel's guidelines from comment 3, I've poked around with Treeherder's UI and came to the same conclusion as he did.
I only needed to run SQL queries on Treeherder's database.

Both `perf jobs` & `retrigger/backfill` jobs have associated Taskcluster ids, via the `taskcluster_metadata` table.

For retrigger jobs *(such as AC(rt))*, there's a parent - child relation, where the retrigger job acts as the parent and the perf job as the child, with the help of these Taskcluster ids. Taskcluster is the service that's keeping the record with these relations.

We're able to find all perf jobs & all retrigger/backfill jobs. **But** we're missing the relation records Taskcluster has. These can allow us to filter only the perf jobs which got retriggered or backfilled.

To me, this sounds like ActiveData recipes remains the way to perform this query.
Another approach would be to simply link the Taskcluster dependency id to Treeherder's jobs. But I'm not sure about the implication details of this. :camd does this sound like an easy implementation task? Maybe adding a table column & updating an ingestion worker?

Either way, I want to split this task in 2, for start.

(In reply to Kyle Lahnakoski [:ekyle] from comment #7)
> :davehunt - I would need to know more about the process of retriggers.  Ask :camd or :armenzg, or :ahal to find out who knows more about the retrigger logic:  How do retrigger jobs get scheduled?  What markup do retrigger jobs have?  (how can we distinguish them from other jobs?)  I am sure there is a property value in Treeherder, or Taskcluster, that allows us to recognize backfills; we just need to find it.  
> 
> I did a weak attempt at this with arno__ on a call the other week, but could not find such a property.

Hi Kyle. Based on Joel's guidelines from comment 3, I've poked around with Treeherder's UI and came to the same conclusion as he did.
I only needed to run SQL queries on Treeherder's database.

Both `perf jobs` & `retrigger/backfill` jobs have associated Taskcluster ids, via the `taskcluster_metadata` table.

For retrigger jobs *(such as AC(rt))*, there's a parent - child relation, where the retrigger job acts as the parent and the perf job as the child, with the help of these Taskcluster ids. Taskcluster is the service that's keeping the record with these relations.

We're able to find all perf jobs & all retrigger/backfill jobs. **But** we're missing the relation records Taskcluster has. These can allow us to filter only the perf jobs which got retriggered or backfilled.

To me, this sounds like ActiveData recipes remains the way to perform this query.
Another approach would be to simply link the Taskcluster dependency id to Treeherder's jobs. But I'm not sure about the implementation details of this. :camd does this sound like an easy implementation task? Maybe adding a table column & updating an ingestion worker?

Either way, I want to split this task in 2, for start.

(In reply to Kyle Lahnakoski [:ekyle] from comment #7)
> :davehunt - I would need to know more about the process of retriggers.  Ask :camd or :armenzg, or :ahal to find out who knows more about the retrigger logic:  How do retrigger jobs get scheduled?  What markup do retrigger jobs have?  (how can we distinguish them from other jobs?)  I am sure there is a property value in Treeherder, or Taskcluster, that allows us to recognize backfills; we just need to find it.  
> 
> I did a weak attempt at this with arno__ on a call the other week, but could not find such a property.

Hi Kyle. Based on Joel's guidelines from comment 3, I've poked around with Treeherder's UI and came to the same conclusion as he did.
I only needed to run SQL queries on Treeherder's database to check that & play around Taskcluster's UI.

Both `perf jobs` & `retrigger/backfill` jobs have associated Taskcluster ids, via the `taskcluster_metadata` table.

For retrigger jobs *(such as AC(rt))*, there's a parent - child relation, where the retrigger job acts as the parent and the perf job as the child, with the help of these Taskcluster ids. Taskcluster is the service that's keeping the record with these relations.

We're able to find all perf jobs & all retrigger/backfill jobs. **But** we're missing the relation records Taskcluster has. These can allow us to filter only the perf jobs which got retriggered or backfilled.

To me, this sounds like ActiveData recipes remains the way to perform this query.
Another approach would be to simply link the Taskcluster dependency id to Treeherder's jobs. But I'm not sure about the implementation details of this. :camd does this sound like an easy implementation task? Maybe adding a table column & updating an ingestion worker?

Either way, I want to split this task in 2, for start.

(In reply to Kyle Lahnakoski [:ekyle] from comment #7)
> :davehunt - I would need to know more about the process of retriggers.  Ask :camd or :armenzg, or :ahal to find out who knows more about the retrigger logic:  How do retrigger jobs get scheduled?  What markup do retrigger jobs have?  (how can we distinguish them from other jobs?)  I am sure there is a property value in Treeherder, or Taskcluster, that allows us to recognize backfills; we just need to find it.  
> 
> I did a weak attempt at this with arno__ on a call the other week, but could not find such a property.

Hi Kyle. Based on Joel's guidelines from comment 3, I've poked around with Treeherder's UI and came to the same conclusion as he did.
I only needed to run SQL queries on Treeherder's database & play around Taskcluster's UI to check that .

Both `perf jobs` & `retrigger/backfill` jobs have associated Taskcluster ids, via the `taskcluster_metadata` table.

For retrigger jobs *(such as AC(rt))*, there's a parent - child relation, where the retrigger job acts as the parent and the perf job as the child, with the help of these Taskcluster ids. Taskcluster is the service that's keeping the record with these relations.

We're able to find all perf jobs & all retrigger/backfill jobs. **But** we're missing the relation records Taskcluster has. These can allow us to filter only the perf jobs which got retriggered or backfilled.

To me, this sounds like ActiveData recipes remains the way to perform this query.
Another approach would be to simply link the Taskcluster dependency id to Treeherder's jobs. But I'm not sure about the implementation details of this. :camd does this sound like an easy implementation task? Maybe adding a table column & updating an ingestion worker?

Either way, I want to split this task in 2, for start.

(In reply to Kyle Lahnakoski [:ekyle] from comment #7)
> :davehunt - I would need to know more about the process of retriggers.  Ask :camd or :armenzg, or :ahal to find out who knows more about the retrigger logic:  How do retrigger jobs get scheduled?  What markup do retrigger jobs have?  (how can we distinguish them from other jobs?)  I am sure there is a property value in Treeherder, or Taskcluster, that allows us to recognize backfills; we just need to find it.  
> 
> I did a weak attempt at this with arno__ on a call the other week, but could not find such a property.

Hi Kyle. Based on Joel's guidelines from comment 3, I've poked around in Treeherder's UI and come to the same conclusion as he did.
I only needed to run SQL queries on Treeherder's database & play around with Taskcluster's UI to check that.

Both `perf jobs` & `retrigger/backfill` jobs have associated Taskcluster ids, via the `taskcluster_metadata` table.

For retrigger jobs *(such as AC(rt))*, there's a parent-child relation, where the retrigger job acts as the parent and the perf job as the child, using the associated Taskcluster ids. Taskcluster is the service that keeps the record of these relations.

We're able to find all perf jobs & all retrigger/backfill jobs. **But** we're missing the relation records Taskcluster has. Those records would let us filter down to only the perf jobs which got retriggered or backfilled.
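
To make the gap concrete, here is a minimal sketch in Python with an in-memory SQLite database. It is not Treeherder's real schema: only the `taskcluster_metadata` table name comes from this comment, while the `job` table layout, the `kind` column, the task ids and the `taskcluster_children` mapping are all assumptions made up for illustration.

```python
import sqlite3

# Hypothetical, simplified schema. Only the `taskcluster_metadata` table name
# is taken from the comment; every column and row here is an assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE job (id INTEGER PRIMARY KEY, kind TEXT);
    CREATE TABLE taskcluster_metadata (job_id INTEGER, task_id TEXT);
    INSERT INTO job VALUES (1, 'perf'), (2, 'perf'), (3, 'retrigger');
    INSERT INTO taskcluster_metadata VALUES
        (1, 'task-A'), (2, 'task-B'), (3, 'task-R');
""")


def task_ids(kind):
    """Return the Taskcluster ids of all jobs of the given kind."""
    rows = conn.execute(
        "SELECT m.task_id FROM job j "
        "JOIN taskcluster_metadata m ON m.job_id = j.id "
        "WHERE j.kind = ? ORDER BY m.task_id",
        (kind,),
    )
    return [row[0] for row in rows]


# Both sets of ids can be found from the database alone...
perf_tasks = task_ids("perf")
retrigger_tasks = task_ids("retrigger")

# ...but nothing in these two tables says which perf task a retrigger produced.
# This dict stands in for the relation records that only Taskcluster holds.
taskcluster_children = {"task-R": ["task-B"]}

retriggered_perf = [
    task
    for task in perf_tasks
    if any(task in taskcluster_children.get(parent, []) for parent in retrigger_tasks)
]
```

Without the stand-in mapping, the last step has nothing to join on, which is the missing piece described above.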

To me, this sounds like ActiveData recipes remain the way to perform this query.
Another approach would be to simply link the Taskcluster dependency id to Treeherder's jobs. But I'm not sure about the implementation details of this. :camd, does this sound like an easy implementation task? Maybe adding a table column & updating an ingestion worker?
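
A rough sketch of what that second approach could look like, again in Python with an in-memory SQLite database. The `parent_task_id` column, the `ingest` helper and the sample ids are hypothetical names chosen for illustration; Treeherder's actual ingestion code and schema may look quite different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the existing table (columns are assumptions).
conn.execute("CREATE TABLE taskcluster_metadata (job_id INTEGER, task_id TEXT)")
# The hypothetical new column holding the Taskcluster id of the parent task.
conn.execute("ALTER TABLE taskcluster_metadata ADD COLUMN parent_task_id TEXT")


def ingest(job_id, task_id, parent_task_id=None):
    """Hypothetical ingestion step that also stores the parent task id, if any."""
    conn.execute(
        "INSERT INTO taskcluster_metadata VALUES (?, ?, ?)",
        (job_id, task_id, parent_task_id),
    )


ingest(1, "task-A")                           # ordinary perf job
ingest(3, "task-R")                           # retrigger job
ingest(2, "task-B", parent_task_id="task-R")  # perf job created by the retrigger

# With the link stored, the filter becomes a plain SQL query.
retriggered = [
    row[0]
    for row in conn.execute(
        "SELECT task_id FROM taskcluster_metadata "
        "WHERE parent_task_id IS NOT NULL ORDER BY task_id"
    )
]
```

The trade-off is that the ingestion worker would need to obtain the parent id from Taskcluster at ingestion time, which is the part whose cost is unclear.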

Either way, I want to split this task in two, to start with.
