Closed Bug 912185 Opened 11 years ago Closed 9 years ago

Streamline talos.json updates and talos pushes

Categories

(Testing :: Talos, defect)

x86_64
Windows 7
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 787200

People

(Reporter: avih, Unassigned)

Details

As far as I understand, the current talos updates system works as follows:

1. mozharness grabs the revision id from talos.json, and uses this revision for production runs of talos on the slaves.

2. Patches are pushed to the talos repo.

3. talos.json updates, and takes all patches from #2 into production.

The practical problem with this system is that nothing ties these steps together: it's hard to associate a talos.json update with specific pushes to the talos repo, hard to follow when the talos slaves were updated and which changes went in, and hard to know when a landed bug actually made it into production.


I suggest the following system to streamline talos repo pushes and talos.json updates:

1. We file a "Master talos.json updates" bug. This bug depends on all individual talos.json update bugs.

2. Whenever talos.json is updated, say at date:time AA:BB, we file a new bug "talos.json update following AA:BB" and make this bug block the master bug.

3. Whenever a patch is pushed into talos (say in bug X), we make bug X depend on this AA:BB bug (which is easy to identify since it's the latest dependency of the master bug), since X will take effect only after AA:BB is resolved.

4. Once AA:BB is pushed and talos.json updates, goto 2.
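To make the four steps concrete, here is a minimal sketch of the bookkeeping they imply. It is only an in-memory model: the `Bug` and `TalosUpdateTracker` names are hypothetical, and nothing here talks to Bugzilla or reads talos.json.

```python
from dataclasses import dataclass, field


@dataclass
class Bug:
    """Hypothetical stand-in for a tracker bug."""
    title: str
    depends_on: list = field(default_factory=list)
    resolved: bool = False


class TalosUpdateTracker:
    """Hypothetical model of the proposed bug layout."""

    def __init__(self):
        # Step 1: one master bug that depends on every update bug.
        self.master = Bug("Master talos.json updates")
        self.current = None

    def talos_json_updated(self, stamp):
        # Step 4: the update bug that was open has now reached production.
        if self.current is not None:
            self.current.resolved = True
        # Step 2: file the next update bug and make it block the master bug.
        self.current = Bug(f"talos.json update following {stamp}")
        self.master.depends_on.append(self.current)
        return self.current

    def patch_pushed(self, bug):
        # Step 3: the patch bug depends on the latest update bug.
        if self.current is None:
            raise RuntimeError("no open talos.json update bug yet")
        bug.depends_on.append(self.current)

    def in_production(self, bug):
        # A patch is live once every update bug it depends on is resolved.
        return bool(bug.depends_on) and all(d.resolved for d in bug.depends_on)
```

With this model, a patch pushed while "talos.json update following AA:BB" is open reports as not in production until the next talos.json update resolves that bug.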


Does this sound practical? useful? alternatives? gotchas?
We need a better system for managing talos updates, and this proposed system covers the bulk of it.  Keep in mind that right now all mobile tests require talos.zip changes and all desktop tests require talos.json changes, which adds to the confusion.
Since talos.zip is pretty much equivalent to the talos.json revision id (except that they don't necessarily update together), we could form a parallel system for talos.zip updates and make bugs depend on either or both (depending on whether the talos pushes are mobile-related, desktop-related, or both).

Or we could live with the current talos.zip system until it's replaced with mozharness. I guess it mostly depends on how long we expect talos.zip to stay around.
Also, some talos pushes only become effective after graph server updates and/or datazilla updates (and others?), which suffer from the same lack of overall association and tidiness, so maybe we could have several such systems for those as well.

The whole point of this proposal is that when a talos change is pushed in bug X, it is clear and well associated which other changes must follow before the X change becomes effective, and X depends on those.

So for example, if a new talos push at bug X affects the following:
- Desktop talos runs (-> "talos.json following AA:BB")
- Mobile talos runs (-> "talos.zip following CC:DD")
- graph server update (-> "graphserver update following EE:FF")
- datazilla update (...)
- ...

The main difference between this suggestion and the current system is that we'll be able to make bug X depend on the other required bugs right when X is pushed into talos, whereas the current system requires filing blocking followup bugs for all the required derivatives, which is easy to get wrong and very hard to follow if dependencies are not managed correctly.
In trying to understand and implement this, I have only had confusion.  I think a lot of our confusion comes from:
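As a rough illustration of the example above, the lookup from "what does this push affect" to "which open update bugs should X depend on" could be as small as the sketch below. The table and the `dependencies_for` helper are hypothetical; the bug titles are the placeholders from the list, and datazilla is left out because no title is given for it.

```python
# Hypothetical table: area touched by a talos push -> the currently open
# "update following ..." bug for that area (placeholder titles from the list).
OPEN_UPDATE_BUGS = {
    "desktop": "talos.json following AA:BB",
    "mobile": "talos.zip following CC:DD",
    "graphserver": "graphserver update following EE:FF",
}


def dependencies_for(affected):
    """Return the update bugs that bug X should depend on, in the given order."""
    unknown = sorted(set(affected) - set(OPEN_UPDATE_BUGS))
    if unknown:
        # Fail loudly rather than silently dropping a required dependency.
        raise KeyError(f"no update track known for: {', '.join(unknown)}")
    return [OPEN_UPDATE_BUGS[area] for area in affected]
```

A push that touches desktop and mobile runs would then be set to depend on both the talos.json and the talos.zip update bugs at the moment it lands.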
* mozharness/revision vs talos.zip (will be the case until tegras are gone)
* graph server requirements vs datazilla (by end of year we might be fully on datazilla)

Removing those two variables, we would just need to map a set of checkins to the talos repository to a single change in talos.json.  This could be done with version markers or any other measurement.  It would be a lot clearer.
We are really close to moving talos in tree (bug 787200). When that happens this bug will be obsolete, so I am marking it as a duplicate, since we have no intention of doing work here and every intention of resolving bug 787200 in the short term.
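One way to picture that mapping, as a sketch only: walk the talos repository checkins in order and close a group each time a revision is one that talos.json was pointed at. The `group_by_pin` name and its inputs are hypothetical, and it assumes a linear history with checkins listed oldest first.

```python
def group_by_pin(checkins, pinned):
    """Group talos checkins by the talos.json change that picked them up.

    checkins: revision ids, oldest first (assumed linear history).
    pinned:   set of revision ids that talos.json was updated to.
    Returns (groups, pending), where groups maps each pinned revision to the
    checkins it brought into production, and pending lists checkins that no
    talos.json change has picked up yet.
    """
    groups = {}
    pending = []
    for rev in checkins:
        pending.append(rev)
        if rev in pinned:
            # This talos.json change takes everything landed since the last pin.
            groups[rev] = pending
            pending = []
    return groups, pending
```

Anything left in `pending` corresponds to patches that have landed in the talos repo but are not yet in production.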
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → DUPLICATE