Closed Bug 1035384 Opened 10 years ago Closed 8 years ago

Simplify mozharness vcs sync beagle (gecko-dev) config file

Categories

(Release Engineering :: Applications: MozharnessCore, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: pmoore, Unassigned)

Details

(Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/2194] )

Attachments

(1 file)

The file http://hg.mozilla.org/build/mozharness/file/411aed63044e/configs/vcs_sync/beagle.py currently has a lot of duplication. I've created a patch which reduces the amount of duplication, but hopefully maintains readability.
Wait for my review in bug 962863 before working on this.
Please note, where previously we had a tag_config defined at a target level that was already defined at the repo level (one higher up), I removed the config at the target level.

For example: I could remove tag_config at http://hg.mozilla.org/build/mozharness/file/411aed63044e/configs/vcs_sync/beagle.py#l192 because this is already defined at http://hg.mozilla.org/build/mozharness/file/411aed63044e/configs/vcs_sync/beagle.py#l204

I also removed empty tag_config entries like http://hg.mozilla.org/build/mozharness/file/411aed63044e/configs/vcs_sync/beagle.py#l480

After this, I kept the original beagle.py script, and added a beagle2.py script which was my new version. I ran https://raw.githubusercontent.com/petemoore/myscrapbook/49bfcd097e757faf90d30eef7dc7e92ba69c415c/refactor_beagle_test.py locally, to test that my changes did not introduce any differences in the config variable. The script does this by importing the old beagle.py, outputting the config dict to file, and doing the same with the new beagle2.py. When they were producing identical output, I replaced beagle.py with the new version (beagle2.py) which is how I created this patch.
Assignee: nobody → pmoore
Status: NEW → ASSIGNED
Attachment #8451841 - Flags: review?(aki)
(In reply to Aki Sasaki [:aki] from comment #1)
> Wait for my review in bug 962863 before working on this.

Sorry, too late! I've already done it. =)
Comment on attachment 8451841 [details] [diff] [review]
bug1035384_mozharness_v1.patch

This is a very precise config file.  I very very strongly dislike the programmatic change here.  What huge win are we gaining here at the expense of precision?
Attachment #8451841 - Flags: review?(aki) → review-
So, in my view, simplicity. If you want the expanded format, you get it in the log, or you can generate it. This also lends itself better to maintenance in the general form - the settings are not duplicated, so if you need to make a change (e.g. add a mapper url) you just do it in one place. You can be sure your configs are consistent with each other.

It allows the reader to see at a glance that all the branches are treated the same, with the exception of the few exceptions listed. It means adding a new branch is probably adding a single name in a list, rather than copying and pasting a long section. It easily allows you to identify when parts have got out of sync (or rather, it helps them to stop getting out of sync - since they are defined in fewer places). This is a good example - because in the process of reigning in the config to this simplified version, I established places where the configs were not consistent (e.g. some branches had redundant tag_configs, others didn't - they weren't consistent). This was not trivial to do at a glance, it was only through performing this relatively time consuming process, that this could be established.

In short, I also understand your side - when there is a particular problem on a particular branch, if you understand how the config is structured, it is nice for it to be verbose, so that you are sure you are only touching something that directly affects the branch you are concerned with. This may help the "branch specific" problems we have. But I think what you lose from the extra verbosity is the clean overview, the guarantee that configs stay consistent with each other, and it becomes harder to perform a "higher" level of automation (e.g. something to generate the list of branches) because unless you explicitly read through each config section, and visually compare them to see how the data is determined, you will not have an overview. So it all becomes a bit "muddy". Also, the type of problems were a global change across branches is needed, becomes much simpler (and not all problems are related to tweaking an individual branch config).

So I think if you want the clarity of seeing the final generated config for a particular branch, it is trivial to generate the config, and then output it, and it is verbosely outputted in each log file anyway.

However, I do see both sides, and having an extra step to spit out the config might sometimes be more of a pain. We have something similar with buildbot configs (thinking of dump master). So I see both sides, and think it is a balancing act.
(In reply to Pete Moore [:pete][:pmoore] from comment #5)
> the process of reigning in the config to this simplified version, I
> established places where the configs were not consistent (e.g. some branches
> had redundant tag_configs, others didn't - they weren't consistent). This
> was not trivial to do at a glance, it was only through performing this
> relatively time consuming process, that this could be established.

Which are these?  I'm not going to pretend I'm perfect, so it's possible it was a mistake.  However, I did intentionally have duplication in the tag configs.  Essentially,

* in-branch tag configs determine which tags we pull into the conversion repo
* in-target tag configs determine which tags we push to the target repo

Those two are related, and in the case of converting a single repo to push to a single repo with all tags converted, it is redundant.  It is no longer redundant when you're pulling tags from many repos into a single conversion repo and want to strictly control what tags are converted and pushed.  Limiting it on the pull side prevents you from converting it in the first place, and is the ideal place to limit.  Limiting it on the push side is a failsafe.

We may eventually have further divergence.  We have been getting multiple requests for release tags and relbranches from mozilla-{beta,release,esr*} to be converted and pushed to git.  I have explained why I'm not converting release tags here: http://escapewindow.dreamwidth.org/240669.html  However, we may add relbranch support for now, which would require more divergence for esrs, beta, and release; if we do stop moving hg tags for releases we can then add release tags to those repos and those repos only.  The current layout encourages change that is limited to a known subset of repos.  Your refactor discourages this, especially if the exceptions start to span multiple screens' worth of changes like they do in our buildbot-configs.  Yes we can dump the new config, but to see the diff between configs historically, we have to update mozharness to that revision and re-dump, then diff the logs.  With the explicit layout each explicit config is viewable and annotatable in source.

We have already hit a namespace collision with relbranches across different hg repos, as noted here: http://escapewindow.dreamwidth.org/241453.html and here: https://wiki.mozilla.org/ReleaseEngineering/VCSSync/HowTo#How_to_deal_with_GECKO90_2011121217_RELBRANCH .  This is more easily dealt with if we keep each repo at the minimum viable list of branches and tags, and becomes more of a collision threat as we expand it across repos.

gecko-dev and gecko.git are highly politicized and have high cost if they become polluted and need destructive edits.  As I mention here: http://escapewindow.dreamwidth.org/238476.html , it's safest to not make changes to git history if there are enough users downstream to form a lynch mob.

I think you've seen I'm not averse to simplified/programmatic configs: build/*, l10n.  I see the value, when the repo is less critical.  For this one I think copy/pastes and explicit annotatable diffs in version control are the better option.
Component: Tools → Mozharness
Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/2181]
Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/2181] → [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/2191]
Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/2191] → [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/2194]
Assignee: pmoore → nobody
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: