Closed
Bug 1499047
Opened 6 years ago
Closed 6 years ago
Tracking bug for 2018-12-03 migration work
Categories
(Release Engineering :: Release Requests, enhancement)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: mtabara, Unassigned)
Details
+++ This bug was initially created as a clone of Bug #1489406 +++
+++ This bug was initially created as a clone of Bug #1480479 +++
Filing this in advance to start chaining dependencies for releasing 64.0.
Comment 1•6 years ago
To help avoid some of the merge day timeouts we've been hitting, we're currently thinking:
- the week before the initial merge day, ping #vcs / file a bug to push the head of beta to mozilla-release, and the head of central to mozilla-beta. These may need to be named branches to bypass the single head hook.
- we probably want to kill the resulting push's builds, or push with an empty DONTBUILD commit.
- in theory, that means the merge day push will contain many weeks' fewer commits, speeding up the hooks.
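The pre-staging idea above could be sketched roughly as follows. This is only an illustration, not the merge day tooling: the helper name `prestage_commands`, the branch name, and the example URLs are hypothetical, and it assumes Mercurial will accept a commit that records nothing but a branch name change. The function only builds the `hg` invocations so they can be reviewed; it does not run them.

```python
from typing import List


def prestage_commands(local_clone: str, source_url: str, source_rev: str,
                      dest_url: str, branch: str) -> List[List[str]]:
    """Build (but do not run) hg invocations that pre-stage a head.

    Hypothetical helper: it pulls the source head into a local clone of the
    destination, puts it on a named branch (to get past a single-head hook),
    and pushes it with a DONTBUILD commit message.
    """
    hg = ["hg", "-R", local_clone]
    message = f"Pre-stage {branch} ahead of merge day DONTBUILD"
    return [
        hg + ["pull", "-r", source_rev, source_url],   # fetch the source head
        hg + ["update", "-r", source_rev],             # check it out
        hg + ["branch", branch],                       # mark a new named branch
        hg + ["commit", "-m", message],                # branch-only commit (assumed OK)
        hg + ["push", "--new-branch", "-r", ".", dest_url],
    ]


if __name__ == "__main__":
    for argv in prestage_commands(
        "beta-clone",
        "https://example.invalid/central",   # placeholder URL
        "abc123",                            # placeholder revision
        "https://example.invalid/beta",      # placeholder URL
        "prestage-central",
    ):
        print(" ".join(argv))
```

Printing the plan first keeps the decision to actually push with a human, which seems sensible for something that touches release repos.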
We may want to verify that the merge day script deals well with this scenario... Callek, Simon, Aki, or some other volunteer should probably test this against a user repo beforehand:
- populate a user repo with beta
- push the head of central to it as a new branch
- run the merge day script, dry run, against central + user-repo-beta
- verify the push to user-repo-beta does the right thing (only push the new commits)
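For the last verification step, one possible check is to list what a push would send before doing it. The sketch below assumes `hg outgoing` with a `{node}` template is used for that; the helper names and the idea of keeping a set of pre-staged node hashes are hypothetical, not part of the merge day script.

```python
import re
from typing import List, Set

# A full Mercurial changeset hash is 40 lowercase hex characters.
NODE_RE = re.compile(r"^[0-9a-f]{40}$")


def outgoing_command(local_clone: str, rev: str, dest_url: str) -> List[str]:
    """Build the hg invocation that lists changesets a push would send."""
    return ["hg", "-R", local_clone, "outgoing", "-r", rev,
            "--template", "{node}\n", dest_url]


def parse_nodes(output: str) -> List[str]:
    """Keep only lines that look like full node hashes, skipping status lines."""
    return [line.strip() for line in output.splitlines()
            if NODE_RE.match(line.strip())]


def resent_nodes(outgoing: List[str], prestaged: Set[str]) -> List[str]:
    """Return pre-staged nodes that would be pushed again (ideally none)."""
    return [node for node in outgoing if node in prestaged]


if __name__ == "__main__":
    sample = "comparing with dest\nsearching for changes\n" + "a" * 40 + "\n"
    nodes = parse_nodes(sample)
    print(len(nodes), resent_nodes(nodes, {"b" * 40}))
```

If `resent_nodes` comes back non-empty, or the outgoing list is far longer than the commits landed since pre-staging, the dry run is probably not doing what we hoped.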
Pretty sure this will work. This isn't our target end state, but this could get us to smoother merge days at relatively low cost until we can implement a long term solution.
Comment 2•6 years ago
gps, sheehan: do you have any concerns with this approach? We could set up a meeting with releng to discuss as needed and make sure we are on the same page.
Flags: needinfo?(sheehan)
Flags: needinfo?(gps)
Comment 3•6 years ago
I believe all known issues related to large pushes failing have been fixed and there should no longer be an issue with large pushes failing going forward.
For reference, the two issues/fixes were:
* Kafka connection timeouts on the server when sending replication messages (bug 1415233)
* SSH connection timeouts due to channel inactivity (bug 1499204)
There's still an issue where some hooks may take a few minutes to run. But this is a perf/optimization issue and not a fundamental reliability issue that jeopardizes releases.
From the perspective of release pipeline robustness where it intersects with VCS, I think all is now well and no special process or follow-up is needed. I welcome being proved wrong, at which point we'll fix hg.mozilla.org to support large pushes, as large pushes should "just work."
Of course, we may want to pursue non-VCS related changes to improve robustness. But that may be outside the scope of this bug?
Flags: needinfo?(sheehan)
Flags: needinfo?(gps)
Comment 4•6 years ago
(In reply to Gregory Szorc [:gps] from comment #3)
> I believe all known issues related to large pushes failing have been fixed
> and there should no longer be an issue with large pushes failing going
> forward.
Okay,
Great to see the biggest issues with last few merges have been resolved.
I think the proposal that aki suggests in comment 1 is good but I understand that it involves VCS. If things "just work" that's fine so long as VCS are willing to continue supporting and ensuring that hg.m.o can do large pushes.
One thing that has been challenging in the past is testing this, particularly the push: afaik, even if we were to push to a staging repo, the hooks and configuration would be different from those on, say, m-r or m-b. gps, do you have any thoughts on this? Also, can you confirm that thanks to bug 1415233, we don't need to ask to disable vcsreplicator hooks[0] anymore? I believe that was supposed to be resolved last cycle but we got bit by it again.
[0] https://github.com/mozilla-releng/releasewarrior-2.0/blob/master/docs/mergeduty/howto.md#disable-migration-blocking-hgmo-hooks
Flags: needinfo?(gps)
Comment 5•6 years ago
> I believe that was supposed to be resolved last cycle but we got bit by it again.
IIRC the hook wasn't disabled this cycle
Comment 6•6 years ago
I just resolved bug 1415233 because that issue was fixed several weeks ago.
With that issue out of the way, we uncovered a separate issue with the SSH channel timing out. Bug 1499204 landed a permanent fix for that.
With those issues out of the way, I'm quite confident that large pushes should "just work." If they don't, I consider it a P1 bug against hg.mozilla.org.
Regarding the hooks and configuration being different, it is a long-standing issue that people don't have visibility into the repo-specific hgrc modifications that are made on the server.
I think we should bite the bullet and vendor hgrc snippets into version-control-tools so that a) there is visibility, b) others are able to change things, and c) nobody has to edit files directly on the server. I filed bug 1504811 to track this.
That leaves us with intermittent `hg robustcheckout` failures. Please keep filing bugs and chaining them to the tracker for any issues you encounter in release automation.
Flags: needinfo?(gps)
Comment 7•6 years ago
@gps - so just to be explicit, can we remove the step in our runbook that asks vcs to remove the ftl and vcsreplicator hooks? https://github.com/mozilla-releng/releasewarrior-2.0/blob/master/docs/mergeduty/howto.md#disable-migration-blocking-hgmo-hooks
Flags: needinfo?(gps)
Comment 8•6 years ago
(In reply to Jordan Lund (:jlund) from comment #7)
> @gps - so just to be explicit, can we remove this step of asking vcs to
> remove the ftl and vcsreplicator hooks in our runbook:
> https://github.com/mozilla-releng/releasewarrior-2.0/blob/master/docs/mergeduty/howto.md#disable-migration-blocking-hgmo-hooks
The vcsreplicator hook for sure. The FTL hook, I'm not sure. I consider it a bug if we need to disable hooks to allow legitimate pushes to go through. So if the FTL hook is problematic, we should fix the hook.
I /think/ the last merge day went off without a hitch. So maybe things are good?
Flags: needinfo?(gps)
Updated•6 years ago
Summary: Tracking bug for 2018-12-11 migration work → Tracking bug for 2018-12-03 migration work
Comment 9•6 years ago
(In reply to Gregory Szorc [:gps] from comment #8)
> (In reply to Jordan Lund (:jlund) from comment #7)
> > @gps - so just to be explicit, can we remove this step of asking vcs to
> > remove the ftl and vcsreplicator hooks in our runbook:
> > https://github.com/mozilla-releng/releasewarrior-2.0/blob/master/docs/mergeduty/howto.md#disable-migration-blocking-hgmo-hooks
>
> The vcsreplicator hook for sure. The FTL hook, I'm not sure. I consider it a
> bug if we need to disable hooks to allow legitimate pushes to go through. So
> if the FTL hook is problematic, we should fix the hook.
>
> I /think/ the last merge day went off without a hitch. So maybe things are
> good?
I'm going to do Monday's merge with the expectation that we will not disable any hooks. Should disabling any be needed, I'll both n-i here and ping in #vcs to get that done (failing a response by evening, I may seek out greg or connor in person).
Comment 10•6 years ago
beta->release, central->beta, esr version bump and l10n-bumper are done.
Reporter
Comment 11•6 years ago
I think it's safe to close this bug. Feel free to reopen if I missed anything.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED