All trees closed, changelogs broken or not updating, no ingestion possible by Treeherder
Categories
(Developer Services :: Mercurial: hg.mozilla.org, defect)
Tracking
(Not tracked)
People
(Reporter: aryx, Assigned: sheehan)
Details
https://hg.mozilla.org/try/ shows an empty changelog. Consequence: https://treeherder.mozilla.org/#/jobs?repo=try doesn't show more recent pushes which the server still accepts.
Reporter | ||
Updated•5 years ago
Comment 1•5 years ago
From IRC #vcs (timezone ET):
9:36 AM <hg-deploy-bot> Started deploy of revision 25c9bed28326 to hg.mozilla.org; previous 842df4a82218
10:32 AM <pulsebot> Check-in: https://hg.mozilla.org/hgcustom/version-control-tools/rev/83cdb0bd4dcf - Connor Sheehan - ansible/hg-ssh-server: change scm_allow_direct_push gid to 692 (Bug 1515119) r=glob
11:19 AM <hg-deploy-bot> Finished deploy of hooks and extensions to hg.mozilla.org
Assignee | ||
Comment 2•5 years ago
I'm looking now. This morning's deploy shouldn't have caused this issue but I doubt it's a coincidence.
Also, wrong Lars. :)
Assignee | ||
Comment 3•5 years ago
This should be fixed now. The problem here was related to some stale config in the recent deploy to hgmo. The replication system has a process that monitors all relevant Kafka consumer groups and hides changesets from public view until all mirrors have received them. The purpose of this is to avoid having one mirror display a changeset (during an hg pull, for example) while another mirror has not yet pulled down the new changeset.
We use a file to track the relevant Kafka consumer groups. The name of each Kafka consumer group is derived from the local hostname of the given host. This is done during runs of Ansible playbooks (Ansible pulls the hostname into a "facts" object, which we use to specify a hostname).
I updated the hostnames locally on the hgweb mirrors, since the current hostname was simply the private IP address converted to a dash-separated string, and I wanted something more verbose. However, I did not update the groups file, so the Ansible deployment this morning caused the Kafka consumer group name to be overwritten. The replication consistency process was then waiting for the now-dead consumer group to acknowledge messages, which it would never do.
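To illustrate why a single stale entry stalls everything, here is a minimal Python sketch of the idea, not the actual v-c-t code. It assumes the consistency process exposes only messages that every tracked consumer group has acknowledged. The function name, the group names, and the offsets are all hypothetical.

```python
def fully_replicated_offset(tracked_groups, acked_offsets):
    """Return the highest offset acknowledged by every tracked group.

    A group with no recorded acknowledgement is treated as -1,
    meaning it has acknowledged nothing.
    """
    return min(acked_offsets.get(group, -1) for group in tracked_groups)


# Hypothetical state: the old IP-style group stopped at offset 100 when the
# host was renamed; the renamed consumers have kept up to offset 120.
acked = {"ip-10-0-0-1": 100, "hgweb1": 120, "hgweb2": 120}

# The groups file still lists the old name, so the minimum never advances.
stale_groups = ["ip-10-0-0-1", "hgweb2"]
print(fully_replicated_offset(stale_groups, acked))  # 100

# With the dead group removed from the file, new changesets become visible.
fixed_groups = ["hgweb1", "hgweb2"]
print(fully_replicated_offset(fixed_groups, acked))  # 120
```

Under this assumption, pushes are still accepted and written to Kafka, but nothing past the dead group's last acknowledged offset is exposed, which matches the empty or frozen changelogs seen here.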
I've removed the bad Kafka group names from the file manually, and I'll be pushing the updated file to v-c-t shortly.
Sorry for the inconvenience!