Closed Bug 970487 Opened 9 years ago Closed 8 years ago
make sure hg webheads are in sync
Filing a bug after the issue has been resolved, for tracking purposes. AIUI:

* We announced we were adding a new webhead to hg.m.o at 8am PST Monday, Feb 10. The announcement went out Friday afternoon.
* The new webhead went live without additional notifications, so there was confusion as to whether things were still in flux.
* We had several (7) ISE 500 related error emails from new vcs-sync in the period between 3:29am and 7:37am PST. This isn't hugely out of the ordinary, sadly.
* We started getting 404 errors for certain hg.m.o/releases/ repos starting at 8:03am PST.
** 8:03 mozilla-b2g28_v1_3t
** 8:06 mozilla-b2g26_v1_2f and mozilla-b2g28_v1_3t
** 8:08 mozilla-b2g26_v1_2f and mozilla-b2g28_v1_3t and mozilla-b2g28_v1_3
** 8:24, 8:25, 8:53 mozilla-b2g26_v1_2f
** 9:22 mozilla-b2g28_v1_3t and mozilla-b2g28_v1_3
** ... lots more

The 404s were new, but I didn't know that the webhead install was done, so I waited to report until the all-clear. At 10:30 bkero let me know that the webhead add was done, so I reported the issue in IRC, and we were able to resolve it by 11:00.

The underlying issue seems to be that we're listing these repositories manually per webhead, and nothing seems to be syncing those lists, so they're out of sync.
Per comment 0, the webheads seem to be manually set up, and can get out of sync. We're seeing a lot of intermittent issues with hg.m.o, and it's not always clear whether it's a blip, a performance issue, or a webhead configuration issue. If we had some automated way of syncing (puppet?) that would help remove one variable. Morphing the bug.
Summary: retroactive bug: new webhead was missing certain repositories → make sure hg webheads are in sync
The webheads are not manually set up. Their configuration is (as always) entirely under the control of Puppet. You're welcome to look at the puppet module at http://github.com/bkero/puppet-module-hg.

The repositories were always synced correctly to the new webhead. The problem was that a configuration file, which is manually created with each repository, was not synced to the webhead. This is because when I did the initial rsync of this set of configuration files the repositories in question did not exist yet.

fubar and I have been discussing solutions to the problem, and since new repository creation is so rare we feel that automation through a script that is run every time a repository is created is sufficient. The same is also true with hgweb templates.

Once the 'hgweb.config' files were copied to the webhead the repositories appeared in the web interface and were clonable.

Does that create understanding?
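For illustration only, the kind of drift described here could be caught by comparing the repository entries in two copies of an 'hgweb.config'. This is a minimal sketch, not the script discussed above: it assumes the file lists one repository per line under a `[paths]` section, and the function names and sample paths are hypothetical.

```python
import configparser


def hgweb_paths(config_text):
    """Return the set of entry names listed under [paths] in an hgweb.config."""
    parser = configparser.ConfigParser(
        interpolation=None,   # values are plain filesystem paths
        delimiters=("=",),    # only "=" separates name and path
        strict=False,
    )
    parser.optionxform = str  # keep repository names case-sensitive
    parser.read_string(config_text)
    if not parser.has_section("paths"):
        return set()
    return set(parser.options("paths"))


def missing_repos(reference_text, webhead_text):
    """List entries present in the reference config but absent on the webhead."""
    return sorted(hgweb_paths(reference_text) - hgweb_paths(webhead_text))


if __name__ == "__main__":
    # Hypothetical sample contents; real paths on the webheads may differ.
    reference = (
        "[paths]\n"
        "releases/mozilla-b2g26_v1_2f = /repo/releases/mozilla-b2g26_v1_2f\n"
        "releases/mozilla-b2g28_v1_3 = /repo/releases/mozilla-b2g28_v1_3\n"
    )
    new_webhead = (
        "[paths]\n"
        "releases/mozilla-b2g26_v1_2f = /repo/releases/mozilla-b2g26_v1_2f\n"
    )
    for name in missing_repos(reference, new_webhead):
        print("missing on webhead:", name)
```

A check like this could run after the repository-creation script, so that a webhead added from an older rsync snapshot is flagged before it starts serving 404s.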
(In reply to Ben Kero [:bkero] from comment #2)
> This is because when I did
> the initial rsync of this set of configuration files the repositories in
> question did not exist yet.

fwiw - I solved a similar problem by putting all config files in a central directory, then installing a symlink to that where the app required the files to exist.

> since new
> repository creation is so rare we feel that automation through a script that
> is run every time a repository is created is sufficient.

fyi, there will be repositories created every release cycle (6 weeks) to support b2g. Still "rare" in an automation sense.

> Does that create understanding?

Yes - thanks for the RFO and description of the ongoing mitigation options.
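A rough sketch of the central-directory-plus-symlink approach mentioned in this comment, assuming a POSIX filesystem. The helper name `link_config` and any paths passed to it are hypothetical, and this is not a description of how hg.m.o is actually laid out.

```python
import os


def link_config(central_file, expected_path):
    """Make expected_path a symlink to central_file, replacing any stale copy."""
    parent = os.path.dirname(expected_path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    tmp = expected_path + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)          # clear a leftover from an interrupted run
    os.symlink(central_file, tmp)
    os.replace(tmp, expected_path)  # rename over the old file in one step
```

Because every consumer then reads the same central file, a repository added later shows up everywhere the symlink is installed, without a separate copy step per location.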
Assignee: server-ops-webops → nobody
Component: WebOps: Source Control → Repos and Hooks
Product: Infrastructure & Operations → Release Engineering
QA Contact: nmaul → hwine
Product: Release Engineering → Developer Services
Whiteboard: [kanban:engops:https://kanbanize.com/ctrl_board/6/155] → [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/776] [kanban:engops:https://kanbanize.com/ctrl_board/6/155]
Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/776] [kanban:engops:https://kanbanize.com/ctrl_board/6/155] → [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/776]
fixed long ago
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED