Try repo is having issues.

NEW
Unassigned

Status

Developer Services
Mercurial: hg.mozilla.org
4 months ago
4 months ago

People

(Reporter: KWierso, Unassigned)

Tracking

(Blocks: 1 bug)

Description (Reporter)

4 months ago
From #developers:
08:58:28 <bobowen> is anyone else having trouble pushing to try?
08:59:34 <%KWierso> bobowen: it was closed earlier for troubles with the pushlog being ingested
08:59:42 <%KWierso> but iirc, that passed and things reopened
09:00:04 <%KWierso> define "trouble"? :)
09:00:55 <bobowen> KWierso: well initially I got: remote: waiting for lock on working directory of /repo/hg/mozilla/try held by process '30618' on host 'hgssh4.dmz.scl3.mozilla.com/effffffc'
09:00:55 <bobowen> remote: abort: working directory of /repo/hg/mozilla/try: timed out waiting for lock held by 'hgssh4.dmz.scl3.mozilla.com/effffffc:30618'
09:01:34 <bobowen> KWierso: now I'm getting: remote: abort: abandoned transaction found! remote: (run 'hg recover' to clean up transaction)
09:01:49 <Standard8> KWierso: bobowen: I’ve just seen a bug with https://bugzilla.mozilla.org/show_bug.cgi?id=1386684#c5


From #vcs:
09:03:03 <KWierso> gps/fubar: seeing reports that people are having trouble pushing to try
09:03:18 <KWierso> bobowen in #developers said his push timed out waiting for lock
09:03:39 <KWierso> > working directory of /repo/hg/mozilla/try: timed out waiting for lock held by 'hgssh4.dmz.scl3.mozilla.com/effffffc:30618'
09:03:53 <KWierso> and the autolander had a similar problem in https://bugzilla.mozilla.org/show_bug.cgi?id=1386684#c5
09:03:57 <firebot> Bug 1386684 — NEW, dbugs@thebanners.uk — Enable ESLint for toolkit/components/url-classifier
09:05:07 <KWierso> haven't seen any recent alerts here or in #buildduty so I don't know if this is more widespread and warrants a closure
09:08:32 <~fubar> looks like hskupin had a push to try that took a long, long time
09:09:08 <~fubar> 15:21Z to 15:41Z
09:10:12 <~fubar> there are no extant locks, and subsequent pushes look to have completed quickly
09:10:57 <~fubar> 2017/08/07 15:41:36 hskupin@mozilla.com @0000000000000000000000000000000000000000 (30618)> -R /repo/hg/mozilla/try serve --stdio exited -1 after 1226.10 seconds
09:11:52 <~fubar> hrm, pushes after that are taking ~12 seconds, whereas before they were much faster
09:19:04 <KWierso> fubar: hrm, looks like there's an abandoned transaction on the server's side, as bobowen is still unable to push
09:19:21 <KWierso> the error says to run hg recover, but running that locally says there's no aborted transactions to recover
09:27:14 <Standard8> I’ve gotta head to dinner, but if anyone can help with the pushing to try issues of autoland in https://bugzilla.mozilla.org/show_bug.cgi?id=1386684#c5 and comment 6, that’d be great
09:27:17 <firebot> Bug 1386684 — NEW, dbugs@thebanners.uk — Enable ESLint for toolkit/components/url-classifier
09:27:18 <Standard8> bbiab
09:46:00 <gps> fubar: what's the status of things?
09:49:28 <~fubar> looks like try is still somewhat unhappy and needs a recover. I need to go grab some food asap, though
09:53:40 <gps> we've had multiple soft corruptions since upgrading to 4.2, ugh




Filing this to have something to point at when I close Try.

Comment 1 (Reporter)

4 months ago
Looks like things have recovered, so I reopened try and pushes are going through. Dunno if we want to leave this open for investigating those "multiple soft corruptions".

Comment 2

4 months ago
Bug 1387324 was likely the first report of "soft corruption" due to an interrupted push. It definitely seems to be correlated with the Mercurial 4.2 upgrade last week, since we'd had no reports of repos getting wedged due to incomplete transactions for months.
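
For triage, it may help to confirm that the lock holder in the push error and the long-running serve process in the server log are the same process. The following is a minimal sketch, not part of any hg.mozilla.org tooling: the helper names `lock_holder` and `serve_exit` are hypothetical, and the regular expressions assume the message formats seen in the two lines quoted in the description, which may not hold for other Mercurial versions or log configurations.

```python
import re

# Lock-timeout error, as quoted from #developers in the description.
LOCK_ERROR = (
    "remote: abort: working directory of /repo/hg/mozilla/try: timed out "
    "waiting for lock held by 'hgssh4.dmz.scl3.mozilla.com/effffffc:30618'"
)

# Server log line, as quoted by fubar in #vcs.
SERVE_LOG = (
    "2017/08/07 15:41:36 hskupin@mozilla.com "
    "@0000000000000000000000000000000000000000 (30618)> "
    "-R /repo/hg/mozilla/try serve --stdio exited -1 after 1226.10 seconds"
)


def lock_holder(message):
    """Return (holder, pid) from a "lock held by '...'" message, or None.

    Assumed format: lock held by '<holder>:<pid>'.
    """
    m = re.search(r"lock held by '([^':]+):(\d+)'", message)
    return (m.group(1), int(m.group(2))) if m else None


def serve_exit(line):
    """Return (pid, exit_code, seconds) from a serve exit log line, or None.

    Assumed format: '(<pid>)> <command> exited <code> after <secs> seconds'.
    """
    m = re.search(r"\((\d+)\)> .* exited (-?\d+) after ([\d.]+) seconds", line)
    return (int(m.group(1)), int(m.group(2)), float(m.group(3))) if m else None


holder, lock_pid = lock_holder(LOCK_ERROR)
serve_pid, exit_code, seconds = serve_exit(SERVE_LOG)

# True when the process named in the lock error is the one that exited abnormally.
same_process = lock_pid == serve_pid
```

With the two quoted lines, both helpers report pid 30618, and the 1226.10 second runtime is roughly 20 minutes, which lines up with the 15:21Z to 15:41Z window mentioned in #vcs.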

Updated

4 months ago
Blocks: 1359641