Closed Bug 1740791 Opened 11 months ago Closed 10 months ago

Lando was unable to land any revisions early morning EST November 11 2021

Categories

(Conduit :: Lando, defect)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dkl, Assigned: zeid)

Details

Attachments

(1 file)

Slack Log:
jbuck @dkl is there a bug for lando not landing?
I restarted the landing worker, hopefully that gets us unstuck for now
dkl: Sorry do not know of one nor did I create one. We can certainly do so
jbuck: yeah, let's create one for tracking
I think the restart did the trick
I see patches… trying to land
however none are applying cleanly, boo
dkl: That's great. I did not see anything related in the logs but i might not have looked at the right one
jbuck: "Aborting, could not apply patch buffer for 130921, 506963."
yeah, I didn’t see anything in the logs either… just kinda stopped working?
dkl: Yeah just saw that on one of the ones I was looking at
sometimes containers hang. used to happen a lot with the push daemon for BMO
are any other revisions applying cleanly or are all failing?
ryanvm is trying to reland
jbuck: I’m not seeing any revisions applying cleanly
dkl: are there repo watchers that are hung that keep the copies up to date?
I don't know the architecture of the landing system personally
I know that Phabricator does have background tasks that look for changes in upstream repos and keep them synced
jbuck: they should be synced, but yeah, something to check
jbuck: @dkl is there a channel I should be in to provide updates to ryanvm and the others? or you’ll do that
jbuck: okay, I paused the landing work while I look at the hg repos
(by running lando-cli run-pre-deploy-sequence)
here’s the current state of the m-c repo on disk
app@api-prod-lando-landingworker-1-0:/repos/mozilla-central$ hg summary
parent: 598979:f69f96e51a46
Merge mozilla-central to autoland on a CLOSED TREE
branch: default
commit: (clean)
update: 16 new changesets (update)
phases: 16 draft
okay
I think the problem is that this copy of m-c has extra commits that regular m-c doesn’t
dkl: @ryanvm is there a better channel for this discussion?
ryanvm: my read on this is that Lando didn't fully land anba's patches on autoland
and is now in a corrupted state
jbuck: yeah
ryanvm: |hg recover| might work, but I'm honestly not sure I fully understand the implications of doing that
vs. blowing it away and just pulling a fresh copy of autoland
jbuck: got it, I can do that
(blowing away the repo, that is)
ryanvm: what bug # was anba's push under?
my past experience with hg recover is that it takes forever with not always good results
jbuck: okay, I’ll move the busted m-c copy
and restart the landing worker, which will clone a fresh m-c
ryanvm: sgtm
dkl: could the failed landing have caused all other landings to back up as well?
ryanvm: yeah, I could see that
the repo is in a corrupted state
jbuck: okay, restarting now
ryanvm: I'm very interested to know what bug broke things
jbuck: Bug 1738422
ryanvm: anba is often doing things with i18n and stuff and I wonder if he hit a weird mercurial edge case or something
thanks
ugh, ICU
oof, that's a big pile of patches and I bet at least some of them are very large
jbuck: okay, it’s re-cloning m-c now
I kept the busted repo around as a rename if it’s interesting to look at the repo state
ryanvm: I really wonder what'll happen if we try to land that ICU stack again
jbuck: part of me wants to see if it can wait until monday when zeid is back :sweat_smile:
because yeah, it’s weird that this patch series just wedged lando entirely
it just stopped doing stuff
ryanvm: yeah, if this gets us back to good, I can reach out to anba and ask him to hold off
jbuck: I think that’d be good yeah
Aryx: thank you all, in the past Lando rejected patches too big to land, maybe this time it's even bigger and fails at an earlier stage? gotta go afk for a few more minutes, will respond to any questions afterwards
ryanvm: I feel like ICU may have been a testcase before for Lando & Phab's limits :laughing:
jbuck: haha
okay, it cloned m-c
app@api-prod-lando-landingworker-1-0:/repos/mozilla-central$ hg summary
parent: 598979:f69f96e51a46 tip
Merge mozilla-central to autoland on a CLOSED TREE
branch: default
commit: (clean)
update: (current)
that looks much better
I’ll unpause landing now and we’ll see what happens
unpaused
@ryanvm you could try landing a patch now
ryanvm: https://lando.services.mozilla.com/D130973/ is on its way hopefully
jbuck: there it goes
I think it worked
ryanvm: i see new pushes to autoland :success-kid:
jbuck: :catjam:
ryanvm: it's a 30MB diff
so yeah, I definitely think we'll want to have zeid around next time to see if there's anything we can do to handle this better
otherwise we may need to manually import the patches and push them to autoland (though that doesn't seem like a great option in the long run)
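
For anyone revisiting this incident: the two `hg summary` outputs in the log above differ in the `update:` and `phases:` lines, and that difference is what suggested the landing worker's clone held commits that upstream did not. Below is a minimal Python sketch of checking for those signs. The function names, the set of fields inspected, and the choice to treat draft changesets or a pending update as warnings are assumptions made for illustration; none of this is part of Lando.

```python
import re

# Fields of `hg summary` output that this sketch inspects (an assumption;
# the real command can print additional fields).
KNOWN_FIELDS = {"parent", "branch", "commit", "update", "phases"}

def parse_hg_summary(text):
    """Collect the 'key: value' lines of `hg summary` output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        # Commit description lines are skipped: their prefix is not a known field.
        if sep and key in KNOWN_FIELDS:
            fields[key] = value.strip()
    return fields

def suspicious_signs(fields):
    """Return human-readable reasons the clone may have diverged (heuristic)."""
    reasons = []
    draft = re.search(r"(\d+) draft", fields.get("phases", ""))
    if draft and int(draft.group(1)) > 0:
        reasons.append(f"{draft.group(1)} unpublished (draft) changesets")
    pending = re.search(r"(\d+) new changesets", fields.get("update", ""))
    if pending:
        reasons.append(f"working directory is {pending.group(1)} changesets behind")
    if fields.get("commit", "(clean)") != "(clean)":
        reasons.append("uncommitted changes in the working directory")
    return reasons

# The two states quoted in the log above.
BEFORE_RECLONE = """parent: 598979:f69f96e51a46
Merge mozilla-central to autoland on a CLOSED TREE
branch: default
commit: (clean)
update: 16 new changesets (update)
phases: 16 draft
"""

AFTER_RECLONE = """parent: 598979:f69f96e51a46 tip
Merge mozilla-central to autoland on a CLOSED TREE
branch: default
commit: (clean)
update: (current)
"""

if __name__ == "__main__":
    for label, text in (("before", BEFORE_RECLONE), ("after", AFTER_RECLONE)):
        print(label, suspicious_signs(parse_hg_summary(text)))
```

A check like this only flags symptoms. It cannot tell whether a clone is recoverable, so the decision between `hg recover` and a fresh clone would still be a judgment call like the one made in the log.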

Assignee: nobody → zeid
Status: NEW → ASSIGNED
Pushed by zzabaneh@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/79bceef9f7f2
temporarily disable autoformatting r=sheehan DONTBUILD

Quick update: the original stack that caused the blockage has landed without issue with autoformatting disabled.

Status: ASSIGNED → RESOLVED
Closed: 10 months ago
Resolution: --- → FIXED