Closed Bug 1853965 Opened 2 years ago Closed 2 years ago

make "timed out waiting for lock on try repo" a temporary failure

Categories

(Conduit :: Lando, enhancement)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: sheehan, Assigned: sheehan)

References

(Blocks 1 open bug)

Attachments

(1 file)

When try load is high, pushes to try via Lando can fail due to timeouts, after which they will not be retried. We should mark these failures as temporary so they are retried.
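As a rough illustration of the idea (not Lando's actual implementation), the sketch below classifies a failed push as temporary when the hg output contains the lock timeout message, and retries only those failures. The names `TemporaryFailure`, `PermanentFailure`, `classify_push_error`, and `push_with_retries`, as well as the `push` callable returning `(ok, output)`, are hypothetical and exist only for this example.

```python
import time
from typing import Callable, Tuple

# Substring of the hg error message reported in this bug.
LOCK_TIMEOUT_MARKER = "timed out waiting for lock"


class TemporaryFailure(Exception):
    """Hypothetical: a failure that is expected to clear up, so retry."""


class PermanentFailure(Exception):
    """Hypothetical: a failure that should be reported without retrying."""


def classify_push_error(output: str) -> Exception:
    """Map hg push error output to a temporary or permanent failure."""
    if LOCK_TIMEOUT_MARKER in output:
        return TemporaryFailure(output)
    return PermanentFailure(output)


def push_with_retries(
    push: Callable[[], Tuple[bool, str]],
    max_attempts: int = 3,
    delay_seconds: float = 0.0,
) -> int:
    """Call push() until it succeeds, retrying only temporary failures.

    Returns the number of attempts used. Raises the classified error if a
    permanent failure occurs or the attempts are exhausted.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 0
    while True:
        attempt += 1
        ok, output = push()
        if ok:
            return attempt
        error = classify_push_error(output)
        # Give up on permanent failures, or when no attempts remain.
        if isinstance(error, PermanentFailure) or attempt >= max_attempts:
            raise error
        time.sleep(delay_seconds)
```

A real job runner would more likely put the job back on its queue than loop in place; the loop just keeps the example short.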

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED

This has just happened to me:

Login successful.
Auth0 token validated.
Using d2acd6e78b18b976d2682f71220ffd0057c37725 as the hg base commit.
Submitting stack of 6 nodes.
Patches gathered for submission.
Submitting patches to Lando.
Lando try submission success, took 3.68 seconds. Landing job id: 67600

and a few minutes later, an email:

Your request to land Bug 1854654 - Add armhf/aarch64 cross-compilation for Snap Upstream failed.

See https://lando.services.mozilla.com/Bug 1854654 - Add armhf/aarch64 cross-compilation for Snap Upstream/ for details.

Reason:
Unexpected error while pushing to ssh://hg.mozilla.org/try.
hg error in cmd: hg push -r tip ssh://hg.mozilla.org/try -f: pushing to ssh://hg.mozilla.org/try
searching for changes
remote: waiting for lock on working directory of /repo/hg/mozilla/try held by process '10704' on host 'hgssh1.dmz.mdc1.mozilla.com/effffffc'
remote: abort: working directory of /repo/hg/mozilla/try: timed out waiting for lock held by 'hgssh1.dmz.mdc1.mozilla.com/effffffc:17560'

remote: Warning: Permanently added 'hg.mozilla.org' (ED25519) to the list of known hosts.
abort: stream ended unexpectedly (got 0 bytes, expected 4)
Status: RESOLVED → REOPENED
Flags: needinfo?(sheehan)
Resolution: FIXED → ---

Our convention is to close a bug when the changes are merged, and not necessarily deployed. This will be deployed with the next Lando release. Definitely could be a little confusing, but you can keep track of releases here: https://github.com/mozilla-conduit/lando-api/releases.

Status: REOPENED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED

(In reply to Zeid Zabaneh [:zeid] from comment #3)

Our convention is to close a bug when the changes are merged, and not necessarily deployed. This will be deployed with the next Lando release. Definitely could be a little confusing, but you can keep track of releases here: https://github.com/mozilla-conduit/lando-api/releases.

I was told a week ago (in #Conduit on Matrix) that this was being deployed. I have done a lot of pushes over the past few days and none failed, so I'm a bit puzzled.

(In reply to :gerard-majax from comment #4)

Sorry for the confusion. I have deployed this change to our development servers but not yet to our production servers. Older versions of the --push-to-lando patch submitted to the development servers, but I have since updated the patch to submit to production Lando.

I'll deploy the change to production Lando on Monday. If you're hitting those errors frequently, you can set LANDO_TRY_USE_DEV in the environment to submit to the development Lando servers again.
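For illustration only, a client could choose between the two servers along these lines. This is a sketch under assumptions, not the actual --push-to-lando code: both URLs are placeholders, the function name `select_lando_url` is hypothetical, and treating any non-empty value of LANDO_TRY_USE_DEV as "use dev" is a guess at the semantics.

```python
import os
from typing import Mapping, Optional

# Placeholder hosts: the real Lando endpoints are not given in this bug.
PRODUCTION_LANDO_URL = "https://lando.example.invalid"
DEVELOPMENT_LANDO_URL = "https://dev.lando.example.invalid"


def select_lando_url(environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the server to submit to, based on LANDO_TRY_USE_DEV.

    Assumption: any non-empty value selects the development server.
    """
    env = os.environ if environ is None else environ
    if env.get("LANDO_TRY_USE_DEV"):
        return DEVELOPMENT_LANDO_URL
    return PRODUCTION_LANDO_URL
```

Passing the environment in as a mapping keeps the function easy to check without changing the real process environment.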

Flags: needinfo?(sheehan)