Closed Bug 1582383 Opened 6 years ago Closed 6 years ago

Lando services down - Could not Communicate with Lando API

Categories

(Conduit :: Lando, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: glob, Assigned: ckolos)

Details

Lando UI is reporting Could not Communicate with Lando API for all revisions.

Lando API is returning HTTP/502 for all requests.

GKE failed to auto-repair a failed node in the NAT node-pool required for API communication with transplant. Attempting to re-spin the node pool.

Blocked on persistent GKE failures when launching nat-pool. Escalating as soon as I can get access to support role.

final update:

A legacy NAT configuration was causing creation of the new node-pool to fail. Deleting all NAT configs (pool, routes, fw rules, etc.) and reapplying them managed to recreate the cluster. A redeploy of the already-in-prod code was then done; the API endpoint is no longer failing, and Lando UI is returning healthy on its heartbeat.

My apologies for the delay on this.

Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED

Attempts to reland fail with:

    The requested landing_path is valid, but transplant failed.Please retry your request at a later time.

From Standard8: "It gave me an ISE after submitting the landing on the first time I ran it. Then it comes up with that error"
Status: RESOLVED → REOPENED
Resolution: FIXED → ---

It's back to "Could not Communicate with Lando API" - not sure if that is due to ongoing work.

Severity: normal → blocker

<ckolos> I've been working on the issue on/off since 2 AM my time.
There's something wrong on the GCP side
I can get it to recover, but it falls over soon after.
I've engaged with support @ GCP and will update when I know more. Meantime, if you're behind SSO, you can follow https://firefoxoperations.statuspage.io
<habib> we have escalated this with Google

autoland reopened when this bug and bug 1582383 got resolved at 17:58 UTC.

Thank you for fixing it.

Status: REOPENED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED