Closed Bug 1495460 Opened 6 years ago Closed 6 years ago

Avoid describe-clusters call in telemetry-sh

Categories

(Data Platform and Tools :: General, enhancement, P1)

enhancement
Points:
2

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: klukas, Assigned: klukas)

Details

(Whiteboard: [DataPlatform])

Attachments

(1 file)

As :whd warned, calling aws emr describe-clusters in telemetry.sh is not reliable and sometimes fails. We had an airflow job fail bootstrap due to this. We could add retry logic around the request, or we could add extra parameters to the bootstrap script to provide the metadata we're pulling from describe-clusters.
Assignee: nobody → jklukas
Status: NEW → ASSIGNED
Points: --- → 2
Priority: -- → P1
Whiteboard: [DataPlatform]
Attachment #9014919 - Flags: review?(fbertsch)
PR posted. I experimented with adding retry logic and decided it wasn't too unreasonable an amount of code to implement retries. I've decided to take that route rather than sending in extra parameters, which would complicate the interface. The EMR API needs to be available for us to create clusters anyway, so it it seems reasonable that we should expect to be able to get a response in this script. It's also reasonable to expect that an HTTP call will lead to occasional transient failures and needs to be wrapped.
Attachment #9014919 - Flags: review?(fbertsch) → review+
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Component: Spark → General
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: