Closed Bug 1171627 Opened 9 years ago Closed 9 years ago

Operational alerts for pipeline health (stackdriver/datadog)

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: kparlante, Assigned: relud)

Details

(Whiteboard: [unifiedTelemetry][40b9])

Katie Parlante

Reporter

Description

•

9 years ago

Including but not limited to:
- network use
- memory use

Mark Reid [:mreid]

Updated

•

9 years ago

Priority: -- → P1

Mark Reid [:mreid]

Updated

•

9 years ago

Whiteboard: [unifiedTelemetry][b5]

Daniel Thorn [:relud]

Assignee

Updated

•

9 years ago

Assignee: whd → dthornton

Daniel Thorn [:relud]

Assignee

Comment 1

•

9 years ago

We now have alerts in place for disk usage, memory usage, and NTP drift.

Status: NEW → ASSIGNED

Daniel Thorn [:relud]

Assignee

Comment 2

•

9 years ago

The remaining alerts I'm going to configure are for instances approaching their bandwidth limits and for 5xx rates on the ELB that are too high.

Thomas Huelbert

Updated

•

9 years ago

Whiteboard: [unifiedTelemetry][b5] → [unifiedTelemetry][40b9]

Daniel Thorn [:relud]

Assignee

Comment 3

•

9 years ago

ELB alerts are in place. Bandwidth alerts are going to be more difficult, and I'm still working out how to accomplish those.

Thomas Huelbert

Updated

•

9 years ago

Iteration: --- → 42.3 - Aug 10

Thomas Huelbert

Updated

•

9 years ago

Iteration: 42.3 - Aug 10 → 43.1 - Aug 24

Wesley Dawson [:whd]

Comment 4

•

9 years ago

We haven't had any issues go undetected in the last few months, so I'll call this done. We can configure bandwidth alerts if they become relevant, probably 42.

Status: ASSIGNED → RESOLVED

Closed: 9 years ago

Resolution: --- → FIXED

BMO Automation

Updated

•

6 years ago

Product: Cloud Services → Cloud Services Graveyard

Categories

(Cloud Services Graveyard :: Metrics: Pipeline, defect, P1)
