Closed Bug 878049 Opened 12 years ago Closed 12 years ago

Create a persistent history of slave reboot attempts and outcomes for kittenherder

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: coop, Assigned: coop)

References

Details

(Whiteboard: [slaveduty][dashboard][kittenherder])

Chris Cooper [:coop] (he/him)

Assignee

Description

12 years ago

kittenherder doesn't currently maintain a history of reboot attempts for a given slave, i.e. if the same slave appears in the slaves_needing_reboot.txt list 6 hours later, kittenherder will merrily try to reboot the slave again.

This is a great opportunity for kittenherder to recognize a pattern in slave behavior and file an appropriate bug (bug 859403), but in order to do so, we need to start tracking reboot attempts in a persistent manner, and possibly also double-checking whether our reboot attempts were successful before waiting for the next cycle.

If we begin tracking state this way, it may allow us to iterate more quickly over the list of slaves needing reboot because we won't spend time on slaves that are in a known bad state. If at all possible, the reboot history should be kept in a format (and location) that is easily digestible by other reporting tools, e.g. slave_health.
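For illustration only, here is a minimal sketch of what a persistent, tool-friendly reboot history could look like. It assumes a JSON-lines file with one record per attempt. The function names, the field names (`slave`, `timestamp`, `method`, `command_ok`), and the file layout are all hypothetical and are not taken from kittenherder or slave_health.

```python
import json
import time


def record_reboot_attempt(history_path, slave, method, command_ok, now=None):
    """Append one reboot attempt to a JSON-lines history file.

    All field names here are assumptions made for this sketch.
    """
    entry = {
        "slave": slave,
        "timestamp": int(now if now is not None else time.time()),
        "method": method,          # hypothetical label for how the reboot was issued
        "command_ok": command_ok,  # True if the reboot command itself reported success
    }
    # Append mode keeps earlier attempts intact across kittenherder runs.
    with open(history_path, "a") as f:
        f.write(json.dumps(entry, sort_keys=True) + "\n")
    return entry


def load_history(history_path, slave=None):
    """Return recorded attempts, oldest first, optionally filtered by slave."""
    entries = []
    try:
        with open(history_path) as f:
            for line in f:
                line = line.strip()
                if line:
                    entries.append(json.loads(line))
    except FileNotFoundError:
        # No history yet: treat as an empty log rather than an error.
        return []
    if slave is not None:
        entries = [e for e in entries if e["slave"] == slave]
    return entries
```

One record per line keeps writes append-only and lets another reporting tool read the file without knowing anything about kittenherder internals.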

Chris Cooper [:coop] (he/him)

Assignee

Updated

12 years ago

Blocks: 878051

Chris Cooper [:coop] (he/him)

Assignee

Updated

12 years ago

Assignee: nobody → coop

Status: NEW → ASSIGNED

Priority: -- → P2

Armen [:armenzg]

Comment 1

12 years ago

I believe this helps buildduty but I will remove the tag to get it out of the buildduty query.

Whiteboard: [buildduty][slaveduty][dashboard][kittenherder] → [slaveduty][dashboard][kittenherder]

Chris Cooper [:coop] (he/him)

Assignee

Updated

12 years ago

Component: Release Engineering: Machine Management → Release Engineering: Developer Tools

QA Contact: armenzg → hwine

Chris Cooper [:coop] (he/him)

Assignee

Comment 2

12 years ago

https://github.com/mozilla/briar-patch/commit/5c701aaa0361978d9e576e91a675aacec47c871d

It doesn't track outcomes, but I'm not sure how we would properly verify that unless we looped on slave state after a reboot attempt. Reboot commands can return success without actually yielding a functional machine out the other side. We can track this based on subsequent reboot attempts though, especially if we start iterating more quickly than every 6 hours.

Status: ASSIGNED → RESOLVED

Closed: 12 years ago

Resolution: --- → FIXED
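As a rough sketch of the idea in comment 2 (inferring outcomes from subsequent reboot attempts), the following flags slaves that needed several reboots inside a time window. It is not the logic in the linked commit. The record shape (dicts with `slave` and `timestamp` keys), the function name, the 24-hour window, and the threshold of 3 are assumptions chosen for illustration.

```python
from collections import Counter


def repeat_offenders(attempts, now, window_seconds=24 * 3600, threshold=3):
    """Return slaves rebooted at least `threshold` times in the recent window.

    `attempts` is an iterable of dicts with "slave" and "timestamp" (epoch
    seconds) keys; this shape is assumed for the sketch. A slave that keeps
    reappearing is treated as evidence that earlier reboots did not yield a
    working machine.
    """
    cutoff = now - window_seconds
    counts = Counter(a["slave"] for a in attempts if a["timestamp"] >= cutoff)
    return sorted(slave for slave, n in counts.items() if n >= threshold)
```

A slave returned by this check could be skipped on the next pass or handed off for a bug to be filed, rather than rebooted again.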

Nobody; OK to take it and work on it

Updated

12 years ago

Product: mozilla.org → Release Engineering

Nobody; OK to take it and work on it

Updated

8 years ago

Component: Tools → General

Categories

(Release Engineering :: General, defect, P2)
