Bug 806337
Opened 12 years ago
Closed 12 years ago
BMM: Lock relay boards and add retries
Categories
(Infrastructure & Operations :: RelOps: General, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: dustin, Assigned: dustin)
Details
I need to test whether the relay boards support concurrent access. If they do not, access to them needs to be locked. In either case, relay operations need to be retried and the resulting status verified, to ensure that a "reboot" actually reboots the board.
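The retry-and-verify idea can be sketched roughly as follows. This is only an illustration, not the BMM code: `reboot_with_retries`, `set_power`, `get_status`, and `FlakyRelay` are hypothetical names, and the real relay protocol is not shown in this bug.

```python
import time


def reboot_with_retries(set_power, get_status, retries=3, off_delay=0.0):
    """Power-cycle a board, checking the relay status after each step.

    set_power(on) and get_status() are hypothetical callables standing in
    for whatever actually talks to the relay board. Returns True once a
    full off/on cycle has been verified, False if every attempt failed.
    """
    for _ in range(retries):
        try:
            set_power(False)
            if get_status() is not False:
                continue  # power-off did not take effect; retry the cycle
            time.sleep(off_delay)  # leave the board off briefly
            set_power(True)
            if get_status() is True:
                return True
        except OSError:
            pass  # network trouble talking to the relay; try again
    return False


class FlakyRelay:
    """Stand-in relay (hypothetical) that silently drops its first commands."""

    def __init__(self, drops):
        self.drops = drops
        self.on = True
        self.calls = 0

    def set_power(self, on):
        self.calls += 1
        if self.drops > 0:
            self.drops -= 1
            return  # command lost; state unchanged
        self.on = on

    def get_status(self):
        return self.on
```

With `FlakyRelay(drops=1)` the first power-off is lost, the status check catches it, and the second attempt completes the cycle. Without the status check, the caller would have reported a reboot that never happened.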
Updated•12 years ago
Assignee: server-ops-releng → dustin
Comment 1•12 years ago
As noted at https://bugzilla.mozilla.org/show_bug.cgi?id=806152#c9, we should assume the relay boards do not handle multiple connections, so locks and retries should be implemented in BMM as suggested. BMM/lifeguard/mozpool should also be the only way to request a panda board reboot, whether by a human or by code, so that state is tracked properly.
Comment 2•12 years ago
I'm working on this right this very instant. The relay code already checks the status after power-off and again after power-on, so with a bit of locking this should be good to go. I'm also going to add some short socket timeouts so we're not stuck waiting to talk to a relay that's not there.
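For illustration, per-relay locking combined with a short socket timeout might look something like the sketch below. It assumes one lock per relay host within a single process; `lock_for`, `send_command`, the payload bytes, and the 2-second default are all hypothetical and not taken from the BMM source.

```python
import socket
import threading

# One lock per relay host, created on demand. The guard lock protects
# the dictionary itself so two threads cannot create two locks for the
# same host.
_locks_guard = threading.Lock()
_host_locks = {}


def lock_for(host):
    """Return the lock that serialises access to the given relay host."""
    with _locks_guard:
        return _host_locks.setdefault(host, threading.Lock())


def send_command(host, port, payload, timeout=2.0):
    """Send a (hypothetical) command and read a short reply.

    Only one thread at a time talks to a given relay host, and both the
    connect and the send/receive give up after `timeout` seconds instead
    of hanging on a relay that is not there.
    """
    with lock_for(host):
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            sock.sendall(payload)
            return sock.recv(1024)
```

A lock like this only covers threads in one process. If several processes could reach the same relay board, some cross-process mechanism would be needed instead.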
Comment 3•12 years ago
We might also want to add a TCP connect timeout on the server side.
Comment 4•12 years ago
Timeouts are done: http://hg.mozilla.org/build/bmm/file/b032b833c3cf/mozpool/bmm/relay.py
I used asyncore so that all socket operations are async and we can time out while waiting for them. Connection delays, socket delays, and so on are all handled appropriately, and we can pretty much guarantee that, as long as the local CPU isn't tied up, the relay functions will return within their timeout.
Locking is in there too, but it is much simpler: http://hg.mozilla.org/build/bmm/file/b032b833c3cf/mozpool/bmm/relay.py#l65
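The guarantee that a relay function returns within its timeout can be illustrated with a single deadline shared by every socket operation. Note that this sketch deliberately swaps techniques: it uses blocking sockets with per-call timeouts rather than asyncore, which was removed from the standard library in Python 3.12. `Deadline`, `exchange`, and their parameters are hypothetical names, not the linked relay.py.

```python
import socket
import time


class Deadline:
    """Tracks one overall time budget across several operations."""

    def __init__(self, seconds, clock=time.monotonic):
        self._clock = clock
        self._expires = clock() + seconds

    def remaining(self):
        """Return the seconds left, or raise TimeoutError if none are."""
        left = self._expires - self._clock()
        if left <= 0:
            raise TimeoutError("relay operation exceeded its deadline")
        return left


def exchange(host, port, payload, reply_len, seconds=5.0):
    """Connect, send, and read reply_len bytes within one shared budget."""
    deadline = Deadline(seconds)
    address = (host, port)
    with socket.create_connection(address, timeout=deadline.remaining()) as sock:
        sock.settimeout(deadline.remaining())
        sock.sendall(payload)
        reply = b""
        while len(reply) < reply_len:
            # Shrink the timeout before every read so that slow partial
            # replies cannot add up to more than the overall budget.
            sock.settimeout(deadline.remaining())
            data = sock.recv(reply_len - len(reply))
            if not data:
                raise ConnectionError("relay closed the connection early")
            reply += data
        return reply
```

Recomputing the remaining time before each call is what keeps the total bounded. A fixed per-call timeout would let a connect, a send, and several slow reads each use the full allowance.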
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Updated•11 years ago
Component: Server Operations: RelEng → RelOps
Product: mozilla.org → Infrastructure & Operations