Closed Bug 581187 (Opened 14 years ago, Closed 14 years ago)

Need a way to mark a back-end server as down

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: zandr, Assigned: telliott)

References

Details

(Whiteboard: [qa-])

Zandr Milewski [:zandr]

Reporter

Description

14 years ago

Back-end databases often need to go down for maintenance or repair.

We need a way to tell the system that this has happened. As it is, we have a couple of bad behaviors associated with simply taking down a back-end DB host.

We may still assign nodes to the down server. Mitigating this in the current system requires removing the nodes from node_config.json (which is JSON, so the entries can't simply be commented out) and setting `ct` to 0 in the available_nodes table on the admin host. A hackish alternative is to crank up the actives count on the node, which keeps new assignments from happening until approximately 1am but could pollute metrics. If a new user gets a 503, we think they see an 'unknown' error; this is untested.

If the host is all the way down, rather than merely refusing MySQL connections, the webheads run out of Apache processes, because the MySQL connection timeout is long (60s). We can shorten this, but even at 5s I could see us running out of Apache processes at high load.
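A rough way to see why even 5s may be too long: by Little's law, the number of Apache processes stuck waiting is roughly the rate of requests routed to the dead host multiplied by the connection timeout. The function name and the 20 requests/s figure below are illustrative assumptions, not measurements from the Sync webheads.

```python
def blocked_workers(requests_per_sec: float, connect_timeout_sec: float) -> float:
    """Steady-state count of workers stuck waiting on a dead host (Little's law)."""
    return requests_per_sec * connect_timeout_sec


# Hypothetical load: 20 requests/s per webhead routed to the dead DB host.
for timeout in (60, 5):
    stuck = blocked_workers(20, timeout)
    print(f"timeout={timeout}s -> ~{stuck:.0f} blocked processes")
```

Under that assumed load, the 60s timeout ties up about 1200 processes and the 5s timeout still ties up about 100, which may already exceed a webhead's process limit.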

Mitigating this requires repointing the shard_constants entries for the host at something that will refuse the db connection quickly. I've been using 127.0.0.1 for this.
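The reason the 127.0.0.1 trick works can be shown with a small probe, assuming nothing is listening on the target port locally: a closed port on the loopback interface answers with an immediate refusal, so the caller never waits for the timeout. The helper names `probe` and `unused_local_port` are hypothetical and exist only for this sketch.

```python
import socket
import time


def probe(host: str, port: int, timeout: float = 60.0) -> tuple[str, float]:
    """Attempt a TCP connect and return (outcome, seconds elapsed)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            outcome = "connected"
    except ConnectionRefusedError:
        # The peer actively rejected the connection: this is the fast path.
        outcome = "refused"
    except OSError:
        # Covers timeouts and unreachable hosts: this is the slow path.
        outcome = "timeout-or-unreachable"
    return outcome, time.monotonic() - start


def unused_local_port() -> int:
    """Ask the OS for a free port, then release it so nothing is listening."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


outcome, elapsed = probe("127.0.0.1", unused_local_port(), timeout=5.0)
print(outcome, f"{elapsed:.3f}s")
```

A host that is all the way down would instead take the slow path and hold the caller for the full timeout.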

So having a single place to flag a server as 'down' that will avoid these ugly behaviors would be extremely valuable to ops.
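As a sketch of what such a single flag could look like (not a description of the eventual fix), one registry can be consulted both when assigning nodes and before opening a DB connection. Everything here, including `NodeRegistry`, `NodeDown`, and the in-memory storage, is a hypothetical stand-in for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


class NodeDown(Exception):
    """Raised immediately instead of waiting on a connection timeout."""


@dataclass
class NodeRegistry:
    # Hypothetical in-memory stand-in for wherever the flag would be stored.
    capacity: dict[str, int]  # node name -> remaining assignment slots
    downed: set[str] = field(default_factory=set)

    def mark_down(self, node: str) -> None:
        self.downed.add(node)

    def mark_up(self, node: str) -> None:
        self.downed.discard(node)

    def assign(self) -> Optional[str]:
        """Pick the node with the most free slots, skipping downed nodes."""
        candidates = {
            name: slots
            for name, slots in self.capacity.items()
            if slots > 0 and name not in self.downed
        }
        if not candidates:
            return None
        node = max(candidates, key=candidates.get)
        self.capacity[node] -= 1
        return node

    def check(self, node: str) -> None:
        """Call before opening a DB connection; fail fast for downed nodes."""
        if node in self.downed:
            raise NodeDown(node)


registry = NodeRegistry(capacity={"db1": 10, "db2": 5})
registry.mark_down("db1")
print(registry.assign())  # db2, because db1 is flagged as down
```

Because both code paths read the same flag, marking a node down stops new assignments and lets existing users fail fast without editing node_config.json or shard_constants by hand.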

Mike Connor [:mconnor]

Updated

14 years ago

Blocks: 592376

Toby Elliott [:telliott]

Assignee

Comment 1

14 years ago

http://hg.mozilla.org/services/reg-server-secure/rev/8c06e8ab82cb

allows us to mark nodes as downed.

Status: NEW → RESOLVED

Closed: 14 years ago

Resolution: --- → FIXED

Zandr Milewski [:zandr]

Reporter

Updated

14 years ago

Blocks: 598959

Tracy Walker [:tracy]

Updated

14 years ago

Whiteboard: [qa-]

BMO Automation

Updated

1 year ago

Product: Cloud Services → Cloud Services Graveyard

Categories

(Cloud Services Graveyard :: Server: Sync, defect)