Closed Bug 785282 Opened 13 years ago Closed 11 years ago

clean shutdown/restart strategy for sync2.0

Tracking

(Not tracked)

Status:

VERIFIED FIXED

People

(Reporter: rfkelly, Assigned: rfkelly)

Details

(Whiteboard: [qa+])

Ryan Kelly [:rfkelly]

Assignee

Description

•

13 years ago

Bug 623604 described some facilities for cleanly shutting down or restarting the sync1.1 server. I need to investigate them and make sure we have similar facilities in sync2.0.

Ryan Kelly [:rfkelly]

Assignee

Updated

•

13 years ago

Summary: cleanup shutdown/restart strategy for sync2.0 → clean shutdown/restart strategy for sync2.0

James Bonacci [:jbonacci]

Updated

•

13 years ago

Whiteboard: [qa+]

Ryan Kelly [:rfkelly]

Assignee

Comment 1

•

13 years ago

CC'ing Bob onto this bug as it's now something we need to think about sooner rather than later. If we're running the latest version of gunicorn, then a simple SIGHUP should be enough to cleanly reset its state after an update: it will reload the configuration file, spawn a new set of workers, then kill off the old workers. We should rarely need to kill the gunicorn master process.

Bob, does the special SIGTERM and __heartbeat__ handling from Bug 623604 seem necessary? I feel like this is better handled by just yanking the machine from the loadbalancer explicitly, rather than having it try to signal its own way out of the loadbalancer at shutdown.
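For illustration, the SIGHUP reload described above could be driven from a deploy step roughly like this. It is only a sketch: it assumes the gunicorn master was started with a pidfile (gunicorn's `--pid` option), and the helper names `read_master_pid` and `reload_gunicorn` are made up for this example rather than taken from any existing script.

```python
import os
import signal
from pathlib import Path


def read_master_pid(pidfile):
    """Return the PID stored in the gunicorn master's pidfile."""
    return int(Path(pidfile).read_text().strip())


def reload_gunicorn(pidfile, kill=os.kill):
    """Send SIGHUP to the gunicorn master.

    On SIGHUP the master reloads its configuration, starts new workers,
    and gracefully shuts down the old ones. The master itself keeps running.
    `kill` is injectable so the function can be exercised without a real process.
    """
    pid = read_master_pid(pidfile)
    kill(pid, signal.SIGHUP)
    return pid
```

The pidfile location is deployment-specific and would need to be passed in by whatever script calls this.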

Bob Micheletto [:bobm]

Comment 2

•

13 years ago

Will the more elaborate parenting in the newer version of gunicorn allow the child worker processes to finish handling their traffic before killing them? If so, we might experiment with the idea of not removing the machine from the load balancer during the upgrade. We would want to work out the testing procedure in this case; perhaps start a new worker on a different port first, etc.

In either case, it seems switching from a SIGTERM to a SIGHUP should suffice for the server-storage application restart. Explicitly pulling a webhead from the load balancer will guarantee the level of control and verifiability one would expect from a manual deploy. However, from an operational perspective, fully automated hands-free deployments are the ideal. In that case, the ability for a server to remove itself from a load balancer may be necessary.

The ability to push code manually is mandatory, and can be done without the __heartbeat__ check. If we want to push for the operational ideal of fully automated, hands-free deployments (a nice-to-have), we may want to consider using the check.
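To make the "server removes itself from the load balancer" idea concrete, here is one possible shape for it as WSGI middleware. This is a sketch under assumptions, not the handling from Bug 623604 (which is not described in this bug): it assumes the load balancer polls `/__heartbeat__` and takes a host out of rotation when the check stops returning 200. The class name `HeartbeatApp` and the method `start_draining` are hypothetical.

```python
import threading


class HeartbeatApp:
    """WSGI middleware that fails the heartbeat check once draining starts."""

    def __init__(self, app, path="/__heartbeat__"):
        self.app = app
        self.path = path
        self._draining = threading.Event()

    def start_draining(self, *_args):
        # Accepts and ignores extra arguments so it could also be used
        # as a signal handler (which receives signum and frame).
        self._draining.set()

    def __call__(self, environ, start_response):
        if environ.get("PATH_INFO") == self.path:
            if self._draining.is_set():
                status, body = "503 Service Unavailable", b"draining"
            else:
                status, body = "200 OK", b"ok"
            headers = [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ]
            start_response(status, headers)
            return [body]
        # Every other request is passed through untouched, so in-flight
        # and late-arriving traffic is still served while draining.
        return self.app(environ, start_response)
```

How `start_draining` gets triggered (a signal, an admin request, a flag file) is left open here, and depends on how the process manager handles signals.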

Ryan Kelly [:rfkelly]

Assignee

Comment 3

•

13 years ago

(In reply to Bob Micheletto [:bobm] from comment #2)
> Will the more elaborate parenting, of the newer version of gunicorn, allow
> the children worker processes to finish handling their traffic before
> killing them?

Yes, I believe this is the case - they are blocked from taking new requests, but allowed to finish processing any that are already in-progress. I would have to double-check the source to make sure.

> In either case, it seems switching from a SIGTERM to a SIGHUP should suffice
> for the server-storage application restart. Explicitly pulling a webhead
> from the load balancer will guarantee the level of control and verifiability
> one would expect from a manual deploy. However, from an operational
> perspective, fully automated hands-free deployments are the ideal. In that
> case, the ability for a server to remove itself from a load balancer may be
> necessary.

I'm not suggesting to do it by hand, but rather to have the deployment script talk directly to the load balancer to enable/disable each webhead. I believe the current syncpush script has such a mechanism already..?
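As a rough sketch of a deployment script that talks to the load balancer directly: take each webhead out of rotation, update it, then put it back. Everything named here is hypothetical (the `LoadBalancer` interface, `rolling_deploy`, and the `update_host` callback); this is not how syncpush works, only an illustration of the enable/disable pattern.

```python
from typing import Callable, Iterable, List, Protocol


class LoadBalancer(Protocol):
    """Hypothetical interface; a real balancer API will look different."""

    def disable(self, host: str) -> None: ...

    def enable(self, host: str) -> None: ...


def rolling_deploy(
    hosts: Iterable[str],
    lb: LoadBalancer,
    update_host: Callable[[str], None],
) -> List[str]:
    """Update webheads one at a time, each while out of rotation.

    If update_host raises, the exception propagates, the failing host is
    left disabled, and the remaining hosts are not touched.
    """
    updated = []
    for host in hosts:
        lb.disable(host)
        update_host(host)  # e.g. install new code, then SIGHUP gunicorn
        lb.enable(host)
        updated.append(host)
    return updated
```

Leaving a failed host disabled is a deliberate choice in this sketch: a half-updated webhead stays out of rotation until someone looks at it.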

Ryan Kelly [:rfkelly]

Assignee

Comment 4

•

11 years ago

Bob, do you want to keep this bug open or are we different enough in the new setup that it's of no further use?

Flags: needinfo?(bobm)

James Bonacci [:jbonacci]

Comment 5

•

11 years ago

If so, let's tie this into bug 1006792

Bob Micheletto [:bobm]

Comment 6

•

11 years ago

(In reply to Ryan Kelly [:rfkelly] from comment #4)
> Bob, do you want to keep this bug open or are we different enough in the new
> setup that it's of no further use?

I think we are different enough. Reboot / termination == migration in the brave new world of Sync 1.5

Status: NEW → RESOLVED

Closed: 11 years ago

Flags: needinfo?(bobm)

Resolution: --- → FIXED

James Bonacci [:jbonacci]

Comment 7

•

11 years ago

Status: RESOLVED → VERIFIED

BMO Automation

Updated

•

3 years ago

Product: Cloud Services → Cloud Services Graveyard


Categories

(Cloud Services Graveyard :: Server: Sync, defect)
