- remove graceful-restart from:

Why?:

There are a few reasons to use graceful-restart. The main reason is to be able to use graceful-routing-engine-switchover (GRES), which allows us to switch from a primary routing-engine to a backup without dropping packets. However, none of our border routers and few of our core switches have a backup routing-engine.

graceful-restart is still useful with only 1 routing-engine: it allows us to restart routing protocols without disrupting traffic. However, we don't tend to restart routing protocols or the RPD process.

As well, we are interested in deploying bi-directional forwarding detection (BFD), which conflicts with graceful-restart, i.e. you should only have one of the two configured when using BGP.

So, we'd like to remove graceful-restart from all the devices in our network that currently have it configured:

+ agg1.s301.ops.phx1.mozilla.net
+ border1.console.pao1.mozilla.net
+ border1.console.scl3.mozilla.net
+ border1.console.sjc2.mozilla.net
+ border1.phx1.mozilla.net
+ border2.console.scl3.mozilla.net
+ border2.phx1.mozilla.net
+ core1.corp.console.scl3.mozilla.net
+ core1.corp.phx1.mozilla.net
+ core1.svc.phx1.mozilla.net
+ fw1.akl1.mozilla.net
+ fw1.corp.console.scl3.mozilla.net
+ fw1.corp.phx1.mozilla.net
+ fw1.lon1.mozilla.net
+ fw1.ops.par1.mozilla.net
+ fw1.ops.pdx1.mozilla.net
+ fw1.ops.scl1.mozilla.net
+ fw1.phx1.mozilla.net
+ fw1.releng.scl3.mozilla.net
+ fw1.scl3.mozilla.net
+ fw1.sfo1.mozilla.net
+ fw1.svc.phx1.mozilla.net
+ fw1.tor1.mozilla.net
+ switch1.r101-10.ops.scl3.mozilla.net
+ switch1.r301-10.ops.scl3.mozilla.net

There is no documentation on the impact of removing graceful-restart from a switch, router, or firewall configuration. While graceful-restart is configured on these devices, it is *not* configured as part of any protocol configuration.

Worst case: protocol adjacencies will be cleared when this configuration line is removed.
Best case: nothing will happen when this configuration line is removed.

Either way, we'll do this change one device at a time, making sure that the network is in a good working state before moving on to the next device.

Total Maintenance Time: 2 hours
Expected Impact: A series of short periods of routing churn
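For reference, the per-device change described above could look roughly like the following Junos CLI session. This is only a sketch: it assumes graceful-restart sits under [edit routing-options] (consistent with it not being part of any protocol configuration, but not confirmed per device), and the 5-minute commit-confirmed window is an arbitrary choice for illustration. The lines starting with # are annotations.

```
configure
# Confirm where the statement lives before touching it (assumed location)
show routing-options graceful-restart
delete routing-options graceful-restart
# Review the pending change
show | compare
# Roll back automatically unless confirmed within 5 minutes (assumed window)
commit confirmed 5
# Check that BGP sessions have re-established before confirming
run show bgp summary
commit
```

Only once the sessions on one device look healthy would the same steps be repeated on the next device in the list.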
Assignee: network-operations → dcurado
Approved by the CAB on July 23rd. When are we doing this, Dave?
Flags: cab-review? → cab-review+
We removed graceful restart from the remote office firewalls and some switches. As mentioned above, there is no documentation from Juniper about the impact of removing graceful restart. What we learned is that it probably restarts the Routing Protocol Daemon, aka RPD. That means all protocols restart, which in turn means all BGP sessions restart. Rather than wreaking temporary havoc on the data centers by clearing all the BGP sessions there, we opted to wait until the upcoming TCW to do that. We want to clean this stuff up, but there is no need to cause problems in order to do so.
Graceful restart has been unconfigured from all of our equipment except border1.sjc2. We'll have to take care of that sometime, but making this change causes a long and disturbing re-convergence time for the entire network. We made this change to border1.pao1, and it took a long time to reconverge. Not wanting to do that twice in one day, we left border1.sjc2 configured with graceful-restart for now.
Status: ASSIGNED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED